[SERVER-52667] write_ops_exec::performInserts with retryable writes can lose writes and return an out of order results array Created: 06/Nov/20 Updated: 29/Oct/23 Resolved: 16/Nov/20
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Sharding |
| Affects Version/s: | None |
| Fix Version/s: | 4.9.0 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Daniel Gottlieb (Inactive) | Assignee: | Daniel Gottlieb (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | sharding-wfbf-day |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: | |
| Backwards Compatibility: | Fully Compatible |
| Participants: | |
| Description |
Found via code inspection in a CR. performInserts takes as input a write_ops::Insert (a vector of documents) and returns a result that contains a vector of per-document outcomes (results). It's legitimate for some earlier writes to fail while later ones succeed. However, consider the case where an input batch contains 2 writes: {not yet executed insert, already executed insert}. The output results will be ordered as {already executed insert, not yet executed insert}. This comment demonstrates a scenario where I believe the bug in performInserts can manifest as a wrong response to a user.
Assigned to sharding for a first look as this seems specific to retryable writes. |
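To make the broken contract concrete, here is a small Python model (hypothetical; each result is represented only by the input index it describes) of the ordering property a caller expects from performInserts:

```python
def results_match_input_order(result_indexes):
    """Check the expected contract: the i-th entry of the results array
    describes the i-th document of the input batch. Each result is modeled
    only by the input index it refers to."""
    return all(idx == i for i, idx in enumerate(result_indexes))

# Correct output for a two-document batch: one result per input, in order.
print(results_match_input_order([0, 1]))  # True

# The ordering described in this ticket: an input of {not yet executed
# insert, already executed insert} yields results in the reverse order.
print(results_match_input_order([1, 0]))  # False
```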
| Comments |
| Comment by Githook User [ 16/Nov/20 ] |

Author: {'name': 'Daniel Gottlieb', 'email': 'daniel.gottlieb@mongodb.com', 'username': 'dgottlieb'}
Message:
| Comment by Daniel Gottlieb (Inactive) [ 10/Nov/20 ] |
Had a conversation with max.hirschhorn and there might be a bug today, but connecting the dots is a bit tedious and I'm not familiar enough to know whether there's something the system does to prevent this. First, to quickly summarize the contract bug with performInserts: the batching code roughly looks like this:
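The code snippet from the original comment did not survive this export. As a stand-in, here is a hedged Python model of the loop shape described here; all names (perform_inserts, was_already_executed, MAX_BATCH) are illustrative, not the server's actual symbols:

```python
MAX_BATCH = 64  # stand-in for the real batch-size limit

def perform_inserts(ops, was_already_executed):
    """Hypothetical model of the flawed batching loop. `ops` is a list of
    op indexes; returns the op indexes in the order results are recorded."""
    results = []
    batch = []
    for i, op in enumerate(ops):
        if was_already_executed(op):
            # Record the stored result for the already-executed write and
            # skip the rest of the body -- including the flush check below.
            results.append(op)
            continue
        batch.append(op)
        is_last_doc = (i == len(ops) - 1)
        if not is_last_doc and len(batch) < MAX_BATCH:
            continue
        # Flush the pending batch: insert it and record one result per doc.
        results.extend(batch)
        batch.clear()
    return results

# Out-of-order results: Op1's stored result is recorded before the pending
# Op0 is flushed (the flush is triggered by Op2, the last doc).
print(perform_inserts([0, 1, 2], lambda op: op == 1))  # [1, 0, 2]

# Lost write: when the last doc was already executed, `continue` skips the
# final flush, so pending Op0 is never inserted and has no result.
print(perform_inserts([0, 1], lambda op: op == 1))  # [1]
```

Under this model, a fix would flush the pending batch before recording an already-executed write's stored result, which keeps results in input order and guarantees the final flush happens.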
Based on the subset of documents where wasAlreadyExecuted returns true, an entry's position in the results array may not correspond to its document's index in inputBatch. In the simplest case, consider a batch of two inserts where the first needs to be processed but the second was already successfully executed and won't be retried: [Op0, Op1]. The output results will be in reverse order: [Result(Op1), Result(Op0)]. Additionally, when the last document in the input was an already executed retryable write, the pending batch is never actually inserted. The results array will contain entries only for the data that was actually inserted, yet the response will look as if everything succeeded. I believe a case that exercises this bug in sharding is the following:
| Comment by Daniel Gottlieb (Inactive) [ 09/Nov/20 ] |
Saving for posterity, but this comment adds little value. See the following comment for a simplified demonstration of the bug.
| Comment by Max Hirschhorn [ 07/Nov/20 ] |
What does it mean for a batch to contain both a retryable insert and a non-retryable insert? An insert command contains either non-retryable inserts or retryable inserts (or inserts in a multi-document transaction) depending on whether the "txnNumber" field is present in the command arguments. The txnNumber becomes a property of the OperationContext, so a batch of inserts can be executed either only as retryable or only as non-retryable. |
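As an illustration of the distinction Max describes, here are hedged sketches of the two command shapes. The field names (insert, documents, lsid, txnNumber) are real insert-command fields, but the values are placeholders, not a runnable driver example:

```python
# Hypothetical command documents illustrating the distinction above.
non_retryable_insert = {
    "insert": "coll",
    "documents": [{"_id": 1}, {"_id": 2}],
}
retryable_insert = {
    "insert": "coll",
    "documents": [{"_id": 1}, {"_id": 2}],
    "lsid": {"id": "<session UUID>"},  # logical session id (placeholder)
    "txnNumber": 7,                    # presence makes the batch retryable
}
# The whole command -- hence every document in the batch -- is one or the
# other; there is no per-document txnNumber.
print("txnNumber" in retryable_insert)      # True
print("txnNumber" in non_retryable_insert)  # False
```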