[SERVER-45943] When write operation times out return number of requested/updated Created: 04/Feb/20 Updated: 08/Jan/24 Resolved: 17/Jul/20 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Improvement | Priority: | Minor - P4 |
| Reporter: | Katya Kamenieva | Assignee: | David Storch |
| Resolution: | Won't Do | Votes: | 1 |
| Labels: | qexec-team |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: |
|
| Sprint: | Query 2020-04-06, Query 2020-05-04, Query 2020-05-18, Query 2020-06-01, Query 2020-06-15, Query 2020-06-29, Query 2020-07-27 |
| Participants: |
| Description |
|
Currently when you run update/delete with maxTimeMS and the operation did not finish within the specified timeframe, you get an error: |
| Comments |
| Comment by David Storch [ 17/Jul/20 ] |
|
After further discussion with mira.carey@mongodb.com, jeff.yemin, and behackett, we have decided to close this ticket as "Won't Do". The reasoning is that it is hard to achieve accurate information about the progress of multi:true write operations when they are interrupted in sharded clusters.
Let's take the example of a multi:true update expected to modify around 1000 documents across 10 shards. The client has assigned a maxTimeMS value of 100ms. Suppose that, for whatever reason, shard 3 takes a long time to process its part of the update. Maybe the load is not balanced correctly, and the operation gets queued. The mongod notices that its time budget has expired and sends a MaxTimeMSExpired error to the mongos. Today this error will not include any indication of how many documents were updated, but let's say we extend the logic on mongod so that it returns the number of documents modified. In this case, the operation was queued rather than being scheduled so we return to mongos that zero documents were modified on shard 3.
It is now mongos's responsibility to propagate the error to the client. Let's say that it has heard back from most shards, and therefore knows how many documents were updated on those nodes, but shards 9 and 10 have not replied yet. The mongos will cancel the outstanding operations on shard 9 and shard 10 and then send a MaxTimeMSExpired error to the client without waiting to hear back from shards 9 and 10.
In general for interruption scenarios (maxTimeMS expiration, getting killed by killOp or killCursors, interruption due to stepdown, etc.), we want to ensure a prompt error reply and therefore do not block waiting for all outstanding operations to die. Those outstanding operations are killed, but are allowed to get cleaned up asynchronously. This is especially important for maxTimeMS, since it ensures that an error is returned to the client promptly after expiration. We do not wait for a potentially lengthy cancellation process to complete, which the client could perceive as exceeding the maxTimeMS budget by an unreasonable amount. By not waiting for all the replies, however, mongos cannot definitively know how many writes have taken place.
It is increasingly important that features added to MongoDB continue to work seamlessly when sharding is enabled. Since support of this feature in sharding is problematic, we have decided not to pursue it for now. Clients who want to ensure all-or-nothing behavior for multi:true write operations should use transactions. |
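The scatter-and-cancel behavior described in this comment can be modeled with a small, self-contained simulation. This is only a sketch under stated assumptions: `shard_update` and `scatter_update` are hypothetical names, shards are stand-in coroutines rather than real mongod processes, and the delays and counts are invented for illustration. It is not how mongos is implemented.

```python
import asyncio


async def shard_update(delay_s, n_modified):
    """Hypothetical shard: replies after delay_s seconds with its modified count."""
    await asyncio.sleep(delay_s)
    return n_modified


async def scatter_update(shards, max_time_s):
    """Fan an update out to all shards and stop collecting replies at the deadline.

    shards maps a shard name to (delay_s, n_modified). Stragglers are cancelled
    but not awaited, mirroring the "kill, then clean up asynchronously" policy.
    """
    tasks = {
        name: asyncio.create_task(shard_update(delay_s, n))
        for name, (delay_s, n) in shards.items()
    }
    done, pending = await asyncio.wait(tasks.values(), timeout=max_time_s)
    for task in pending:
        task.cancel()  # do not wait for the cancellation to finish
    # Only shards that replied contribute to the count. A real cancelled shard
    # may already have applied some writes, so this is a lower bound at best.
    known = sum(task.result() for task in done)
    unknown = sorted(name for name, task in tasks.items() if task in pending)
    return {
        "timed_out": bool(pending),
        "n_modified_known": known,
        "shards_unknown": unknown,
    }


if __name__ == "__main__":
    shards = {f"shard{i}": (0.0, 100) for i in range(1, 9)}
    shards["shard9"] = (5.0, 100)
    shards["shard10"] = (5.0, 100)
    print(asyncio.run(scatter_update(shards, max_time_s=0.05)))
```

The reply names the shards whose outcome is unknown, which is exactly the information gap that makes a trustworthy total impossible without blocking past the deadline.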
| Comment by David Storch [ 02/Apr/20 ] |
|
behackett rachelle.palmer a few clarifying questions.
1) First, can you confirm that the scope of this ticket should be expanded to include insert? I imagine, as Bernie mentioned above, that this should apply to all three write commands (update, delete, and insert).
2) Second, a clarification on the nature of the request. As kateryna.kamenieva describes above, maxTimeMS expiration will currently result in a "top-level" error response from the server. Namely, it will have ok:0 along with an error message and code, but will have no further information:
If I understand the request correctly, we would instead report the max time expiration using the writeErrors format supported specifically by the write commands. This format allows multiple errors to be reported, and ensures that the n, nModified, and upserted fields are reported alongside the error info. For a simple example where the timeout expires before any writes are performed, this would look something like this:
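As a rough illustration of the two formats being contrasted, the sketch below builds both reply shapes as Python dictionaries. The helper names are hypothetical, and the field values (the use of code 50 for MaxTimeMSExpired and the error message string) are assumptions for illustration, not output captured from a server.

```python
MAX_TIME_MS_EXPIRED = 50  # assumed numeric code for MaxTimeMSExpired


def top_level_error(errmsg, code, code_name):
    """Shape of a "top-level" command failure: ok:0 plus message and code only."""
    return {"ok": 0, "errmsg": errmsg, "code": code, "codeName": code_name}


def as_write_errors_reply(error, index=0, n=0, n_modified=0):
    """Re-express the same failure in the writeErrors shape used by write commands.

    The command itself reports ok:1; the failure is attached to the statement
    at position `index`, alongside the n and nModified counters.
    """
    return {
        "n": n,
        "nModified": n_modified,
        "writeErrors": [
            {"index": index, "code": error["code"], "errmsg": error["errmsg"]}
        ],
        "ok": 1,
    }


if __name__ == "__main__":
    err = top_level_error(
        "operation exceeded time limit", MAX_TIME_MS_EXPIRED, "MaxTimeMSExpired"
    )
    print(err)
    print(as_write_errors_reply(err))
```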
To generate the output from the example above, I set a failpoint to ensure maxTimeMS expiration, and applied the following patch to change the error reporting behavior of the server:
3) In addition to changing the error response format from "top-level" to writeErrors as described above, I believe the request is also to ensure that n and nModified reflect the work done by partially completed multi=true update or delete statements. Consider this example:
Here we have a batch delete command with two delete statements, both multi deletes. If the operation runs to completion, then we expect the first statement to delete two documents and the second to delete three. Thus, the command should report success with n:5:
Let's say that the 100ms maxTimeMS setting expires while executing the second statement; the first document matched by the second statement has already been deleted but the remaining two have yet to be deleted. In this case, we expect n to report that three documents are deleted, with the appropriate error information in the writeErrors array:
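The counting in this scenario can be checked with a short simulation. This is a sketch of the requested behavior only: `run_batch_delete` is a hypothetical helper, the timeout is modeled as a fixed number of deletions rather than wall-clock time, and the error code and message are assumed values.

```python
def run_batch_delete(matches_per_statement, deletes_before_timeout=None):
    """Simulate a batch of multi deletes that may be interrupted partway through.

    matches_per_statement: how many documents each statement would delete if it
        ran to completion, e.g. [2, 3] for the two statements discussed here.
    deletes_before_timeout: stand-in for the maxTimeMS budget, expressed as the
        number of deletions that succeed before expiry; None means no timeout.
    """
    n = 0
    for index, matched in enumerate(matches_per_statement):
        for _ in range(matched):
            if deletes_before_timeout is not None and n >= deletes_before_timeout:
                # Report the partial count plus an error tied to this statement.
                return {
                    "n": n,
                    "writeErrors": [
                        {
                            "index": index,
                            "code": 50,  # assumed code for MaxTimeMSExpired
                            "errmsg": "operation exceeded time limit",
                        }
                    ],
                    "ok": 1,
                }
            n += 1
    return {"n": n, "ok": 1}


if __name__ == "__main__":
    print(run_batch_delete([2, 3]))     # runs to completion
    print(run_batch_delete([2, 3], 3))  # expires inside the second statement
```

With the budget set to three deletions, the error is attributed to the statement at index 1 and n counts the one document that statement had already removed.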
Can you confirm that I've understood this aspect of the request correctly? It looks like this part of the request is already tracked by related ticket SERVER-15292. Right now, when a multi=true write statement fails for any reason, we do not incorporate any writes it has already performed when reporting n and nModified. From the point of view of the server's implementation, this problem is orthogonal to the fact that maxTimeMS expiration does not currently report errors in the writeErrors format. Also, how to implement a solution for this particular problem is not immediately obvious to me. For these reasons, a clear answer on whether this is in scope for this ticket would be very helpful!
4) Is it acceptable if in some edge cases maxTimeMS expiration is still reported as a top-level error? In particular, this will still happen in the simple implementation if the deadline is reached very high in the code path, before any real processing of the request is done. |
| Comment by Nic Cottrell [ 25/Mar/20 ] |
|
behackett It looks like insert and insertOne accept a document + a write concern config document. So, I don't see any way to set maxTimeMS on inserts. |
| Comment by Bernie Hackett [ 04/Feb/20 ] |
|
Or inserted. I assume the insert command also supports maxTimeMS? |