Details
- Type: Bug
- Resolution: Unresolved
- Priority: Major - P3
- Fix Version/s: None
- Affects Version/s: 3.6.17, 4.2.5, 4.0.17
- Labels: None
- Component/s: Query Execution
- Operating System: ALL
Description
If an unordered (ordered: false) batch encounters a routing error (specifically StaleShardVersion), the error response returned to the router contains at least one BSON object of roughly this size for each operation in the batch that did not get executed:
{ index: 0,
  code: 63,
  codeName: "StaleShardVersion",
  errmsg: "epoch mismatch detected for foo.bar",
  errInfo: { ns: "foo.bar",
             vReceived: Timestamp(1, 0), vReceivedEpoch: ObjectId('5e8378bff739365807792086'),
             vWanted: Timestamp(2, 0), vWantedEpoch: ObjectId('5e8378bff739365807792086'),
             shardId: "Shard0001" } }
This effectively means that if, for example, a large bulk insert is sent to a shard right after a chunk migration, the entire write fails with a BSONObjectTooLarge error that is propagated to the client: at roughly 200-250 bytes per error object, a batch of 100,000 operations can produce well over 16 MB of write errors, exceeding the maximum BSON document size. This is particularly problematic for the $out stage, which uses batch sizes of 100,000 documents and is therefore susceptible to this problem.
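As an illustrative sketch only (mongo shell; the collection name foo.bar and the document shape are assumptions, not taken from an actual reproduction), an unordered bulk insert of this size could trigger the failure when a migration has just made the router's cached shard version stale:

// Hypothetical reproduction sketch (mongo shell). Assumes "foo.bar" is a
// sharded collection and that a chunk migration has just invalidated the
// router's cached routing information.
var bulk = db.getSiblingDB("foo").bar.initializeUnorderedBulkOp();
for (var i = 0; i < 100000; i++) {
    bulk.insert({ _id: i, payload: "x" });
}
try {
    // With ordered: false, every operation that was not executed can get its
    // own StaleShardVersion entry in writeErrors, so the aggregate response
    // can exceed the 16 MB BSON limit.
    bulk.execute();
} catch (e) {
    printjson(e);
}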
This issue will be worked around under SERVER-46981, so it is not urgent. This ticket is about improving the unordered write error responses so that their size is not proportional to the size of the input batch.
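Purely as a hedged illustration of that goal (not the server's actual fix; truncateWriteErrors is a hypothetical helper, and the byte budget is an assumed parameter), the write errors could be bounded by a fixed size budget rather than growing with the batch:

// Hypothetical sketch (mongo shell JavaScript), not the server's fix:
// keep write errors only up to a byte budget and report how many were
// dropped, so the response no longer scales with the batch size.
function truncateWriteErrors(writeErrors, byteBudget) {
    var kept = [];
    var used = 0;
    for (var i = 0; i < writeErrors.length; i++) {
        var size = Object.bsonsize(writeErrors[i]);
        if (used + size > byteBudget) {
            break;
        }
        kept.push(writeErrors[i]);
        used += size;
    }
    return { writeErrors: kept, truncatedCount: writeErrors.length - kept.length };
}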
Issue Links
- is related to: SERVER-46981 The MongoS write commands scheduler does not account for the potential size of the response BSON (Closed)