Type: Bug
Resolution: Unresolved
Priority: Major - P3
Affects Version/s: 3.6.17, 4.2.5, 4.0.17
Component/s: Sharding, Query Execution
Operating System: ALL
If an unordered (ordered: false) batch encounters a routing error (specifically StaleShardVersion), the error response returned to the router will contain a BSON object of at least the following size for each operation in the batch that was not executed:
{ index: 0, code: 63, codeName: "StaleShardVersion", errmsg: "epoch mismatch detected for foo.bar", errInfo: { ns: "foo.bar", vReceived: Timestamp(1, 0), vReceivedEpoch: ObjectId('5e8378bff739365807792086'), vWanted: Timestamp(2, 0), vWantedEpoch: ObjectId('5e8378bff739365807792086'), shardId: "Shard0001" } }
This effectively means that if, for example, a large bulk insert is sent to a shard after a chunk migration, the entire write will fail with a BSONObjTooLarge error, which is then propagated to the client. This is particularly problematic for the $out stage, which uses batch sizes of 100,000 and is therefore susceptible to this problem.
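As a rough illustration of why the batch sizes mentioned above are a problem, the sketch below multiplies an assumed per-operation error-entry size (estimated from the sample StaleShardVersion document above; not a measured value) by the $out batch size and compares it with the 16 MB BSON document limit:

```python
# Back-of-the-envelope sketch: each not-executed op in an unordered batch
# contributes its own writeErrors entry, so the response grows linearly
# with the input batch size.

PER_OP_ERROR_BYTES = 250            # assumed size of one writeErrors entry
BATCH_SIZE = 100_000                # batch size used by the $out stage
MAX_BSON_BYTES = 16 * 1024 * 1024   # 16 MB BSON document size limit

response_bytes = PER_OP_ERROR_BYTES * BATCH_SIZE
print(f"error entries alone: {response_bytes:,} bytes "
      f"(limit {MAX_BSON_BYTES:,}) -> over limit: "
      f"{response_bytes > MAX_BSON_BYTES}")
```

With these assumed numbers, the error array alone is roughly 25 MB, well past the limit, which is why the whole response fails with BSONObjTooLarge rather than reporting the individual per-op errors.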
This issue will be worked around under SERVER-46981, so it is not urgent. This ticket is about changing the unordered write error responses so that their size is not proportional to the size of the input batch.
is related to:
- SERVER-46981: The MongoS write commands scheduler does not account for the potential size of the response BSON (Closed)