Details
Type: Task
Status: Closed
Priority: Minor - P4
Resolution: Duplicate
Description
A user experienced poor batchInsert performance through mongos. By default, batchInsert is supposed to stop if a document in the batch would trigger an error. In the case of mongos, this means that a single batch gets split up into sequences per shard, each one a run of consecutive documents destined for the same shard. For instance, the following batch would translate into three sub-batches (A and B, then C, then D):
- Document A destined for shard1
- Document B destined for shard1
- Document C destined for shard2
- Document D destined for shard1
mongos can peek ahead in the batch and send multiple documents that appear in sequence for the same shard as one sub-batch, but it can't send D to shard1 until C has been successfully inserted on shard2, and that, of course, requires shard1 to have acknowledged A and B first. The absolute worst case is a batch requiring exactly N distinct insert operations across the shards, where N is the total size of the batch (that is, no two consecutive documents target the same shard, so the largest sequence size is 1).
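The splitting described above can be sketched in a few lines of Python. This only models the grouping and is not mongos code: `split_into_sub_batches` and the `routing` table are hypothetical names, with `routing` standing in for whatever shard-key lookup mongos performs.

```python
from itertools import groupby


def split_into_sub_batches(batch, shard_for):
    """Split an ordered batch into runs of consecutive documents
    that are destined for the same shard."""
    return [(shard, list(run)) for shard, run in groupby(batch, key=shard_for)]


# Hypothetical routing table standing in for the shard-key lookup.
routing = {"A": "shard1", "B": "shard1", "C": "shard2", "D": "shard1"}

sub_batches = split_into_sub_batches(["A", "B", "C", "D"], routing.get)
for shard, docs in sub_batches:
    print(shard, docs)

# Worst case: consecutive documents never share a shard,
# so a batch of N documents yields N sub-batches.
alternating = {"A": "shard1", "B": "shard2", "C": "shard1", "D": "shard2"}
worst_case = split_into_sub_batches(["A", "B", "C", "D"], alternating.get)
```

Each sub-batch has to be acknowledged before the next one is sent, so the number of sub-batches, not the number of shards, determines how many sequential round trips the batch costs in this model.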
Using the continueOnError option can probably mitigate this issue, since that allows us to insert successive documents without checking for errors.
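As a rough illustration of why relaxing the stop-on-error rule could help, the sketch below groups the same example batch purely by destination shard, ignoring document order. It assumes that, without per-document error checks, ordering across shards no longer has to be preserved; that is an assumption made for illustration, not a statement of what mongos actually does with continueOnError. `group_by_shard` and `example_routing` are hypothetical names.

```python
from collections import defaultdict


def group_by_shard(batch, shard_for):
    """Group documents by destination shard, ignoring their position
    in the batch (assumed to be acceptable only when the insert does
    not have to stop at the first error)."""
    groups = defaultdict(list)
    for doc in batch:
        groups[shard_for(doc)].append(doc)
    return dict(groups)


# Hypothetical routing table standing in for the shard-key lookup.
example_routing = {"A": "shard1", "B": "shard1", "C": "shard2", "D": "shard1"}

per_shard = group_by_shard(["A", "B", "C", "D"], example_routing.get)
for shard, docs in per_shard.items():
    print(shard, docs)
```

Under that assumption the example batch needs one insert per shard (two here) rather than three sequential sub-batches, and the inserts do not have to wait on each other.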
Issue Links
- duplicates DOCS-766 Sharded bulk insert in 2.0/2.2 mongos must be verified by application logic (Closed)