As part of the "Robust DDL" changes (PM-1965), the amount of work that shardCollection does under the critical section increased between 4.4 and 5.0, which blocks writes for longer than necessary:
- In 4.4: The indexes and chunks are written outside of the write part of the critical section
- In 5.0: The critical section is not released while the indexes are created, so shardCollection blocks writes for a time proportional to the size of the collection.
For more context, if a collection is being sharded and we discover that it is empty (has zero documents), we hold the critical section for the entire duration (index creation, chunk creation, etc.) to ensure that no writes come in, because we may be creating chunks on shards other than the primary.
However, if we discover that the collection has documents, we can only create chunks locally, so we can allow writes to proceed because they will always target the primary shard.
As part of this ticket we should do the absolute minimum amount of work under the critical section on the optimised path.
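
The intended split can be sketched as a small planning function. This is only an illustrative model in Python, not the server's C++ code: the names `plan_shard_collection` and `Plan`, and the step labels, are hypothetical. In particular, the `commit_metadata` step is an assumed placeholder for whatever minimal work must remain under the critical section; the ticket does not specify it.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Plan:
    """Which shardCollection steps run while writes are blocked (hypothetical model)."""
    under_critical_section: List[str] = field(default_factory=list)
    outside_critical_section: List[str] = field(default_factory=list)


def plan_shard_collection(collection_is_empty: bool) -> Plan:
    """Decide where each step runs, following the reasoning in this ticket.

    Step names are illustrative labels, not real server functions.
    """
    heavy_steps = ["create_indexes", "create_chunks"]
    # Assumption: some final, cheap step must still run with writes blocked.
    final_step = "commit_metadata"

    if collection_is_empty:
        # Chunks may be created on shards other than the primary, so no
        # writes may come in until everything is done.
        return Plan(under_critical_section=heavy_steps + [final_step])

    # Non-empty collection: chunks can only be created locally, and writes
    # keep targeting the primary shard, so the heavy work need not block them.
    return Plan(
        under_critical_section=[final_step],
        outside_critical_section=heavy_steps,
    )


if __name__ == "__main__":
    for is_empty in (True, False):
        print(is_empty, plan_shard_collection(is_empty))
```

In this model, the 5.0 regression corresponds to always taking the first branch regardless of whether the collection has documents, which is why the blocking time grows with collection size.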
- is duplicated by SERVER-60609 Sharding collection with UnoptimizedSplitPolicy must not result in long writes unavailability (Closed)