Each shard maintains a persisted copy of the collection routing table cache in config.cache.chunks.<nss>. The function that updates this collection, updateShardChunks, sequentially performs a delete operation followed by an insert operation for every new chunk.
When the number of new chunks to persist is high, this logic can become extremely slow. For instance, with a routing table of 10 million chunks, it can take up to 45 minutes to persist the entire routing table.
This is problematic because, until all the updated chunks are persisted, concurrent incremental routing table refreshes are slowed down considerably. For instance, an incremental refresh for a routing table of 10 million chunks usually takes on the order of 10 milliseconds, but if the routing table still needs to be persisted to disk, the same refresh can take up to ~5 seconds.
The goal of this ticket is to speed up the logic that persists new chunks in config.cache.chunks.<nss>. One potential solution is to perform a batch of deletions followed by a batch of insertions.
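
To illustrate the difference between the two approaches, here is a minimal Python sketch. It is not the server code: FakeChunkCache is a hypothetical in-memory stand-in for config.cache.chunks.<nss> that only counts write operations, and update_per_chunk, update_batched and batch_size are illustrative names. It also assumes that the new chunks do not overlap each other, which is what makes the batched variant produce the same final state as the per-chunk one.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class Chunk:
    min_key: int  # inclusive lower bound
    max_key: int  # exclusive upper bound
    version: int


class FakeChunkCache:
    """Hypothetical in-memory stand-in for config.cache.chunks.<nss>."""

    def __init__(self, chunks: Iterable[Chunk] = ()):
        self.docs: Dict[int, Chunk] = {c.min_key: c for c in chunks}
        self.write_ops = 0  # number of delete/insert operations issued

    def delete_overlapping(self, ranges: Iterable[Tuple[int, int]]) -> None:
        """One write op: remove every chunk overlapping any given range."""
        self.write_ops += 1
        ranges = list(ranges)
        doomed = [
            key
            for key, c in self.docs.items()
            if any(c.min_key < hi and c.max_key > lo for lo, hi in ranges)
        ]
        for key in doomed:
            del self.docs[key]

    def insert_many(self, chunks: Iterable[Chunk]) -> None:
        """One write op: insert all given chunks."""
        self.write_ops += 1
        for c in chunks:
            self.docs[c.min_key] = c


def update_per_chunk(cache: FakeChunkCache, new_chunks: List[Chunk]) -> None:
    """Behaviour described above: one delete and one insert per new chunk."""
    for c in new_chunks:
        cache.delete_overlapping([(c.min_key, c.max_key)])
        cache.insert_many([c])


def update_batched(
    cache: FakeChunkCache, new_chunks: List[Chunk], batch_size: int = 1000
) -> None:
    """Proposed idea: one batch of deletions, then one batch of insertions.

    Assumes the new chunks do not overlap each other.
    """
    for start in range(0, len(new_chunks), batch_size):
        batch = new_chunks[start:start + batch_size]
        cache.delete_overlapping((c.min_key, c.max_key) for c in batch)
        cache.insert_many(batch)


# Four persisted chunks; the update splits [0, 10) and merges [20, 40).
initial = [Chunk(0, 10, 1), Chunk(10, 20, 1), Chunk(20, 30, 1), Chunk(30, 40, 1)]
new_chunks = [Chunk(0, 5, 2), Chunk(5, 10, 2), Chunk(20, 40, 2)]

sequential = FakeChunkCache(initial)
update_per_chunk(sequential, new_chunks)

batched = FakeChunkCache(initial)
update_batched(batched, new_chunks)

print(sequential.write_ops, batched.write_ops)  # prints "6 2"
print(sequential.docs == batched.docs)          # prints "True"
```

In this toy model the per-chunk variant issues two write operations per new chunk, while the batched variant issues two per batch. The sketch only counts operations; how much wall-clock time batching would save on a real shard would need to be measured.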
Is duplicated by: SERVER-81016 Improve performance of persisted routing table update (Closed)