[SERVER-49546] setFCV to 4.4 should insert range deletion tasks in batches rather than one at a time Created: 16/Jul/20 Updated: 29/Oct/23 Resolved: 04/Aug/20 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Sharding |
| Affects Version/s: | 4.4.0-rc13 |
| Fix Version/s: | 4.4.1 |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Esha Maharishi (Inactive) | Assignee: | Luis Osta (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Backwards Compatibility: | Fully Compatible |
| Backport Requested: | v4.4 |
| Sprint: | Sharding 2020-07-27, Sharding 2020-08-10 |
| Linked BF Score: | 33 |
| Description |
|
Currently, setFCV to 4.4 on a shard iterates over each collection and, for each one, builds a vector of orphaned ranges. It then inserts range deletion tasks for the orphaned ranges one at a time. On clusters whose collections have a huge number of chunks, inserting the range deletion tasks takes a long time (e.g., ~15 minutes per shard for a collection with 100k chunks).
We should see if batching the inserts by doing something like this (where the inserts are grouped into batches here) speeds it up.
This script creates a cluster with 100k chunks:
For this script to work, we'll have to disable the limit on numInitialChunks in the server:
We also want to distribute the chunks round-robin between the shards so that each shard has many unowned ranges:
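As a rough illustration of why round-robin placement gives each shard many unowned ranges, here is a small Python sketch. It is not the script from this ticket: the function names are hypothetical, chunks are modeled as plain indexes, and it assumes adjacent unowned chunks are merged into a single range.

```python
def assign_round_robin(num_chunks, shards):
    """Return the owning shard for each chunk index, cycling through the shards."""
    return [shards[i % len(shards)] for i in range(num_chunks)]


def unowned_ranges(owners, shard):
    """Merge consecutive chunks not owned by `shard` into (start, end) index
    ranges, with `end` exclusive. Assumption: adjacent unowned chunks form one range."""
    ranges = []
    start = None
    for i, owner in enumerate(owners):
        if owner != shard:
            if start is None:
                start = i  # a new unowned run begins here
        elif start is not None:
            ranges.append((start, i))  # an owned chunk closes the run
            start = None
    if start is not None:
        ranges.append((start, len(owners)))  # run extends to the last chunk
    return ranges


owners = assign_round_robin(12, ["shard0", "shard1", "shard2"])
print(unowned_ranges(owners, "shard0"))
# prints [(1, 3), (4, 6), (7, 9), (10, 12)]
```

Under these assumptions, every owned chunk is separated from the next by an unowned gap, so the number of unowned ranges on a shard grows with the number of chunks it owns. With contiguous placement, shard0 would instead see a single unowned range.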
Here's an example of timing how long a section of code takes:
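To make the batching idea and the timing approach concrete, here is a minimal Python sketch. It is not the server's C++ code and not the snippets originally attached to this ticket: `FakeStore`, `timed`, `batches`, the task document shape, and the batch size of 1000 are all hypothetical. It only counts write calls against an in-memory stand-in, so the recorded timings say nothing about real shard performance.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, results):
    """Record the wall-clock duration of the enclosed block under `label`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        results[label] = time.perf_counter() - start


def batches(items, batch_size):
    """Yield consecutive slices of `items` with at most `batch_size` elements."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for i in range(0, len(items), batch_size):
        yield items[i:i + batch_size]


class FakeStore:
    """Hypothetical stand-in for the range deletion task collection."""

    def __init__(self):
        self.docs = []
        self.write_calls = 0

    def insert_many(self, docs):
        self.write_calls += 1  # each call models one write to the store
        self.docs.extend(docs)


def insert_one_at_a_time(store, tasks):
    for task in tasks:
        store.insert_many([task])


def insert_in_batches(store, tasks, batch_size=1000):
    for batch in batches(tasks, batch_size):
        store.insert_many(batch)


# One hypothetical task document per orphaned range.
tasks = [{"range": (i, i + 1)} for i in range(100_000)]
timings = {}
single, batched = FakeStore(), FakeStore()

with timed("one_at_a_time", timings):
    insert_one_at_a_time(single, tasks)
with timed("batched", timings):
    insert_in_batches(batched, tasks)

print(single.write_calls, batched.write_calls)  # prints 100000 100
```

Both paths store the same documents; the difference is the number of write calls (100,000 versus 100 here), which is where a per-write overhead on a real shard would be expected to add up.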
|
| Comments |
| Comment by Esha Maharishi (Inactive) [ 30/Sep/20 ] |
|
dmitry.agranat, this particular issue (and You are right that some of the upgrade steps in earlier versions also updated many collections, so they may have benefited from a similar optimization. However, I don't think any earlier versions updated many chunks, and that was what caused the real slowdown in this case. |
| Comment by Githook User [ 04/Aug/20 ] |
|
Author: {'name': 'Luis Osta', 'email': 'luis.osta@mongodb.com', 'username': 'LuisOsta'} Message: |
| Comment by Githook User [ 04/Aug/20 ] |
|
Author: {'name': 'Luis Osta', 'email': 'luis.osta@mongodb.com', 'username': 'LuisOsta'} Message: |