[SERVER-42273] Introduce a "force" option to `moveChunk` to allow migrating jumbo chunks Created: 18/Jul/19 Updated: 29/Oct/23 Resolved: 05/Nov/19 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | 4.3.1 |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Ratika Gandhi | Assignee: | Janna Golden |
| Resolution: | Fixed | Votes: | 1 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Backwards Compatibility: | Fully Compatible |
| Sprint: | Sharding 2019-09-23, Sharding 2019-10-07, Sharding 2019-10-21, Sharding 2019-11-04, Sharding 2019-11-18 |
| Case: | (copied to CRM) |
| Description |
|
Currently, if a chunk is larger than the maximum chunk size (64MB by default, 1GB at most), the balancer will mark it as jumbo and will refuse to move it. It is possible to manually issue a moveChunk command and pass the unsupported and undocumented maxChunkSizeBytes parameter, which overrides the check for max chunk size. Even so, given sufficient write load to the chunk being migrated, the memory usage on the donor shard could exceed 500MB, in which case the migration will still fail. This ticket proposes adding a new forceJumbo option to the moveChunk command to allow large chunks to be migrated, at the possible expense of blocking writes to the owning collection on the shard in question. The option will have the following deviation from the way it currently operates:
|
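As a rough illustration of the proposal in the description, a moveChunk command document carrying the new option might be assembled as below. This is only a sketch: the namespace, shard key value, and shard name are hypothetical, and treating forceJumbo as a boolean field is an assumption, since the ticket text does not spell out the final shape of the option.

```python
from typing import Any, Dict


def build_move_chunk_command(
    namespace: str,
    find: Dict[str, Any],
    to_shard: str,
    force_jumbo: bool = False,
) -> Dict[str, Any]:
    """Build a moveChunk command document as a plain dict.

    The command name is inserted first because the server reads the first
    key as the command name; Python dicts preserve insertion order.
    """
    command: Dict[str, Any] = {
        "moveChunk": namespace,  # "<database>.<collection>"
        "find": find,            # shard key value that selects the chunk
        "to": to_shard,          # destination shard name
    }
    if force_jumbo:
        # Assumption: forceJumbo is a boolean flag on the command.
        command["forceJumbo"] = True
    return command


# Hypothetical namespace, shard key value, and shard name.
cmd = build_move_chunk_command(
    "test.orders", {"customerId": 42}, "shard0001", force_jumbo=True
)
print(cmd)
```

With a driver such as PyMongo, a document like this would typically be sent to the admin database through a mongos. Because the option can block writes to the collection on the donor shard, it is only added when explicitly requested.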
| Comments |
| Comment by Janna Golden [ 05/Nov/19 ] | |
|
The following behavior changes were made as a part of this ticket:
Changes to moveChunk command:
Changes to balancer configuration settings:
If 'attemptToBalanceJumboChunks' is set to true, the balancer will schedule migrations that attempt to move large chunks as long as the chunk is not marked 'jumbo' in config.chunks. A chunk is marked 'jumbo' only after an attempt to split or move a large chunk has failed because of its size or the size of the transfer mods queue. To avoid the risk of scheduling the same migration forever, the balancer should not keep trying to schedule the migration of a chunk that has previously failed for either of these reasons. A user can run 'clearJumboFlag' so that the balancer will schedule this migration in the future, or they can use the moveChunk command to move the chunk manually. Unlike the new behavior of the moveChunk command above, the donor shard will not enter the critical section early, and if the transfer mods queue (the queue of writes that modify any documents being migrated) surpasses 500MB of memory, the migration will fail. This avoids unintended "down time" in case a user was unaware that moving a large chunk can block ops on this collection for a long period of time.
Changes to shard removal: | |
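The balancer-related steps in the comment above (enabling 'attemptToBalanceJumboChunks' and clearing a chunk's jumbo flag) could be sketched as the documents below. The names 'attemptToBalanceJumboChunks' and 'clearJumboFlag' come from the comment; storing the setting in the balancer document of config.settings, the `find` selector, and the sample namespace and shard key value are assumptions made for illustration.

```python
from typing import Any, Dict, Tuple


def build_balancer_jumbo_setting(
    enabled: bool,
) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Return (filter, update) documents for the balancer settings document.

    Assumption: the setting lives in config.settings under _id "balancer".
    """
    filter_doc = {"_id": "balancer"}
    update_doc = {"$set": {"attemptToBalanceJumboChunks": enabled}}
    return filter_doc, update_doc


def build_clear_jumbo_flag_command(
    namespace: str, find: Dict[str, Any]
) -> Dict[str, Any]:
    """Build a clearJumboFlag command for the chunk containing `find`.

    Assumption: the chunk is selected by a shard key value via `find`.
    """
    return {"clearJumboFlag": namespace, "find": find}


# Hypothetical namespace and shard key value.
setting_filter, setting_update = build_balancer_jumbo_setting(True)
clear_cmd = build_clear_jumbo_flag_command("test.orders", {"customerId": 42})
print(setting_filter, setting_update)
print(clear_cmd)
```

With a driver, the first pair would typically be applied as an upsert against config.settings and the second sent as an admin command through a mongos. Clearing the flag only makes the chunk eligible for scheduling again; per the comment, the balancer-driven migration still fails if the transfer mods queue exceeds 500MB.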
| Comment by Githook User [ 05/Nov/19 ] | |
|
Author: Janna Golden (jannaerin) <janna.golden@mongodb.com>
Message: | |
| Comment by Alyson Cabral (Inactive) [ 22/Jul/19 ] | |
|
Yes, I agree with everything you said. But for my clarity, this is less about how big the chunk is and more about the write throughput on the chunk, correct? | |
| Comment by Kaloian Manassiev [ 22/Jul/19 ] | |
|
alyson.cabral, correct. To be more specific, here are the trade-offs:
To make sure I understand what you are suggesting - moveChunk as part of shard removal should ignore the "jumbo" flag and not skip jumbo chunks, but if as part of migration it is discovered that the in-memory usage of the change log to the chunks has exceeded 500MB, still fail the migration, which would require manual intervention? This effectively requires a third state of that option, which is something like "forceJumbo But If Chunk Is Not Too Big". | |
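The three states discussed in this thread can be summarized in a small model. It is a sketch only: the mode names and fields are hypothetical, the 500MB limit is taken as 500 * 1024 * 1024 bytes, and the assumption that a manual forceJumbo migration no longer fails on queue size (because writes are blocked once the critical section is entered early) is an inference from the discussion, not something the ticket states.

```python
from dataclasses import dataclass
from enum import Enum

# Assumption: "500MB" is interpreted as 500 * 1024 * 1024 bytes.
TRANSFER_MODS_LIMIT_BYTES = 500 * 1024 * 1024


class ForceJumboMode(Enum):
    """Hypothetical names for the three states discussed in the thread."""
    NONE = "none"                          # behavior before this ticket
    MANUAL_FORCE = "manualForce"           # moveChunk with forceJumbo
    BALANCER_ATTEMPT = "balancerAttempt"   # force, but only if not too big


@dataclass(frozen=True)
class MigrationPolicy:
    enters_critical_section_early: bool
    fails_when_queue_exceeds_limit: bool


POLICIES = {
    ForceJumboMode.NONE: MigrationPolicy(False, True),
    # Assumption: blocking writes early means queue size no longer decides.
    ForceJumboMode.MANUAL_FORCE: MigrationPolicy(True, False),
    ForceJumboMode.BALANCER_ATTEMPT: MigrationPolicy(False, True),
}


def migration_fails(mode: ForceJumboMode, queued_bytes: int) -> bool:
    """Return True if the transfer mods queue size fails the migration."""
    policy = POLICIES[mode]
    over_limit = queued_bytes > TRANSFER_MODS_LIMIT_BYTES
    return policy.fails_when_queue_exceeds_limit and over_limit


over = TRANSFER_MODS_LIMIT_BYTES + 1
print(migration_fails(ForceJumboMode.BALANCER_ATTEMPT, over))
print(migration_fails(ForceJumboMode.MANUAL_FORCE, over))
```

Keeping the policy in a table makes the trade-off explicit: the balancer-driven mode gives up guaranteed progress to avoid stalling writes, while the manual mode accepts the stall.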
| Comment by Alyson Cabral (Inactive) [ 22/Jul/19 ] | |
|
kaloian.manassiev, this is most impactful when you enter the critical section early because you're queueing too many writes to that chunk, right? That stops all writes to the collection. I'd like us to attempt to automatically move the chunk during shard removal and only require the manual moveChunk if you need to enter the critical section early. | |
| Comment by Kaloian Manassiev [ 22/Jul/19 ] | |
|
josef.ahmad/alyson.cabral/cailin.nelson, for this proposal to be used, the moveChunk command still has to be issued manually with the forceJumbo parameter, which means that shard removal scenarios will still not work with the balancer alone (because it will not send that option by default). In order to make remove shard work fully in the presence of jumbo chunks, we can do two things:
I don't particularly like options (2) and (3), because they give opportunity for customers to unknowingly expose themselves to long stalls. Do you think implementing option (1) makes sense with possibly some checkbox to warn/opt-in users to this behaviour with the warning that it may cause stalls? |