[SERVER-76550] Balancer is unable to drain shards with big chunks Created: 26/Apr/23  Updated: 29/Oct/23  Resolved: 27/Apr/23

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: 6.1.1, 6.0.4, 7.0.0-rc0, 6.0.5, 6.2.1, 6.3.0-rc3
Fix Version/s: 6.0.6, 7.0.0-rc1, 6.3.2

Type: Bug Priority: Critical - P2
Reporter: Pierlauro Sciarelli Assignee: Pierlauro Sciarelli
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Depends
Problem/Incident
is caused by SERVER-71787 Balancer needs to attach forceJumbo t... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v7.0
Sprint: Sharding EMEA 2023-05-01
Participants:
Case:

 Description   

Issue
If at least one huge chunk sits on a shard being drained, the balancer may loop indefinitely through the following sequence:

  • The migration proceeds for 6 hours before being aborted
  • The same migration is rescheduled

Technical description
When draining a shard, the balancer schedules migrations with the forceJumbo flag set to true (meaning they can proceed regardless of the number of documents to clone) and passes the whole chunk entry as an argument (meaning that the whole chunk must be migrated in one shot).
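
To make the two properties of a draining migration concrete, here is a minimal sketch of what such a request carries. The field names follow the user-facing moveRange command (moveRange, toShard, min, max, forceJumbo). The helper name, namespace, shard id and bounds are hypothetical, and the request the balancer sends internally may be shaped differently.

```python
from typing import Any, Dict


def draining_move_request(ns: str, to_shard: str, chunk: Dict[str, Any]) -> Dict[str, Any]:
    """Build a request that moves a whole chunk in one shot (hypothetical helper)."""
    return {
        "moveRange": ns,
        "toShard": to_shard,
        # Both bounds come from the chunk entry, so the range cannot be shortened.
        "min": chunk["min"],
        "max": chunk["max"],
        # Proceed regardless of how many documents must be cloned.
        "forceJumbo": True,
    }


# Hypothetical chunk entry and names, for illustration only.
chunk = {"min": {"x": 0}, "max": {"x": 1_000_000}}
request = draining_move_request("test.coll", "shard1", chunk)
print(request)
```

With both bounds fixed and forceJumbo set, nothing limits the amount of data a single migration has to clone.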

This differs from the usual balancing behavior which, after the removal of the auto-splitter in 6.0.3, consists of issuing moveRange commands that specify only the min bound, so that the shard autonomously decides at which key to chop a chunk according to the configured chunk size.
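
The effect of specifying only the min bound can be illustrated with a toy model. This is a sketch under stated assumptions, not the server's algorithm: documents are modeled as (key, size) pairs with unique integer keys, and the function names and the byte budget are hypothetical.

```python
from typing import List, Optional, Tuple

Doc = Tuple[int, int]  # (shard key value, document size in bytes)


def pick_max_bound(docs: List[Doc], min_key: int, max_chunk_bytes: int) -> Optional[int]:
    """Return the exclusive upper key for the next move, or None to take the rest.

    Toy stand-in for the donor shard's decision. Assumes unique keys.
    """
    moved = 0
    for key, size in sorted(docs):
        if key < min_key:
            continue
        # Always take at least one document, then stop once the budget is exceeded.
        if moved > 0 and moved + size > max_chunk_bytes:
            return key  # chop here: documents from this key on stay for later
        moved += size
    return None


def drain(docs: List[Doc], max_chunk_bytes: int) -> List[List[int]]:
    """Move a chunk off in bounded pieces; returns the keys moved per migration."""
    remaining = sorted(docs)
    migrations = []
    while remaining:
        min_key = remaining[0][0]
        bound = pick_max_bound(remaining, min_key, max_chunk_bytes)
        batch = [k for k, _ in remaining if bound is None or k < bound]
        remaining = [(k, s) for k, s in remaining if bound is not None and k >= bound]
        migrations.append(batch)
    return migrations


# Ten 40-byte documents with a 100-byte budget per migration.
docs = [(k, 40) for k in range(10)]
pieces = drain(docs, max_chunk_bytes=100)
print(pieces)  # [[0, 1], [2, 3], [4, 5], [6, 7], [8, 9]]
```

Each migration in this model moves a bounded amount of data, so a big chunk leaves the shard as a series of short migrations instead of one long migration.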



 Comments   
Comment by Githook User [ 28/Apr/23 ]

Author:

{'name': 'Pierlauro Sciarelli', 'email': 'pierlauro.sciarelli@mongodb.com', 'username': 'pierlauro'}

Message: SERVER-76550 Shards undergoing draining must chop big chunks to move them off
Branch: v7.0
https://github.com/mongodb/mongo/commit/e33d3b834e3c1dec4b5a543991b8630c6e60ed67

Comment by Githook User [ 28/Apr/23 ]

Author:

{'name': 'Pierlauro Sciarelli', 'email': 'pierlauro.sciarelli@mongodb.com', 'username': 'pierlauro'}

Message: SERVER-76550 Shards undergoing draining must chop big chunks to move them off
Branch: v6.3
https://github.com/mongodb/mongo/commit/41b2c1046d08a4d8c870741cd25d7bf1a28e0374

Comment by Githook User [ 28/Apr/23 ]

Author:

{'name': 'Pierlauro Sciarelli', 'email': 'pierlauro.sciarelli@mongodb.com', 'username': 'pierlauro'}

Message: SERVER-76550 Shards undergoing draining must chop big chunks to move them off
Branch: v6.0
https://github.com/mongodb/mongo/commit/26b4851a412cc8b9b4a18cdb6cd0f9f642e06aa7

Comment by Githook User [ 27/Apr/23 ]

Author:

{'name': 'Pierlauro Sciarelli', 'email': 'pierlauro.sciarelli@mongodb.com', 'username': 'pierlauro'}

Message: SERVER-76550 Shards undergoing draining must chop big chunks to move them off
Branch: master
https://github.com/mongodb/mongo/commit/5a643b932ee85c837cca27c9e6db70d8abc0bfe8
