[SERVER-42914] Implement random chunk selection policy for balancer for use in concurrency_*_with_balancer workloads
Created: 20/Aug/19  Updated: 29/Oct/23  Resolved: 28/Aug/19

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: None
Fix Version/s: 3.6.16, 4.3.1, 4.2.3, 4.0.14

Type: Task
Priority: Major - P3
Reporter: Alexander Taskov (Inactive)
Assignee: Alexander Taskov (Inactive)
Resolution: Fixed
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Depends
is depended on by SERVER-32692 Make zbigMapReduce.js, sharding_balan... Closed
is depended on by SERVER-43099 Reenable random chunk migration failp... Closed
Related
related to SERVER-40713 Enable fsm workloads that use moveChu... Closed
Backwards Compatibility: Fully Compatible
Backport Requested: v4.2, v4.0, v3.6
Sprint: Sharding 2019-08-26, Sharding 2019-09-09
Participants:
Linked BF Score: 19

 Description   

Currently, in the concurrency_*_with_balancer suites, even when the balancer is on, it may not move any chunks if data is not imbalanced across shards. As a result, we get less coverage than we would like of migrations running concurrently with other operations.

To enable our concurrency tests to have better coverage of migrations during stepdowns, we would like the balancer to cause more frequent migrations. Adding a failpoint which causes the balancer to always move a random chunk (ignoring the current data distribution) will ensure that the balancer is consistently moving chunks throughout the execution of the concurrency workload.

The alternative in the past has been to use the moveChunk command in one of the state transitions of the workload; however, moveChunk operations are currently banned for the with_stepdowns suites (which will be addressed in the future in SERVER-40713).
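
To make the proposed policy concrete, here is a minimal sketch of random chunk selection that ignores the current data distribution. It is not the server implementation (the balancer itself is C++), and every name in it (Chunk, Migration, select_random_migration) is hypothetical and chosen only for illustration.

```python
import random
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Chunk:
    # Hypothetical, simplified chunk: a key range owned by one shard.
    min_key: int
    max_key: int
    shard: str


@dataclass(frozen=True)
class Migration:
    # Hypothetical migration request: move `chunk` to `to_shard`.
    chunk: Chunk
    to_shard: str


def select_random_migration(
    chunks: Sequence[Chunk],
    shards: Sequence[str],
    rng: random.Random,
) -> Optional[Migration]:
    """Pick a chunk uniformly at random and a destination shard other than
    its current owner, without looking at how data is distributed."""
    if not chunks:
        return None
    chunk = rng.choice(chunks)
    # Any shard except the current owner is a valid destination.
    candidates = [s for s in shards if s != chunk.shard]
    if not candidates:
        return None
    return Migration(chunk=chunk, to_shard=rng.choice(candidates))
```

Under these assumptions, calling select_random_migration once per balancer round would yield a migration on every round as long as there is at least one chunk and at least two shards, which is the behaviour the failpoint is meant to provide for the concurrency workloads.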



 Comments   
Comment by Githook User [ 09/Jan/20 ]

Author: Alex Taskov <alex.taskov@mongodb.com> (alextaskov)

Message: SERVER-42914 Implement random chunk selection policy for balancer for use in concurrency_*_with_balancer workloads

(cherry picked from commit 5e14accc4ebe76366d7d2747fd30b603bf02eac2)
Branch: v4.2
https://github.com/mongodb/mongo/commit/02657b811d5da15abf2158ac75126115d6954106

Comment by Githook User [ 18/Nov/19 ]

Author: Matthew Saltz <matthew.saltz@mongodb.com> (saltzm)

Message: SERVER-42914 Add failpoint to override balancer round interval (partial cherry pick)

(cherry picked from commit 5e14accc4ebe76366d7d2747fd30b603bf02eac2)
(cherry picked from commit 13b7634057ea8f3278176f15e1299a52d7e0cdc8)
Branch: v4.0
https://github.com/mongodb/mongo/commit/6a2cc7922255ad7538b10be3d288b44bdb6a3dd6

Comment by Githook User [ 04/Nov/19 ]

Author: Matthew Saltz <matthew.saltz@mongodb.com> (saltzm)

Message: SERVER-42914 Add failpoint to override balancer round interval (partial cherry pick)

(cherry picked from commit 5e14accc4ebe76366d7d2747fd30b603bf02eac2)
Branch: v3.6
https://github.com/mongodb/mongo/commit/13b7634057ea8f3278176f15e1299a52d7e0cdc8

Comment by Alexander Taskov (Inactive) [ 28/Aug/19 ]

https://github.com/mongodb/mongo/commit/5e14accc4ebe76366d7d2747fd30b603bf02eac2

Comment by Githook User [ 28/Aug/19 ]

Author: Alex Taskov <alex.taskov@mongodb.com> (alextaskov)

Message: SERVER-42914 Implement random chunk selection policy for balancer for use in concurrency_*_with_balancer workloads
Branch: master
https://github.com/mongodb/mongo/commit/5e14accc4ebe76366d7d2747fd30b603bf02eac2

Comment by Kaloian Manassiev [ 27/Aug/19 ]

This ticket introduces some arbitrary testing functionality into the balancer that has no relevance to any actual balancer use case. I am pretty sure this goes against our server engineering guidelines, which say not to introduce large test-only behaviours in the server. It would be much better done in the testing framework than compensated for in the balancer.

Before you push this code review, let's talk.
