[SERVER-77868] Balancer secondary thread should reset state on step up
Created: 07/Jun/23  Updated: 29/Oct/23  Resolved: 09/Jun/23

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: 7.1.0-rc0, 6.0.6, 7.0.0-rc2
Fix Version/s: 7.1.0-rc0, 6.0.7, 7.0.0-rc4

Type: Bug
Priority: Major - P3
Reporter: Allison Easton
Assignee: Allison Easton
Resolution: Fixed
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Depends
Problem/Incident
causes SERVER-78659 The secondary thread of the Balancer ... Closed
Related
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested: v7.0, v6.0
Sprint: Sharding EMEA 2023-06-12
Participants:
Linked BF Score: 137

 Description   

The balancer's secondary thread uses a ScopedTaskExecutor to schedule the callbacks for the actions it submits, so those continuations are canceled on step down, when the secondary thread exits. However, _outstandingStreamingOps, which is decremented only in those callbacks, is never cleared. If 20 operations are outstanding across all collections at step down, then on the next step up of that node the counter starts at 20 and counts up from there. Across multiple step downs and step ups of the same node, the leaked count can accumulate until it reaches kMaxOutstandingStreamingOperations, at which point the node will never issue any more defragmentation operations.
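
The accumulation can be illustrated with a toy model. This is only a sketch, not the server's code: LeakyCounterModel, tryIssue, stepDownThenStepUp and runTerm are hypothetical names, and the cap of 50 is an assumed value chosen for the example (the ticket does not state the real value of kMaxOutstandingStreamingOperations).

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical cap for illustration; the ticket does not state the real value
// of kMaxOutstandingStreamingOperations.
constexpr std::size_t kMaxOutstandingStreamingOperations = 50;

// Toy stand-in for the policy's counter (not the server's actual class).
struct LeakyCounterModel {
    std::size_t outstandingStreamingOps = 0;

    // Issue one operation if the cap allows it; returns whether it was issued.
    bool tryIssue() {
        if (outstandingStreamingOps >= kMaxOutstandingStreamingOperations) {
            return false;
        }
        ++outstandingStreamingOps;
        return true;
    }

    // Step down cancels the completion callbacks, so nothing decrements the
    // counter, and (this is the bug being modeled) nothing resets it either.
    void stepDownThenStepUp() {}
};

// Runs one term: tries to issue `attempts` operations, none of which complete
// before the node steps down. Returns how many were actually issued.
inline std::size_t runTerm(LeakyCounterModel& model, std::size_t attempts) {
    std::size_t issued = 0;
    for (std::size_t i = 0; i < attempts; ++i) {
        if (model.tryIssue()) {
            ++issued;
        }
    }
    model.stepDownThenStepUp();
    return issued;
}
```

Under these assumptions, calling runTerm(model, 20) four times in a row issues 20, 20, 10 and then 0 operations: the counter never goes back down, so the fourth term starts already at the cap.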

The fix is as simple as resetting _outstandingStreamingOps to 0 on step up. The auto merger policy uses a similar approach to cap its outstanding merges, so it likely has the same issue with _outstandingActions; that value should also be set to 0 on initialization.
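
A minimal sketch of the reset, under stated assumptions: OutstandingOpsThrottle, ToyBalancerPolicies and their methods are hypothetical names used only for illustration, the cap of 50 is an assumed value, and the two members merely model the roles of _outstandingStreamingOps and _outstandingActions. The actual change is in the linked commits.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical throttle that caps the number of in-flight operations.
class OutstandingOpsThrottle {
public:
    explicit OutstandingOpsThrottle(std::size_t cap) : _cap(cap) {}

    // Reserve a slot for a new operation; returns false once the cap is hit.
    bool tryAcquire() {
        if (_outstanding >= _cap) {
            return false;
        }
        ++_outstanding;
        return true;
    }

    // Called from a completion callback. Callbacks canceled on step down
    // never reach this, which is why a reset is needed.
    void release() {
        if (_outstanding > 0) {
            --_outstanding;
        }
    }

    // Discard any count left over from a previous term.
    void resetOnStepUp() {
        _outstanding = 0;
    }

    std::size_t outstanding() const {
        return _outstanding;
    }

private:
    std::size_t _cap;
    std::size_t _outstanding = 0;
};

// Toy container for the two counters mentioned in the ticket.
struct ToyBalancerPolicies {
    OutstandingOpsThrottle defragmentation{50};  // models _outstandingStreamingOps
    OutstandingOpsThrottle autoMerger{50};       // models _outstandingActions

    // Reset both counters whenever this node becomes primary again.
    void onStepUp() {
        defragmentation.resetOnStepUp();
        autoMerger.resetOnStepUp();
    }
};
```

Resetting at step up rather than at step down is a deliberate choice in this sketch: it does not depend on the step down path having run to completion, so every term starts from a known count of zero.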



 Comments   
Comment by Githook User [ 13/Jun/23 ]

Author: Allison Easton <allison.easton@mongodb.com> (allisoneaston)

Message: SERVER-77868 Balancer secondary thread should reset state on step up
(cherry picked from commit c80210290982aaf7711993b98a1c39e02586f8a4)
Branch: v6.0
https://github.com/mongodb/mongo/commit/8a97351e0fb6d21b1d9d431a66c1a1da833b0d8d

Comment by Githook User [ 12/Jun/23 ]

Author: Allison Easton <allison.easton@mongodb.com> (allisoneaston)

Message: SERVER-77868 Balancer secondary thread should reset state on step up

(cherry picked from commit c80210290982aaf7711993b98a1c39e02586f8a4)
Branch: v7.0
https://github.com/mongodb/mongo/commit/5ee2b07959ba47a748a4cffdba9521a028e28a2f

Comment by Githook User [ 08/Jun/23 ]

Author: Allison Easton <allison.easton@mongodb.com> (allisoneaston)

Message: SERVER-77868 Balancer secondary thread should reset state on step up
Branch: master
https://github.com/mongodb/mongo/commit/c80210290982aaf7711993b98a1c39e02586f8a4
