[SERVER-70322] BalancerStatsRegistry inappropriately constructs ResourceId outside ResourceIdFactory Created: 07/Oct/22  Updated: 01/Nov/22  Resolved: 01/Nov/22

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: 6.0.0
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Jordi Serra Torrens Assignee: Tommaso Tocci
Resolution: Duplicate Votes: 0
Labels: sharding-wfbf-day
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: Text File 0001-Repro-SERVER-70322.patch    
Issue Links:
Depends
is depended on by SERVER-70467 Prevent manually using RESOURCE_MUTEX Closed
Duplicate
duplicates SERVER-70864 Get rid of fine grained scoped range ... Closed
Related
related to SERVER-70154 DatabaseShardingState inappropriately... Closed
related to SERVER-70431 Checkpointing manually constructs Res... Closed
related to SERVER-69893 Add the ability to acquire lock manag... Closed
is related to SERVER-70536 Invalid construction of "checkpoint" ... Closed
Operating System: ALL
Participants:

 Description   

Constructing `ResourceId`s outside of ResourceIdFactory is unsafe – see SERVER-70154 (which is an analogous situation) for more detail.

BalancerStatsRegistry constructs a ResourceId here. This can crash the server.



 Comments   
Comment by Tommaso Tocci [ 01/Nov/22 ]

Fixed by SERVER-70864

Comment by Marcos José Grillo Ramirez [ 19/Oct/22 ]

Passing it to pierlauro.sciarelli@mongodb.com because he has a more general solution proposal.

Comment by Jordi Serra Torrens [ 07/Oct/22 ]

SERVER-69893 proposed a new LockManager lock type that represents a mutex lockable by name. Using that would be appropriate for this case; however, the current decision is not to introduce that new LockManager type.

On master, we believe there is actually no need for BalancerStatsRegistry to use that RESOURCE_MUTEX because the only usage of the "weak" variant of ScopedRangeDeleterLock is here, and I believe it would be enough to lock the rangeDeleter collection in MODE_IX, because:
(1) It already conflicts with the "strong" variant.
(2) The work currently done under that lock does not need to serialize on a particular collection uuid.
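
To illustrate point (1), here is a minimal, self-contained sketch of the standard multi-granularity lock compatibility matrix. It is not MongoDB's LockManager code: the `Mode` enum, `kCompatible`, `conflicts`, and `checkWeakVsStrong` are hypothetical names invented for this example, and it assumes the "strong" variant takes the rangeDeleter collection in an exclusive mode (X). Under that assumption, a MODE_IX holder conflicts with the strong variant while other MODE_IX holders proceed concurrently.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for lock manager modes; not MongoDB's actual enum.
enum class Mode : std::size_t { IS = 0, IX = 1, S = 2, X = 3 };

// Standard multi-granularity compatibility matrix.
// kCompatible[a][b] == true means modes a and b may be held at the same time.
constexpr bool kCompatible[4][4] = {
    /* IS */ {true, true, true, false},
    /* IX */ {true, true, false, false},
    /* S  */ {true, false, true, false},
    /* X  */ {false, false, false, false},
};

// Returns true when a request in `requested` must wait for a holder in `held`.
constexpr bool conflicts(Mode held, Mode requested) {
    return !kCompatible[static_cast<std::size_t>(held)]
                       [static_cast<std::size_t>(requested)];
}

// Checks the two properties the comment relies on, assuming the "strong"
// variant locks the collection in X (an assumption of this sketch).
inline void checkWeakVsStrong() {
    assert(conflicts(Mode::IX, Mode::X));    // weak (IX) is blocked by strong (X)
    assert(!conflicts(Mode::IX, Mode::IX));  // weak holders do not block each other
}
```

If the strong variant instead used a shared mode (S), property (1) would still hold, since IX and S also conflict in this matrix.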

On v6.0, the situation is more complex, since the weak variant is used in the FCV upgrade procedure. More analysis is needed to find a solution.

(cc allison.easton@mongodb.com)
