[SERVER-61742] Balancer may trip invariants due to concurrent usage of opCtx from different threads Created: 25/Nov/21  Updated: 29/Oct/23  Resolved: 01/Dec/21

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: None
Fix Version/s: 5.2.0

Type: Bug Priority: Major - P3
Reporter: Jordi Serra Torrens Assignee: Paolo Polato
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
is depended on by SERVER-43099 Reenable random chunk migration failp... Closed
Problem/Incident
is caused by SERVER-61113 _configsvrMoveChunk does not gossip t... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Participants:

 Description   

Balancer::_moveChunks can submit several asynchronous moveChunk requests concurrently. Each of these calls to BalancerCommandsSchedulerImpl::requestMoveChunk uses the same opCtx. BalancerCommandsSchedulerImpl::requestMoveChunk returns a SemiFuture that stores a copy of the opCtx pointer and, as part of its execution, uses that opCtx to call processCommandResponse, which in turn calls setLastOpToSystemLastOpTime. That function takes locks and thus alters the lock state of the opCtx. As a result, several threads can concurrently alter the opCtx's state, which can lead to tripping invariants.

setLastOpToSystemLastOpTime should not be called from the executor callbacks.
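
As a rough illustration of the pattern suggested above, the following is a minimal standalone sketch, not the server code and not the committed fix. FakeOpCtx, requestMoveChunkSketch and moveChunksSketch are hypothetical names invented for this example. The asynchronous callbacks never receive the context, and the only state change on it happens on the thread that submitted the requests.

```cpp
#include <cassert>
#include <future>
#include <thread>
#include <vector>

// Hypothetical stand-in for an operation context whose state may only be
// touched by the thread that owns it. This is NOT MongoDB's OperationContext.
struct FakeOpCtx {
    std::thread::id owner = std::this_thread::get_id();
    int lastOpBumps = 0;

    // Stand-in for work that alters per-context state (e.g. taking locks).
    void bumpLastOp() {
        assert(std::this_thread::get_id() == owner);  // the "invariant"
        ++lastOpBumps;
    }
};

// Hypothetical asynchronous request. The callback runs on another thread and
// deliberately never sees the FakeOpCtx; it only reports a result.
std::future<bool> requestMoveChunkSketch(int chunkId) {
    return std::async(std::launch::async, [chunkId] { return chunkId >= 0; });
}

// Submit every request first, then update the context from the calling
// (owning) thread only, once per response.
int moveChunksSketch(FakeOpCtx& opCtx, const std::vector<int>& chunkIds) {
    std::vector<std::future<bool>> responses;
    for (int id : chunkIds) {
        responses.push_back(requestMoveChunkSketch(id));
    }

    int succeeded = 0;
    for (auto& response : responses) {
        if (response.get()) {
            ++succeeded;
        }
        opCtx.bumpLastOp();  // executed on the owner thread, never in a callback
    }
    return succeeded;
}
```

Calling moveChunksSketch from the thread that created the FakeOpCtx keeps the ownership check satisfied, because no callback ever mutates the context. Moving the bumpLastOp call into the lambda would reproduce the kind of cross-thread access described in this ticket.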



 Comments   
Comment by Githook User [ 01/Dec/21 ]

Author: Paolo Polato (ppolato) <paolo.polato@mongodb.com>

Message: SERVER-61742 Manually bump opTime on every moveChunk issued on mongos
Branch: master
https://github.com/mongodb/mongo/commit/630b966f4d6f502aeb30ff4706da60862a2f2b12
