[SERVER-20559] Race condition in shard registry during concurrent sharding operations Created: 22/Sep/15  Updated: 07/Oct/15  Resolved: 23/Sep/15

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: 3.1.8
Fix Version/s: 3.1.9

Type: Bug Priority: Major - P3
Reporter: Esha Maharishi (Inactive) Assignee: Esha Maharishi (Inactive)
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
is related to SERVER-19929 Audit sharding code for potential use... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Steps To Reproduce:

This bug was uncovered by the FSM concurrency suite. To reproduce more simply, run moveChunk, splitChunk, and mergeChunks commands at a high degree of concurrency (>20 threads).

Sprint: TIG A (10/09/15)
Participants:

 Description   

The moveChunk command routinely reloads all shards in the shard registry, which clears the shard registry's ShardMap objects. The ShardMap objects contain shared pointers to Shard objects, so each reload deletes every Shard object that no other shared pointer still refers to.

Other sharding commands, such as splitChunk and mergeChunks, also obtain shared pointers to these Shard objects in order to get the Shard's RemoteCommandTargeter object, which is owned by the Shard. These commands release the shared pointer to the Shard object but continue to use the RemoteCommandTargeter. If the Shard is deleted during a concurrent moveChunk, its RemoteCommandTargeter is deleted along with it, leaving the splitChunk or mergeChunks command with a dangling reference to the deleted RemoteCommandTargeter. When the command then attempts to use the RemoteCommandTargeter, a use-after-free occurs.
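The lifetime problem described above can be modelled in a few lines. This is only a minimal, single-threaded sketch: the types below are simplified, hypothetical stand-ins for the real Shard, RemoteCommandTargeter, and shard registry classes, and the helper names (buggyTargeter, reloadInvalidatesRawTargeter) are invented for illustration. A weak_ptr is used to observe that the Shard has been destroyed, so the dangling pointer is never actually dereferenced.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Hypothetical, simplified stand-ins for the real classes.
struct RemoteCommandTargeter {
    std::string host;
};

struct Shard {
    explicit Shard(const std::string& host)
        : targeter(new RemoteCommandTargeter{host}) {}
    std::unique_ptr<RemoteCommandTargeter> targeter;  // owned by the Shard
};

class ShardRegistry {
public:
    // Models the reload done by moveChunk: the map is cleared, so the
    // registry's shared_ptrs to the old Shard objects are released.
    void reload() {
        _shards.clear();
        _shards["shard0000"] = std::make_shared<Shard>("host-a:27017");
    }

    std::shared_ptr<Shard> getShard(const std::string& id) const {
        auto it = _shards.find(id);
        if (it == _shards.end()) {
            return nullptr;
        }
        return it->second;
    }

private:
    std::map<std::string, std::shared_ptr<Shard>> _shards;
};

// Mirrors the problematic pattern: the shared_ptr to the Shard goes out of
// scope on return, and only a raw pointer to the targeter survives.
RemoteCommandTargeter* buggyTargeter(const ShardRegistry& registry,
                                     const std::string& id) {
    std::shared_ptr<Shard> shard = registry.getShard(id);
    return shard ? shard->targeter.get() : nullptr;
}

// Returns true if a reload destroys the Shard that the raw pointer came from.
bool reloadInvalidatesRawTargeter() {
    ShardRegistry registry;
    registry.reload();

    std::weak_ptr<Shard> observer = registry.getShard("shard0000");
    RemoteCommandTargeter* raw = buggyTargeter(registry, "shard0000");
    assert(raw != nullptr);

    registry.reload();  // stands in for a concurrent moveChunk

    // `raw` now dangles; dereferencing it would be a use-after-free,
    // so this sketch only checks that the owning Shard is gone.
    (void)raw;
    return observer.expired();
}
```

In this model, reloadInvalidatesRawTargeter() returns true: once the registry drops its reference, nothing keeps the old Shard (and therefore its targeter) alive.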

Potential fix: remove the intermediate _targeter() method so that the shared_ptr to the Shard stays in scope for as long as the RemoteCommandTargeter is in use.
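The idea behind the potential fix can be sketched as follows. This is not the code from the actual commit; it reuses simplified, hypothetical stand-ins for the real classes, and findHostSafely is an invented helper. It is also single-threaded, so the reload is simulated inline and the locking a real registry would need is omitted. The point it illustrates is that the caller keeps its own shared_ptr to the Shard for as long as it touches the targeter.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Hypothetical, simplified stand-ins for the real classes.
struct RemoteCommandTargeter {
    std::string host;
};

struct Shard {
    explicit Shard(const std::string& host)
        : targeter(new RemoteCommandTargeter{host}) {}
    std::unique_ptr<RemoteCommandTargeter> targeter;  // owned by the Shard
};

class ShardRegistry {
public:
    // Models the reload done by moveChunk: old entries are dropped.
    void reload() {
        _shards.clear();
        _shards["shard0000"] = std::make_shared<Shard>("host-a:27017");
    }

    std::shared_ptr<Shard> getShard(const std::string& id) const {
        auto it = _shards.find(id);
        if (it == _shards.end()) {
            return nullptr;
        }
        return it->second;
    }

private:
    std::map<std::string, std::shared_ptr<Shard>> _shards;
};

// The caller holds the shared_ptr to the Shard while it uses the targeter,
// so a reload in the meantime cannot destroy the Shard underneath it.
std::string findHostSafely(ShardRegistry& registry, const std::string& id) {
    std::shared_ptr<Shard> shard = registry.getShard(id);
    if (!shard) {
        return "";
    }

    registry.reload();  // simulated concurrent reload; registry drops its reference

    // Still valid: `shard` keeps the old Shard and its targeter alive.
    assert(shard->targeter != nullptr);
    return shard->targeter->host;
}
```

With the shared_ptr held in the same scope as the targeter access, the old Shard is destroyed only when `shard` goes out of scope at the end of the function.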



 Comments   
Comment by Githook User [ 23/Sep/15 ]

Author: Esha Maharishi <esha.maharishi@mongodb.com>

Message: SERVER-20559 race condition in shard registry during concurrent sharding operations

Closes #1026

Signed-off-by: Kamran Khan <kamran.khan@mongodb.com>
Branch: master
https://github.com/mongodb/mongo/commit/50c61232e48fc0fed1322d8bdfc2c600649737ae
