[SERVER-37591] MigrationSourceManager is not exception safe before `startClone` completes Created: 12/Oct/18  Updated: 29/Oct/23  Resolved: 06/Nov/18

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: 4.1.4
Fix Version/s: 4.0.7, 4.1.5

Type: Bug Priority: Major - P3
Reporter: Kaloian Manassiev Assignee: Matthew Saltz (Inactive)
Resolution: Fixed Votes: 0
Labels: neweng, sharding-wfbf-day
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Depends
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested: v4.0, v3.6
Sprint: Sharding 2018-10-22, Sharding 2018-11-19
Participants:
Linked BF Score: 56

 Description   

The MigrationSourceManager::cleanupOnError logic invariants that the MSM has been installed on the specified collection before cleaning it up. However, this may not be the case if an exception or early return occurs early in the execution of MigrationSourceManager::startClone, such as here or here.
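
The failure mode can be illustrated with a minimal, self-contained sketch. It is not the actual server code: `CollectionState`, `MigrationSketch`, `cleanupIfInstalled`, and `runMigrationAttempt` are hypothetical names chosen for illustration, and the real MigrationSourceManager and CSR interfaces differ. The sketch only shows the general idea of tracking whether installation happened, so that cleanup removes the manager from the collection state only when necessary.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-in for the per-collection state (the "CSR") that
// records which migration manager, if any, is currently installed.
struct CollectionState {
    const void* installedManager = nullptr;
};

// Hypothetical stand-in for the migration source manager (the "MSM").
class MigrationSketch {
public:
    explicit MigrationSketch(CollectionState& csr) : _csr(csr) {}

    // Mimics a startClone that can fail before the manager is installed.
    void startClone(bool failEarly) {
        if (failEarly) {
            throw std::runtime_error("failure before installation");
        }
        _csr.installedManager = this;
        _installed = true;
    }

    // Unsafe variant: asserts that installation always happened, which does
    // not hold if startClone threw or returned early.
    void cleanupAssumingInstalled() {
        assert(_csr.installedManager == this);
        _csr.installedManager = nullptr;
        _installed = false;
    }

    // Safer variant: uninstall only if this manager was actually installed.
    void cleanupIfInstalled() {
        if (!_installed) {
            return;
        }
        _csr.installedManager = nullptr;
        _installed = false;
    }

private:
    CollectionState& _csr;
    bool _installed = false;
};

// Runs one migration attempt and reports whether the collection state is
// clean afterwards (no manager left installed).
bool runMigrationAttempt(bool failEarly) {
    CollectionState csr;
    MigrationSketch msm(csr);
    try {
        msm.startClone(failEarly);
    } catch (const std::exception&) {
        // Fall through to cleanup, as an error path would.
    }
    msm.cleanupIfInstalled();
    return csr.installedManager == nullptr;
}
```

In this sketch, `runMigrationAttempt(true)` exercises the early-failure path; calling `cleanupAssumingInstalled()` there instead would trip the assertion, which is analogous to the invariant failure described above.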



 Comments   
Comment by Githook User [ 11/Feb/19 ]

Author: Matthew Saltz <matthew.saltz@mongodb.com> (saltzm)

Message: SERVER-37591 Change MigrationSourceManager cleanup to only remove the MSM from the CSR when necessary

(cherry picked from commit 77823d2a5267b1b7917190e095f2a7243ad32a76)
Branch: v4.0
https://github.com/mongodb/mongo/commit/4f66cab740b95145c9ae87613803a91d67302408

Comment by Kaloian Manassiev [ 07/Nov/18 ]

Looking at the code, the exact same issue technically wouldn't exist in older versions, because in 4.0 this line is ignored, whereas in master we do an early return.

However, I also noticed this early return here, which could happen if the collection were dropped immediately after the migration started. We don't have tests that do that, though, so we have never seen it in our tests.

I'd say we should backport it to 4.0, and to 3.6 only if it backports cleanly (since the migration manager changed).

I am approving the 4.0 backport.

Comment by Matthew Saltz (Inactive) [ 06/Nov/18 ]

Not necessarily, but the issue does exist in the code in older versions.

Comment by Gregory McKeon (Inactive) [ 06/Nov/18 ]

matthew.saltz we've only seen this on master - do we need to backport?

Comment by Githook User [ 06/Nov/18 ]

Author: Matthew Saltz <matthew.saltz@mongodb.com> (saltzm)

Message: SERVER-37591 Change MigrationSourceManager cleanup to only remove the MSM from the CSR when necessary
Branch: master
https://github.com/mongodb/mongo/commit/77823d2a5267b1b7917190e095f2a7243ad32a76
