[SERVER-81229] Move primary may not cleanup cloned collections on failure Created: 20/Sep/23  Updated: 24/Nov/23  Resolved: 22/Nov/23

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: 7.0.1, 7.1.0-rc3, 7.2.0-rc1
Fix Version/s: 7.3.0-rc0, 7.2.0-rc2, 7.0.5

Type: Bug Priority: Major - P3
Reporter: Silvia Surroca Assignee: Marcos José Grillo Ramirez
Resolution: Fixed Votes: 0
Labels: shardingemea-qw
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Depends
Related
related to SERVER-83230 Kill clone operation of move primary ... Closed
Assigned Teams:
Sharding EMEA
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v7.2, v7.0
Sprint: Sharding EMEA 2023-10-16, Sharding EMEA 2023-10-30, CAR Team 2023-11-13, CAR Team 2023-11-27
Participants:
Linked BF Score: 20
Story Points: 3

 Description   

TL;DR: if the primary node of the donor shard steps down while the cloning phase of a movePrimary operation is in progress, the cloning procedure on the recipient side is not aborted. This leaves orphaned collections on the recipient and causes any subsequent attempt to repeat the movePrimary operation to fail with a NamespaceExists error.

Technical details

During the cloning phase of the movePrimary operation, the DDL coordinator calls the _shardsvrCloneCatalogData command on the recipient, which creates the donor's unsharded collections on the recipient and fetches their contents from the donor. In the event of a failure (step-down) during this phase, the coordinator drops any data already cloned on the recipient and aborts the movePrimary operation.

The bug is that the coordinator does not abort the cloning procedure that may still be running on the recipient. Cleaning up the data already cloned on the recipient does not resolve the problem, because the cloning procedure may still be running in the background and can create collections after the cleanup has finished.
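
The race can be illustrated with a toy model. This is only a sketch in Python, not server code: the function names, the set standing in for the recipient's catalog, and the collection names are all hypothetical. The cloner is modelled as a generator so that the interleaving is deterministic.

```python
def clone_catalog_data(donor_collections, recipient):
    """Toy stand-in for the recipient-side cloner: one collection per step."""
    for name in donor_collections:
        recipient.add(name)  # "create and fetch" the collection
        yield name


def coordinator_cleanup(donor_collections, recipient):
    """Toy stand-in for the coordinator dropping cloned data on failure."""
    for name in donor_collections:
        recipient.discard(name)


donor = ["db.a", "db.b", "db.c"]  # hypothetical unsharded collections
recipient = set()

cloner = clone_catalog_data(donor, recipient)
next(cloner)  # db.a is cloned, then the donor primary steps down

# The coordinator cleans up but never aborts the cloner.
coordinator_cleanup(donor, recipient)

# The cloner keeps running in the background and clones the rest.
for _ in cloner:
    pass

orphans = sorted(recipient)

# A retried movePrimary would find namespaces that already exist.
retry_hits_existing_namespace = any(name in recipient for name in donor)
```

In this model the recipient ends up holding db.b and db.c after the cleanup, which corresponds to the orphaned collections that make a retry fail.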

User impacts

The recipient shard may be left owning orphaned collections, which cause any attempt to repeat the movePrimary operation to fail. There is no evident business impact (data remains consistent), but manual intervention on the recipient is required to drop these orphaned collections before a new movePrimary attempt can succeed.

Potential solution

The cloning phase of a movePrimary is very expensive in terms of execution time (in production it could take hours), so the cloning operation on the recipient side must be aborted rather than joined. One idea is to tag the _shardsvrCloneCatalogData operation and kill it (using the tag) when the movePrimary operation is recovered by the coordinator, before cleaning up any cloned data.
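
The tag-and-kill idea can be sketched in the same toy style. This is an illustration in Python under stated assumptions, not the committed fix (the commit message refers to replay protection on the clone command). OperationRegistry, OperationKilled, recover_move_primary, and the tag value are hypothetical names. The sketch also assumes the kill takes effect before the cloner's next step, which a real implementation would have to guarantee.

```python
class OperationKilled(Exception):
    """Raised inside the cloner once its tag has been killed."""


class OperationRegistry:
    """Hypothetical registry that records which operation tags were killed."""

    def __init__(self):
        self._killed = set()

    def kill(self, tag):
        self._killed.add(tag)

    def is_killed(self, tag):
        return tag in self._killed


def clone_catalog_data(tag, registry, donor_collections, recipient):
    """Toy cloner that checks for a kill before cloning each collection."""
    for name in donor_collections:
        if registry.is_killed(tag):
            raise OperationKilled(tag)
        recipient.add(name)
        yield name


def recover_move_primary(tag, registry, donor_collections, recipient):
    """On recovery: kill the tagged clone first, then clean up cloned data."""
    registry.kill(tag)
    for name in donor_collections:
        recipient.discard(name)


registry = OperationRegistry()
donor = ["db.a", "db.b", "db.c"]  # hypothetical unsharded collections
recipient = set()
tag = "movePrimary-db"  # hypothetical tag value

cloner = clone_catalog_data(tag, registry, donor, recipient)
next(cloner)  # db.a is cloned, then the donor primary steps down

recover_move_primary(tag, registry, donor, recipient)

try:
    next(cloner)
    cloner_stopped = False
except OperationKilled:
    cloner_stopped = True
```

Because the kill happens before the cleanup, the cloner stops at its next step and the recipient stays empty, so a retry would not hit an existing namespace in this model.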



 Comments   
Comment by Githook User [ 24/Nov/23 ]

Author:

{'name': 'Marcos José Grillo Ramirez', 'email': 'marcos.grillo@mongodb.com', 'username': 'm4nti5'}

Message: SERVER-81229 Add replay protection to clone command of move primary

(cherry picked from commit 87497867081849cd2bd727d70635484f205ddfcd)
Branch: v7.0
https://github.com/mongodb/mongo/commit/8bd433723b427e4defc5ccab46416a57fc3bfb07

Comment by Githook User [ 24/Nov/23 ]

Author:

{'name': 'Marcos José Grillo Ramirez', 'email': 'marcos.grillo@mongodb.com', 'username': 'm4nti5'}

Message: SERVER-81229 Add replay protection to clone command of move primary

(cherry picked from commit 87497867081849cd2bd727d70635484f205ddfcd)
Branch: v7.2
https://github.com/mongodb/mongo/commit/de3690dbb445e500d18bb48273bc1cb420c8e25f

Comment by Githook User [ 21/Nov/23 ]

Author:

{'name': 'Marcos José Grillo Ramirez', 'email': 'marcos.grillo@mongodb.com', 'username': 'm4nti5'}

Message: SERVER-81229 Add replay protection to clone command of move primary
Branch: master
https://github.com/mongodb/mongo/commit/87497867081849cd2bd727d70635484f205ddfcd
