[SERVER-77748] movePrimary coordinator does not clear database metadata in case of stepdown Created: 02/Jun/23  Updated: 29/Oct/23  Resolved: 15/Jun/23

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: 7.0.0-rc2
Fix Version/s: 7.1.0-rc0, 7.0.0-rc4

Type: Bug Priority: Major - P3
Reporter: Tommaso Tocci Assignee: Enrico Golfieri
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Depends
Problem/Incident
is caused by SERVER-71308 Enable featureFlag for resilient move... Closed
Assigned Teams:
Sharding EMEA
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v7.0
Steps To Reproduce:

Trigger a stepdown on the coordinator shard while the movePrimary coordinator is running, after the kCommit phase has completed and before the kExitCriticalSection phase begins.

Sprint: Sharding EMEA 2023-06-12, Sharding EMEA 2023-06-26
Participants:
Linked BF Score: 113

 Description   

If a primary failover happens during a movePrimary operation, we could fail to clear the database metadata on the newly elected primary node of the coordinator shard, leading to possible data loss.

As part of the movePrimary coordinator, database metadata on the primary node is explicitly cleared in the kCommit phase, while on secondary nodes the metadata is cleared indirectly when we exit the database recoverable critical section in the kExitCriticalSection phase.

If a stepdown happens between these two phases and a new primary node is elected on the coordinator shard, we could miss clearing the metadata on the new primary, because it was a secondary during kCommit and is the primary during kExitCriticalSection.

Consider the following scenario:

  • kCommit
    • N1 (primary) -> db metadata cleared
    • N2 (secondary) -> db metadata not cleared
  • kExitCriticalSection
    • N1 (secondary) -> db metadata cleared
    • N2 (primary) -> db metadata not cleared
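
The scenario above can be sketched as a toy model. This is only an illustration of the race as described in this ticket, not the actual coordinator code: the Node class, the function names, and the has_db_metadata flag are all hypothetical, and the model assumes that each phase clears metadata only on nodes holding the role named in the description.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    role: str                      # "primary" or "secondary"
    has_db_metadata: bool = True   # True until some phase clears it


def run_commit(nodes):
    """kCommit (modelled): only the current primary clears metadata explicitly."""
    for node in nodes:
        if node.role == "primary":
            node.has_db_metadata = False


def run_exit_critical_section(nodes):
    """kExitCriticalSection (modelled): only secondaries clear metadata, indirectly."""
    for node in nodes:
        if node.role == "secondary":
            node.has_db_metadata = False


def step_down(nodes, new_primary):
    """Elect new_primary; every other node becomes a secondary."""
    for node in nodes:
        node.role = "primary" if node.name == new_primary else "secondary"


def simulate(stepdown_between_phases):
    """Run both phases on a two-node shard, optionally failing over in between."""
    nodes = [Node("N1", "primary"), Node("N2", "secondary")]
    run_commit(nodes)
    if stepdown_between_phases:
        step_down(nodes, "N2")
    run_exit_critical_section(nodes)
    return {node.name: node.has_db_metadata for node in nodes}


print(simulate(False))  # {'N1': False, 'N2': False}
print(simulate(True))   # {'N1': False, 'N2': True}
```

Without a stepdown, each node is covered by exactly one phase. With a stepdown between the phases, N1 is cleared twice and N2 is never cleared, which matches the scenario listed above.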


 Comments   
Comment by Githook User [ 15/Jun/23 ]

Author:

{'name': 'Enrico', 'email': 'enrico.golfieri@mongodb.com', 'username': 'enricogolfieri'}

Message: SERVER-77748 movePrimary coordinator does not clear database metadata in case of stepdown

(cherry-picked from commit 9efe3be1ce0f0afe188034a109605d7dc4b69d78)
Branch: v7.0
https://github.com/mongodb/mongo/commit/00eb33992ce23f2cbf2f1c2323b19289e67df3d8

Comment by Githook User [ 12/Jun/23 ]

Author:

{'name': 'Enrico', 'email': 'enrico.golfieri@mongodb.com', 'username': 'enricogolfieri'}

Message: SERVER-77748 movePrimary coordinator does not clear database metadata in case of stepdown
Branch: master
https://github.com/mongodb/mongo/commit/9efe3be1ce0f0afe188034a109605d7dc4b69d78

Generated at Thu Feb 08 06:36:29 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.