- Type: Bug
- Resolution: Duplicate
- Priority: Major - P3
- Affects Version/s: 5.0.14, 6.0.4, 6.3.0-rc0, 6.2.0-rc6
- Component/s: Sharding
- Sharding EMEA
- ALL
- Sharding EMEA 2022-11-14, Sharding EMEA 2022-11-28, Sharding EMEA 2022-12-12, Sharding EMEA 2023-01-23
- 6
Unlikely scenario, but something we have to fix. The problematic interleaving is the following (S1:N0 denotes node N0 of shard S1):
S1:N0: starts a dropDatabase.
S1:N0: drops all sharded collections.
S1:N0: runs dropDatabase on all shards and clears the db metadata on all nodes.
S1:N0: steps down, but has already managed to send the command that removes the authoritative data to the CSRS.
S1:N1: steps up.
S1:N1: some operation recovers the metadata associated with the db being dropped.
CSRS: removes the db entry from config.databases.
S1:N1: resumes the execution of the dropDatabase.
S1:N1: no information associated with that dbName is present on the CSRS, so it jumps to the second phase of the coordinator, in which it sends a flushRoutingTable to all nodes but the primary.
S1:N1: completes the execution of the dropDatabase, but the primary node still believes it is the primary shard for that db name.
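The interleaving above can be sketched as a small runnable model. This is an illustrative simplification, not server code: the class and function names are hypothetical, and the coordinator is reduced to the one decision that matters here (if the CSRS has no entry for the db, the resumed coordinator skips straight to phase 2, which never touches the primary's own cached metadata).

```python
class CSRS:
    """Toy config server: just the config.databases entries."""
    def __init__(self):
        self.databases = {"testDB": {"primary": "S1"}}

    def remove_db_entry(self, db):
        self.databases.pop(db, None)


class ShardNode:
    """Toy shard node holding the db metadata it believes in."""
    def __init__(self, name):
        self.name = name
        self.cached_db_metadata = {}


def resume_drop_database(csrs, new_primary, db):
    """Hypothetical resumed coordinator on the newly stepped-up node."""
    if db not in csrs.databases:
        # No authoritative entry on the CSRS: jump to phase 2, i.e. a
        # flushRoutingTable to all nodes *but* the primary, so the new
        # primary's own cached metadata is never cleared.
        return
    # (Phase 1 would clear new_primary.cached_db_metadata here.)


csrs = CSRS()
n1 = ShardNode("S1:N1")

# S1:N0 already dropped the collections, dropped the db on all shards,
# and sent the removal command to the CSRS before stepping down.

# S1:N1 steps up; some operation recovers the db metadata.
n1.cached_db_metadata["testDB"] = {"primary": "S1"}

# The CSRS applies the removal S1:N0 had already sent.
csrs.remove_db_entry("testDB")

# S1:N1 resumes the dropDatabase coordinator.
resume_drop_database(csrs, n1, "testDB")

# Bug: the db is gone cluster-wide, but S1:N1 still caches it.
print("testDB" in n1.cached_db_metadata)  # → True (stale belief)
```

The sketch shows why a step-down between "send removal to CSRS" and "clear local metadata" leaves the new primary with a stale belief that it is the primary shard for the dropped db name.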
is related to:
- SERVER-73391 Use recoverable critical section for drop database (Closed)

split to:
- SERVER-73390 Mitigate database version regression bug on drop database (Closed)
- SERVER-73391 Use recoverable critical section for drop database (Closed)