[SERVER-60730] shardsvrDropDatabase should always join existing coordinator Created: 15/Oct/21  Updated: 29/Oct/23  Resolved: 16/Nov/21

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: 5.0.3, 5.1.0-rc0
Fix Version/s: 5.2.0, 5.0.5, 5.1.1

Type: Bug Priority: Major - P3
Reporter: Tommaso Tocci Assignee: Allison Easton
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Depends
Duplicate
is duplicated by SERVER-60732 Test create collection after drop dat... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested: v5.1, v5.0
Sprint: Sharding EMEA 2021-11-01, Sharding EMEA 2021-11-15, Sharding EMEA 2021-11-29
Participants:
Linked BF Score: 132

 Description   

In the following scenario, taken from the description of the linked BF, a drop database command can report that the operation completed successfully even though the database was not actually dropped. The drop database command should check the version of the existing coordinator to ensure that it is not joining a coordinator for a different database (here, an earlier incarnation of "test" that has since been recreated).

  1. The test creates the unsharded collection "test.coll".
  2. mongos sends a _shardsvrDropDatabase command to the primary shard s0:n0.
  3. s0:n0 starts a drop database coordinator, dropDBcor_1.
  4. s0:n0 drops "test.coll".
  5. s0:n0 manages to drop the database "test" (by removing the db entry in config.databases), but the coordinator (dropDBcor_1) is still running.
  6. s0:n0 steps down and s0:n1 becomes the new primary of shard 0.
  7. mongos receives an InterruptedDueToReplStateChange error on the original _shardsvrDropDatabase command.
  8. mongos retries the dropDatabase command but finds that the database has already been dropped, so it simply returns OK to the client.
  9. mongos creates a new database and collection "test.coll" while the old coordinator (dropDBcor_1) is still running.
  10. mongos attempts to drop the database "test" a second time by sending a _shardsvrDropDatabase command to s0:n1; since dropDBcor_1 is still running, the command joins it. The problem is that `dropDBcor_1` is already at a late stage of execution (running the cleanup phase), so it will not drop the database again.
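
The race above can be replayed with a small toy model. This is only a sketch under stated assumptions, not the server code and not the committed fix: the `Shard`, `DropCoordinator`, and `Phase` names, the integer database version, and the `check_version` flag are all hypothetical. It shows one possible reading of the proposed version check: if the running coordinator was started for a different incarnation of the database, wait for it to finish and then start a fresh drop.

```python
from dataclasses import dataclass
from enum import Enum, auto
from itertools import count
from typing import Dict


class Phase(Enum):
    DROP = auto()      # removes the database entry
    CLEANUP = auto()   # late stage: no longer touches the database entry
    DONE = auto()


@dataclass
class DropCoordinator:
    db_name: str
    db_version: int    # incarnation of the database this coordinator was started for
    phase: Phase = Phase.DROP


class Shard:
    """Toy stand-in for a primary shard (hypothetical, not the real data model)."""

    def __init__(self) -> None:
        self.databases: Dict[str, int] = {}             # name -> version
        self.coordinators: Dict[str, DropCoordinator] = {}
        self._versions = count(1)

    def create_database(self, name: str) -> int:
        self.databases[name] = next(self._versions)
        return self.databases[name]

    def start_coordinator(self, name: str) -> DropCoordinator:
        coord = DropCoordinator(name, self.databases[name])
        self.coordinators[name] = coord
        return coord

    def step(self, coord: DropCoordinator) -> None:
        """Advance a coordinator by exactly one phase."""
        if coord.phase is Phase.DROP:
            self.databases.pop(coord.db_name, None)
            coord.phase = Phase.CLEANUP
        elif coord.phase is Phase.CLEANUP:
            coord.phase = Phase.DONE
            self.coordinators.pop(coord.db_name, None)

    def run_to_completion(self, coord: DropCoordinator) -> None:
        while coord.phase is not Phase.DONE:
            self.step(coord)

    def drop_database(self, name: str, check_version: bool) -> None:
        existing = self.coordinators.get(name)
        if existing is not None:
            # Was the running coordinator created for another incarnation?
            stale = existing.db_version != self.databases.get(name)
            self.run_to_completion(existing)    # join the running coordinator
            if not (check_version and stale):
                return                          # naive behaviour: report success
        if name in self.databases:
            self.run_to_completion(self.start_coordinator(name))


def replay(check_version: bool) -> bool:
    """Replay the scenario; return True if "test" still exists at the end."""
    shard = Shard()
    shard.create_database("test")                # step 1
    coord = shard.start_coordinator("test")      # steps 2-3
    shard.step(coord)                            # steps 4-5: entry removed, cleanup pending
    # Steps 6-8 (stepdown and retry) are not modelled; the coordinator keeps running.
    shard.create_database("test")                # step 9: database recreated
    shard.drop_database("test", check_version)   # step 10
    return "test" in shard.databases


print(replay(check_version=False))  # True: drop reported success, database still exists
print(replay(check_version=True))   # False: stale coordinator detected, database dropped
```

In this model the first call reproduces the bug (the second drop joins the old coordinator in its cleanup phase and returns without dropping anything), while the second call drops the recreated database because the version mismatch is detected.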


 Comments   
Comment by Githook User [ 28/Oct/21 ]

Author:

{'name': 'Allison Easton', 'email': 'allison.easton@mongodb.com', 'username': 'allisoneaston'}

Message: SERVER-60730 shardsvrDropDatabase should always join existing coordinator

(cherry picked from commit 0bd0ddfb1d6875c3ce4390d30e0566b107256f29)
Branch: v5.0
https://github.com/mongodb/mongo/commit/19a776401ec32e4c1d796defaa6c2cd75b43ab2e

Comment by Githook User [ 28/Oct/21 ]

Author:

{'name': 'Allison Easton', 'email': 'allison.easton@mongodb.com', 'username': 'allisoneaston'}

Message: SERVER-60730 shardsvrDropDatabase should always join existing coordinator

(cherry picked from commit 0bd0ddfb1d6875c3ce4390d30e0566b107256f29)
Branch: v5.1
https://github.com/mongodb/mongo/commit/8def1746e85ab88314e4b2af3c2850ce580db1a7

Comment by Githook User [ 26/Oct/21 ]

Author:

{'name': 'Allison Easton', 'email': 'allison.easton@mongodb.com', 'username': 'allisoneaston'}

Message: SERVER-60730 shardsvrDropDatabase should always join existing coordinator
Branch: master
https://github.com/mongodb/mongo/commit/0bd0ddfb1d6875c3ce4390d30e0566b107256f29
