[SERVER-47087] Stepping down the primary when running 'dropDatabase' does not always reset the dropPending flag when the user operation is killed Created: 24/Mar/20  Updated: 29/Oct/23  Resolved: 10/Apr/20

Status: Closed
Project: Core Server
Component/s: Storage
Affects Version/s: None
Fix Version/s: 4.7.0

Type: Bug Priority: Major - P3
Reporter: Gregory Wlodarek Assignee: Gregory Wlodarek
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Related
related to SERVER-46647 Stepping down the primary when runnin... Closed
related to SERVER-46123 Make the dropDatabase command abort i... Closed
is related to SERVER-46560 Make Abort index build logic determin... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Sprint: Execution Team 2020-04-20
Participants:
Linked BF Score: 26

 Description   

I think this problem originated from the work to allow dropDatabase to abort in-progress index builds before dropping the database itself.

Here is the sequence of events leading up to a fatal assertion in a build failure:

  1. The primary node runs dropDatabase on test.
  2. dropDatabase sets the dropPending flag to true.
  3. dropDatabase needs to abort any in-progress index builds on the collections in the test database before dropping the database itself.
  4. dropDatabase waits until the index builds are aborted.
  5. The node steps down from primary to secondary.
  6. The node's oplog application tries to create a collection on the test database but gets the DatabaseDropPending error, which is fatal during oplog application.
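
The sequence above can be sketched as a toy model. This is not the server's code: the class and function names (Database, drop_database, run_scenario, Interrupted) and the reset_on_interrupt switch are hypothetical and exist only for illustration. It assumes the stepdown kills the user operation while it waits for the index builds to abort, and shows how leaving dropPending set makes the later collection creation fail, while resetting it before exiting does not.

```python
class DatabaseDropPending(Exception):
    """Raised when a collection is created on a database that is being dropped."""


class Interrupted(Exception):
    """Stands in for the user operation being killed by the stepdown."""


class Database:
    def __init__(self, name):
        self.name = name
        self.drop_pending = False
        self.collections = set()

    def create_collection(self, coll):
        if self.drop_pending:
            raise DatabaseDropPending(self.name)
        self.collections.add(coll)


def drop_database(db, wait_for_index_build_aborts, reset_on_interrupt):
    db.drop_pending = True                  # step 2
    try:
        wait_for_index_build_aborts()       # steps 3-4
    except Interrupted:
        # Hypothetical switch: True models resetting the flag before exiting.
        if reset_on_interrupt:
            db.drop_pending = False
        raise
    db.collections.clear()                  # the drop itself
    db.drop_pending = False


def stepdown_during_wait():
    # Step 5: the stepdown interrupts the wait.
    raise Interrupted("stepdown killed the user operation")


def run_scenario(reset_on_interrupt):
    db = Database("test")
    try:
        drop_database(db, stepdown_during_wait, reset_on_interrupt)
    except Interrupted:
        pass
    # Step 6: the node, now a secondary, applies a create-collection entry.
    try:
        db.create_collection("coll")
        return "applied"
    except DatabaseDropPending:
        return "fatal: DatabaseDropPending"
```

With reset_on_interrupt=False the model reproduces the failure in step 6; with True the collection creation goes through.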


 Comments   
Comment by Githook User [ 10/Apr/20 ]

Author: Gregory Wlodarek <gregory.wlodarek@mongodb.com> (GWlodarek)

Message: SERVER-47087 dropDatabase interrupted due to a replication state change must reset the dropPending flag before exiting
Branch: master
https://github.com/mongodb/mongo/commit/5abfdd53a7508d17ce6ca0ca0447d3e13ddc746d

Comment by Gregory Wlodarek [ 10/Apr/20 ]

I've verified that this issue has gone away with SERVER-46560. I'll only be pushing a test as part of the work here to prevent future regressions.

Comment by Gregory Wlodarek [ 07/Apr/20 ]

Going to wait till SERVER-46560 is finished before proceeding with this as SERVER-46560 may resolve the BF.
