[SERVER-55552] Unreplicated collection idents can get dropped before the drop in the durable catalog becomes both checkpointed and older than the oldest timestamp
Created: 26/Mar/21  Updated: 29/Oct/23  Resolved: 23/Aug/21

Status: Closed
Project: Core Server
Component/s: Storage
Affects Version/s: 5.0.0, 4.4.4, 4.2.13
Fix Version/s: 5.1.0-rc0

Type: Bug
Priority: Major - P3
Reporter: Gregory Wlodarek
Assignee: Gregory Wlodarek
Resolution: Fixed
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Related
is related to SERVER-55397 Index build restart ident drops are n... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Sprint: Execution Team 2021-04-19, Execution Team 2021-06-14, Execution Team 2021-08-23, Execution Team 2021-09-06
Linked BF Score: 22

 Description   

For two-phase drops, unreplicated collection drops get added to the ident reaper without a timestamp. This causes the ident reaper to perform the actual table drop the next time it runs.

By contrast, a replicated collection's table is not dropped until the drop timestamp is both checkpointed and older than the oldest timestamp, because earlier point-in-time reads may still be accessing the underlying table.
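
The difference between the two drop paths can be sketched as a small eligibility check. This is an illustrative model only, not the server's reaper code: the name `can_reap` and its parameters are hypothetical, timestamps are plain integers, `None` stands in for a drop registered without a timestamp, and the exact inclusivity of the comparisons is an assumption.

```python
from typing import Optional


def can_reap(drop_ts: Optional[int], checkpoint_ts: int, oldest_ts: int) -> bool:
    """Return True if a drop-pending ident may have its table dropped.

    Hypothetical model: drop_ts=None represents a drop that was registered
    without a timestamp (the unreplicated case).
    """
    if drop_ts is None:
        # Nothing to wait for, so the table is dropped on the next reaper pass.
        return True
    # Replicated case: the drop must be both checkpointed and older than the
    # oldest timestamp, so no point-in-time read can still need the table.
    return drop_ts <= checkpoint_ts and drop_ts < oldest_ts


print(can_reap(15, checkpoint_ts=10, oldest_ts=5))    # False: replicated drop must wait
print(can_reap(None, checkpoint_ts=10, oldest_ts=5))  # True: unreplicated drop is immediate
```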

This issue affects restoring backed-up data files where the table is dropped but the catalog entry still exists in the _mdb_catalog.

Below is the order of operations that can cause a fatal assertion when restoring:

  • Create two collections, say A.B and A.system.profile (the system.profile collection is unreplicated by design)
  • Perform a checkpoint at Timestamp(10)
  • dropDatabase(A)
    • First, we drop all replicated collections (A.B in this case) and wait for the drops to replicate
    • We defer the table drop to Timestamp(15)
    • Then we drop all unreplicated collections (A.system.profile in this case)
    • We defer the table drop to Timestamp(0)
  • The TimestampMonitor now runs and starts reaping drop-pending idents whose drop timestamp is earlier than Timestamp(5)
    • Our A.system.profile table is now dropped via WT_SESSION::drop()
    • WiredTiger drops aren't transactional; the table is always removed immediately
  • Open a backup cursor
    • checkpointTimestamp is Timestamp(10)
    • The backup cursor does not report A.system.profile as part of the backup
    • The _mdb_catalog gets copied; as of Timestamp(10), the A.system.profile collection still has an entry in the catalog
  • Close the backup cursor
  • Start up a mongod on the copied files
    • While loading the catalog, it sees both A.B and A.system.profile in the _mdb_catalog
    • Fatal assertion: "Version: Unable to find metadata for table:A.system.profile"
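
The sequence above can be replayed with a toy model to show where the backup becomes inconsistent. This is a sketch under simplifying assumptions, not server code: the catalog and the set of tables are plain Python sets, timestamps are integers, and every name (`reap`, `drop_pending`, `backup_catalog`, and so on) is hypothetical.

```python
# Catalog contents as of the checkpoint at Timestamp(10): both collections
# still have entries, because dropDatabase ran after the checkpoint.
catalog_at_checkpoint = {"A.B", "A.system.profile"}

# Tables that physically exist in the storage engine.
tables = {"A.B", "A.system.profile"}

# Drop-pending idents and the timestamp their table drop is deferred to.
drop_pending = {"A.B": 15, "A.system.profile": 0}


def reap(pending, existing_tables, reap_before_ts):
    """Drop every table whose deferred drop timestamp is before reap_before_ts."""
    for ident, ts in list(pending.items()):
        if ts < reap_before_ts:
            existing_tables.discard(ident)  # immediate, not transactional
            del pending[ident]


# The monitor reaps idents with a drop timestamp earlier than Timestamp(5).
reap(drop_pending, tables, reap_before_ts=5)

# The backup copies the catalog as of the checkpoint, plus whichever tables
# still exist at the moment the backup cursor is opened.
backup_catalog = set(catalog_at_checkpoint)
backup_tables = set(tables)

# Catalog entries in the backup that have no table behind them.
missing = sorted(backup_catalog - backup_tables)
print(missing)  # ['A.system.profile']
```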

This issue is not limited to the system.profile collection; it affects any unreplicated collection. One approach to resolving it could be for startup recovery to ignore the assertion for unreplicated collections and remove the durable catalog entry.
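
A minimal sketch of that suggested approach follows. It only illustrates the idea as described in this ticket and should not be read as the change that was committed. The helpers `is_replicated` and `load_catalog` are hypothetical, and the replication rule is a deliberate simplification (it treats only the `local` database and `system.profile` collections as unreplicated).

```python
def is_replicated(namespace: str) -> bool:
    """Simplified stand-in for the server's rule, which covers more cases."""
    db, _, coll = namespace.partition(".")
    return db != "local" and coll != "system.profile"


def load_catalog(catalog_entries: set, tables: set) -> set:
    """Return the catalog entries that survive startup recovery."""
    loaded = set()
    for ns in sorted(catalog_entries):
        if ns in tables:
            loaded.add(ns)
        elif is_replicated(ns):
            # A replicated collection without a table is still fatal.
            raise RuntimeError(f"Unable to find metadata for table: {ns}")
        # Otherwise the collection is unreplicated and its table is gone:
        # skip it, which models removing the stale catalog entry.
    return loaded


print(load_catalog({"A.B", "A.system.profile"}, {"A.B"}))  # {'A.B'}
```

Restricting the leniency to unreplicated collections keeps the assertion meaningful for replicated data, where a missing table would point to a different problem.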



 Comments   
Comment by Vivian Ge (Inactive) [ 06/Oct/21 ]

Updating the fixVersion since branching activities occurred yesterday. This ticket will be included in rc0 once that release candidate has been triggered. For more active release information, please keep an eye on #server-release. Thank you!

Comment by Githook User [ 23/Aug/21 ]

Author: Gregory Wlodarek <gregory.wlodarek@mongodb.com> (username: GWlodarek)

Message: SERVER-55552 Backup can copy the state where the collection ident is dropped but not the catalog entry for un-replicated collections
Branch: master
https://github.com/mongodb/mongo/commit/95668c6ce7c0718bd2a2e44e0bbc557606d5172f

Comment by Githook User [ 23/Aug/21 ]

Author: Gregory Wlodarek <gregory.wlodarek@mongodb.com> (username: GWlodarek)

Message: SERVER-55552 Backup can copy the state where the collection ident is dropped but not the catalog entry for un-replicated collections
Branch: master
https://github.com/10gen/mongo-enterprise-modules/commit/6b5b0664d9154f8179c51c9579779718254f9269

Comment by Githook User [ 20/Aug/21 ]

Author: Gregory Wlodarek <gregory.wlodarek@mongodb.com> (username: GWlodarek)

Message: SERVER-55552 Backup can copy the state where the collection ident is dropped but not the catalog entry for un-replicated collections
Branch: SERVER-55552
https://github.com/10gen/mongo-enterprise-modules/commit/36b4133feabda83abc4a446a6cc533ab1c9a4240
