  1. Core Server
  2. SERVER-55552

Unreplicated collection idents can get dropped before the drop in the durable catalog becomes both checkpointed and older than the oldest timestamp

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major - P3
    • Fix Version/s: 5.1.0-rc0
    • Affects Version/s: 5.0.0, 4.4.4, 4.2.13
    • Component/s: Storage
    • Labels:
      None
    • Fully Compatible
    • ALL
    • Execution Team 2021-04-19, Execution Team 2021-06-14, Execution Team 2021-08-23, Execution Team 2021-09-06
    • 22

      For two-phase drops, unreplicated collection drops get added to the ident reaper without a timestamp. This causes the ident reaper to perform the actual table drop the next time it runs.

      By contrast, the table for a replicated collection drop is not dropped until the drop timestamp is both checkpointed and older than the oldest timestamp, because earlier point-in-time reads may still be accessing the underlying table.
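
      The difference between the two cases can be sketched as a single eligibility check. This is a minimal illustration, not the server's actual code: `PendingDrop` and `is_reapable` are hypothetical names, timestamps are modelled as plain integers, and a missing timestamp is modelled as `None`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PendingDrop:
    """Hypothetical record of an ident handed to the reaper."""
    ident: str
    drop_ts: Optional[int]  # None models "added without a timestamp"


def is_reapable(drop: PendingDrop, checkpoint_ts: int, oldest_ts: int) -> bool:
    """Return True if the reaper would drop the table on its next run."""
    if drop.drop_ts is None:
        # Unreplicated drop: nothing to wait for, so it is reaped right away.
        return True
    # Replicated drop: must be checkpointed AND older than the oldest timestamp.
    return drop.drop_ts <= checkpoint_ts and drop.drop_ts < oldest_ts
```

      With a checkpoint at 10 and an oldest timestamp of 5, a drop recorded at 15 is held back while a drop with no timestamp is reaped on the next pass.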

      This issue affects restoring backed-up data files where the table is dropped but the catalog entry still exists in the _mdb_catalog.

      Below is the order of operations that can cause a fatal assertion when restoring:

      • Create two collections, say A.B and A.system.profile (the system.profile collection is unreplicated by design)
      • Perform a checkpoint at Timestamp(10)
      • dropDatabase(A)
        • First, we drop all replicated collections (A.B in this case) and wait for the drops to replicate
        • We defer the table drop to Timestamp(15)
        • Then we drop all unreplicated collections (A.system.profile in this case)
        • We defer the table drop to Timestamp(0), that is, with no drop timestamp
      • The TimestampMonitor now runs and starts reaping drop-pending idents whose drop timestamp is earlier than Timestamp(5)
        • Our A.system.profile table is now dropped via WT_SESSION::drop()
        • WiredTiger drops aren't transactional; the table is always removed immediately
      • Open a backup cursor
        • checkpointTimestamp is Timestamp(10)
        • The backup cursor does not report A.system.profile as part of the backup
        • The _mdb_catalog gets copied; as of Timestamp(10), the A.system.profile collection still has an entry in the catalog
      • Close the backup cursor
      • Start up a mongod on the copied files
        • While loading the catalog, it sees both A.B and A.system.profile in the _mdb_catalog
        • Fatal assertion: "Version: Unable to find metadata for table:A.system.profile"
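
      The sequence above can be replayed with a toy model. This is only a sketch under simplifying assumptions: `ToyStorage`, `startup`, and `reproduce` are hypothetical names, timestamps are plain integers, the backup is modelled as "catalog as of the last checkpoint plus whatever tables physically exist", and a Python exception stands in for the fatal assertion.

```python
class ToyStorage:
    """Toy stand-in for the storage engine plus the durable catalog."""

    def __init__(self):
        self.catalog = set()               # live catalog entries
        self.tables = set()                # tables that physically exist
        self.checkpointed_catalog = set()  # catalog as of the last checkpoint
        self.drop_pending = {}             # ident -> drop timestamp (0 = none)

    def create(self, ns):
        self.catalog.add(ns)
        self.tables.add(ns)

    def checkpoint(self):
        self.checkpointed_catalog = set(self.catalog)

    def drop_collection(self, ns, drop_ts):
        # Phase one: remove the catalog entry, defer the table drop.
        self.catalog.discard(ns)
        self.drop_pending[ns] = drop_ts

    def reap(self, before_ts):
        # Phase two: table drops are immediate and not transactional.
        for ns, drop_ts in list(self.drop_pending.items()):
            if drop_ts < before_ts:
                self.tables.discard(ns)
                del self.drop_pending[ns]

    def backup(self):
        # Catalog comes from the checkpoint; tables are whatever still exists.
        return set(self.checkpointed_catalog), set(self.tables)


def startup(catalog, tables):
    """Load the catalog; fail if an entry has no backing table."""
    for ns in sorted(catalog):
        if ns not in tables:
            raise RuntimeError(f"Unable to find metadata for table:{ns}")


def reproduce():
    s = ToyStorage()
    s.create("A.B")
    s.create("A.system.profile")
    s.checkpoint()                            # checkpoint at Timestamp(10)
    s.drop_collection("A.B", 15)              # replicated: deferred to 15
    s.drop_collection("A.system.profile", 0)  # unreplicated: no timestamp
    s.reap(before_ts=5)                       # only A.system.profile is reaped
    return s.backup()
```

      Calling `startup(*reproduce())` raises for A.system.profile, because the copied catalog still lists it while its table is already gone.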

      This issue is not limited to the system.profile collection; it affects any unreplicated collection. An approach to resolve this could involve startup recovery ignoring the assertion for unreplicated collections and removing the durable catalog entry.
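
      A rough sketch of that suggested approach follows. It is illustrative only and may not match the fix that was actually committed: `is_unreplicated` and `load_catalog` are hypothetical names, and the rule used to decide whether a namespace is unreplicated (anything in the `local` database, or a `system.profile` collection) is a simplifying assumption.

```python
def is_unreplicated(ns: str) -> bool:
    """Assumed rule: `local.*` and `<db>.system.profile` are unreplicated."""
    db, _, coll = ns.partition(".")
    return db == "local" or coll == "system.profile"


def load_catalog(catalog: set, tables: set) -> set:
    """Return the catalog entries kept after startup recovery."""
    kept = set()
    for ns in sorted(catalog):
        if ns in tables:
            kept.add(ns)
        elif is_unreplicated(ns):
            # Table already reaped: drop the orphaned catalog entry
            # instead of hitting the fatal assertion.
            continue
        else:
            # A replicated collection without a table is still fatal.
            raise RuntimeError(f"Unable to find metadata for table:{ns}")
    return kept
```

      Applied to the restore scenario, the catalog `{A.B, A.system.profile}` with only the A.B table present loads as `{A.B}`, while a missing table for a replicated collection still fails.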

            Assignee:
            gregory.wlodarek@mongodb.com Gregory Wlodarek
            Reporter:
            gregory.wlodarek@mongodb.com Gregory Wlodarek