  1. Core Server
  2. SERVER-39957

Two phase drop by rename should delay the second phase until drop optime is both checkpointed and majority committed

Details

    • Type: Bug
    • Resolution: Won't Fix
    • Priority: Major - P3
    • None
    • None
    • Storage
    • None
    • Storage Execution
    • ALL
    • Storage NYC 2019-03-11, Storage NYC 2019-03-25
    • 33
    • 3

    Description

      Currently, the old two-phase drop (by rename) executes its second phase (dropping the WT table file) as soon as the majority commit point moves past the drop optime. If the majority commit point is ahead of the last checkpoint and a crash happens after the second phase, then on restart the server still finds the collection's metadata in the catalog (because it loads the last checkpoint), but the actual WT file has already been dropped.

      On restart, the server can detect an unclean shutdown by examining the mongod.lock file. It can then safely remove the metadata of any collections that no longer have WT table files.

      However, opening a backup cursor after the second phase, rather than crashing, causes a similar issue that is harder to solve: the backup also contains an inconsistency between the WT table files and the catalog. Because mongod.lock is not copied during backup, a server started from the backup does not trigger the code that reconciles the catalog. It then tries to open a WT file that does not exist and hits this fassert.
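
      The two startup paths described above can be sketched as follows. This is only an illustration, not the server's code: `load_catalog`, `MissingTableError`, and the dictionary/set representation of the catalog and table files are hypothetical names chosen for this example.

```python
class MissingTableError(RuntimeError):
    """Hypothetical stand-in for the fatal assertion hit at startup."""


def load_catalog(catalog, table_files, unclean_shutdown):
    """Return the catalog entries that can be opened safely.

    catalog: dict of collection name -> table file name, as of the last checkpoint
    table_files: set of table file names actually present on disk
    unclean_shutdown: True when the lock file indicates a crash
    """
    usable = {}
    for name, table in catalog.items():
        if table in table_files:
            usable[name] = table
        elif unclean_shutdown:
            # Crash case: the file was dropped after the checkpoint,
            # so the stale metadata is discarded.
            continue
        else:
            # Backup case: no lock file was copied, so no reconciliation
            # runs and opening the missing table fails.
            raise MissingTableError(f"table file missing for {name}: {table}")
    return usable
```

      With an unclean shutdown the stale entry is dropped silently; without that signal (as in a restored backup) the same on-disk state is fatal.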

      To fix this problem, we should delay the second phase until the drop optime is both checkpointed and majority committed.
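
      A minimal sketch of the proposed gating condition, assuming optimes can be compared as plain integers and that an inclusive comparison is acceptable. The function names and the `pending` mapping are hypothetical and do not correspond to server APIs.

```python
def can_run_second_phase(drop_optime, majority_commit_point, checkpoint_timestamp):
    """True only when the drop is both majority committed and checkpointed."""
    return drop_optime <= min(majority_commit_point, checkpoint_timestamp)


def drops_ready(pending, majority_commit_point, checkpoint_timestamp):
    """Return the names of pending drops whose table files may be removed.

    pending: dict of drop-pending collection name -> drop optime
    """
    return sorted(
        name
        for name, optime in pending.items()
        if can_run_second_phase(optime, majority_commit_point, checkpoint_timestamp)
    )
```

      For example, with a majority commit point of 20 and a checkpoint at 12, a drop at optime 15 is majority committed but must still wait, which is exactly the window in which the current behavior removes the file too early.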

          People

            backlog-server-execution Backlog - Storage Execution Team
            xiangyu.yao@mongodb.com Xiangyu Yao (Inactive)
            Votes: 0
            Watchers: 8