WiredTiger / WT-8695

Remove file_close_sync config and disallow single file checkpoint

    • 8
    • Storage - Ra 2022-02-07, Storage - Ra 2022-02-21, Storage - Ra 2022-03-07

      Performing an operation that requires exclusive access to a handle (such as verify) closes existing references to that handle, which checkpoints the tree if it is dirty. But this checkpoints only that tree, not others that might need to be in sync with it (such as the history store, or trees that were updated as part of the same transaction). Consequently the on-disk state at that point is not consistent, which is a problem if the system crashes before another full checkpoint is taken.
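
      The failure mode can be illustrated with a small toy model. This is only a sketch and does not use the WiredTiger API: ToyStore, its methods, and the table names are hypothetical, and the model assumes there is no logging or recovery beyond the last checkpointed image of each tree.

```python
# Toy model only: ToyStore and its methods are hypothetical, not WiredTiger APIs.
import copy


class ToyStore:
    """In-memory trees plus an 'on-disk' image that survives a crash."""

    def __init__(self, tree_names):
        self.memory = {name: {} for name in tree_names}
        self.disk = {name: {} for name in tree_names}

    def commit(self, updates):
        """Apply one transaction in memory: {tree: {key: value}}."""
        for tree, kv in updates.items():
            self.memory[tree].update(kv)

    def checkpoint_tree(self, tree):
        """Single-tree checkpoint, as when closing a dirty handle."""
        self.disk[tree] = copy.deepcopy(self.memory[tree])

    def checkpoint_all(self):
        """Full checkpoint: every tree is written at the same point."""
        self.disk = copy.deepcopy(self.memory)

    def crash_and_recover(self):
        """Lose the in-memory state and restart from the on-disk image."""
        self.memory = copy.deepcopy(self.disk)


def run(single_tree_checkpoint):
    store = ToyStore(["table_a", "table_b"])
    store.commit({"table_a": {"alice": 100}, "table_b": {"bob": 0}})
    store.checkpoint_all()

    # One transaction that touches both tables.
    store.commit({"table_a": {"alice": 60}, "table_b": {"bob": 40}})

    if single_tree_checkpoint:
        # Stand-in for an exclusive operation closing table_a's handle.
        store.checkpoint_tree("table_a")

    store.crash_and_recover()
    return store.memory["table_a"]["alice"], store.memory["table_b"]["bob"]


print(run(True))   # (60, 0): half of the transaction survived the crash
print(run(False))  # (100, 0): the whole transaction was lost, which is consistent
```

      In the first run the single-tree checkpoint persists table_a's half of the transaction while table_b stays at the previous full checkpoint, so the recovered state matches no point in the transaction history.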

      There's code that prohibits exclusive operations on dirty trees that have a stable timestamp (see WT-7750), but the problem also exists with non-timestamped tables. I'll attach a Python test that splits a transaction across two tables and then observes half of it after crashing. It's also possible to create a comparable situation where values needed by rollback-to-stable (RTS) are not in the history store because the history store wasn't checkpointed at the same time, but doing so is somewhat more involved.

      This issue is similar to WT-4070 (which appears to be the same problem for timestamped tables) and it's clearly related to WT-4414 as well, except that WT-4414 was apparently fixed.

            chenhao.qu@mongodb.com Chenhao Qu
            dholland+wt@sauclovia.org David Holland