Core Server / SERVER-89769

Test is not waiting for journal flusher before checkpointing

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major - P3
    • None
    • Affects Version/s: None
    • Component/s: None
    • Labels:
      None
    • Catalog and Routing
    • ALL
    • Steps to reproduce:

      1. Apply the attached patch
      2. Run collection_catalog_two_phase_drops.js

    • CAR Team 2024-04-29, CAR Team 2024-05-13
    • 12

      collection_catalog_two_phase_drops.js verifies that the two phases of a collection drop execute as expected. It first checks that the drop of the table is deferred, then forces an oplog write and an fsync (which triggers a checkpoint), and finally checks that the idents were dropped.

      The second phase of a local collection drop is performed by the timestamp monitor, so a table is only dropped once the snapshot history window has passed. The latest stable timestamp is advanced by the journal flusher, which is an asynchronous thread. It is therefore possible for the checkpoint triggered by fsync to run before the journal flusher has advanced the stable timestamp past the oplog write. In that case the checkpoint fails to persist the timestamp of the oplog write, and the second phase of the drop never occurs.

      This is a test issue. In production, the checkpoint thread (which is paused by the test) will eventually observe the latest timestamp advanced by the journal flusher and successfully execute the second phase of the drop. We should ensure the oplog write was persisted before triggering a checkpoint.
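
      The race can be illustrated with a small toy model. This is only a sketch under simplifying assumptions, not server or test code: ToyStorage, its fields, and runTest are hypothetical names, the journal flusher is reduced to an explicit flushJournal() call, and the second phase is reduced to "reap every drop older than the checkpointed timestamp".

```javascript
// Toy model of the race described above. Not MongoDB code; every name here is hypothetical.
class ToyStorage {
    constructor() {
        this.lastWriteTs = 0;     // timestamp of the newest oplog write
        this.stableTs = 0;        // advanced only by the (modelled) journal flusher
        this.checkpointTs = 0;    // stable timestamp captured by the last checkpoint
        this.pendingDrops = [];   // drops waiting for their second phase
        this.droppedIdents = [];  // idents whose tables were actually removed
    }

    // Phase 1: record the drop at a timestamp, but defer removing the table.
    dropCollection(ident) {
        this.lastWriteTs += 1;
        this.pendingDrops.push({ident: ident, dropTs: this.lastWriteTs});
    }

    // An extra oplog write that moves time past the drop.
    oplogWrite() {
        this.lastWriteTs += 1;
        return this.lastWriteTs;
    }

    // Stands in for the asynchronous journal flusher catching up.
    flushJournal() {
        this.stableTs = this.lastWriteTs;
    }

    // Stands in for the fsync-triggered checkpoint: it can only persist the
    // stable timestamp that is visible at the moment it runs.
    checkpoint() {
        this.checkpointTs = this.stableTs;
        // Phase 2 (simplified): reap drops strictly older than the checkpoint.
        const remaining = [];
        for (const drop of this.pendingDrops) {
            if (drop.dropTs < this.checkpointTs) {
                this.droppedIdents.push(drop.ident);
            } else {
                remaining.push(drop);
            }
        }
        this.pendingDrops = remaining;
    }
}

// Runs the sequence the test performs, with or without waiting for the flusher.
function runTest(waitForJournalFlusher) {
    const storage = new ToyStorage();
    storage.dropCollection("collection-1");
    storage.oplogWrite();
    if (waitForJournalFlusher) {
        storage.flushJournal();
    }
    storage.checkpoint();
    return storage.droppedIdents;
}
```

      In this model, runTest(false) leaves the ident pending because the checkpoint sees a stale stable timestamp, while runTest(true) reaps it. How the real test should wait for the journal flusher is not specified here.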

            Assignee:
            josef.ahmad@mongodb.com Josef Ahmad
            Reporter:
            marcos.grillo@mongodb.com Marcos José Grillo Ramirez
            Votes:
            0
            Watchers:
            3

              Created:
              Updated: