Validate in repair mode has invalid uses of writeConflictRetry()

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major - P3
    • 9.0.0-rc0
    • Affects Version/s: None
    • Component/s: None
    • None
    • Storage Execution
    • Fully Compatible
    • ALL
    • Storage Execution 2026-03-16, Storage Execution 2026-03-30

      The index consistency repair code in index_consistency.cpp wraps its writes in writeConflictRetry(). This is invalid because the higher-level validation code that calls into it (ValidateAdaptor::traverseRecordStore) holds a cursor that expects a consistent snapshot. If a write actually produces a write conflict, writeConflictRetry() abandons the snapshot before retrying, so the cursor may skip records or hit an invariant failure when it detects that it has gone backwards.
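      The hazard can be illustrated with a small toy model. This is only a sketch under stated assumptions: ToyRecoveryUnit, toyWriteConflictRetry, toyTraverse and scanWithConflicts are hypothetical stand-ins invented for illustration, not the real server types, and the snapshot is reduced to an integer id that changes whenever it is abandoned.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for a write conflict exception.
struct ToyWriteConflict : std::runtime_error {
    ToyWriteConflict() : std::runtime_error("write conflict") {}
};

// Hypothetical stand-in for a recovery unit: the snapshot is just an id.
struct ToyRecoveryUnit {
    int snapshotId = 0;
    void abandonSnapshot() { ++snapshotId; }
};

// Toy retry helper: on a conflict, drop the snapshot and run f again.
template <typename F>
void toyWriteConflictRetry(ToyRecoveryUnit& ru, F&& f) {
    for (;;) {
        try {
            f();
            return;
        } catch (const ToyWriteConflict&) {
            ru.abandonSnapshot();
        }
    }
}

// Toy traversal: scans records under one snapshot and calls repair for each.
// Returns false if the snapshot changed while the scan was still in progress.
bool toyTraverse(ToyRecoveryUnit& ru,
                 const std::vector<int>& records,
                 const std::function<void(int)>& repair) {
    const int openedAt = ru.snapshotId;
    for (int record : records) {
        repair(record);
        if (ru.snapshotId != openedAt) {
            return false;  // the scan's snapshot was pulled out from under it
        }
    }
    return true;
}

// Runs a scan whose repair writes hit the given number of injected conflicts.
bool scanWithConflicts(int conflictsToInject) {
    ToyRecoveryUnit ru;
    int remaining = conflictsToInject;
    return toyTraverse(ru, {1, 2, 3}, [&](int) {
        toyWriteConflictRetry(ru, [&] {
            if (remaining > 0) {
                --remaining;
                throw ToyWriteConflict();
            }
        });
    });
}
```

      In this model scanWithConflicts(0) reports a consistent scan, while a single injected conflict is enough for the retry helper to invalidate the snapshot that the outer traversal depends on.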

      In practice this code should never actually produce a write conflict, because repair requires standalone mode and validate will be the only thing accessing the database. Would it be correct to simply remove all instances of writeConflictRetry() from the validation repair logic?

      To reproduce the problem, add _recoveryUnit().abandonSnapshot(); to WriteConflictRetryAlgorithm::operator() immediately before the call to f(), then run the validate dbtests. This should hit an out-of-order read invariant failure. Every caller of writeConflictRetry() has to tolerate the snapshot being abandoned, so for correct callers this change should have no functional effect (it only makes performance worse).
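      The effect of that instrumentation can be sketched in a toy model. Everything here is hypothetical and simplified: ToyRetryAlgorithm only mimics the shape of the retry functor (conflict handling is omitted), and ToyCursor assumes, purely for illustration, that a cursor restarts from the first record once its snapshot is gone. The real storage engine behaviour may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical stand-in for a recovery unit: the snapshot is just an id.
struct ToyRecoveryUnit {
    int snapshotId = 0;
    void abandonSnapshot() { ++snapshotId; }
};

// Toy analogue of the retry functor. The flag switches on the extra
// abandonSnapshot() call described in the reproduction steps.
struct ToyRetryAlgorithm {
    ToyRecoveryUnit& ru;
    bool forceAbandon;

    template <typename F>
    void operator()(F&& f) {
        if (forceAbandon) {
            ru.abandonSnapshot();  // instrumentation added before f()
        }
        f();  // write conflict handling omitted in this sketch
    }
};

// Toy cursor over sorted record ids. Assumption for illustration only:
// once its snapshot is abandoned, the cursor loses its position and
// starts again from the first record.
class ToyCursor {
public:
    ToyCursor(ToyRecoveryUnit& ru, std::vector<int> ids)
        : _ru(ru), _ids(std::move(ids)), _openedAt(ru.snapshotId) {}

    bool next(int& out) {
        if (_ru.snapshotId != _openedAt) {
            _openedAt = _ru.snapshotId;
            _pos = 0;
        }
        if (_pos >= _ids.size()) {
            return false;
        }
        out = _ids[_pos++];
        return true;
    }

private:
    ToyRecoveryUnit& _ru;
    std::vector<int> _ids;
    int _openedAt;
    std::size_t _pos = 0;
};

// Scans all records, issuing one (empty) repair write per record through the
// retry functor. Throws std::logic_error if a record id fails to increase.
int toyValidate(bool forceAbandon) {
    ToyRecoveryUnit ru;
    ToyCursor cursor(ru, {10, 20, 30});
    ToyRetryAlgorithm retry{ru, forceAbandon};
    int last = -1;
    int visited = 0;
    int id = 0;
    while (cursor.next(id)) {
        if (id <= last) {
            throw std::logic_error("out-of-order read");
        }
        last = id;
        ++visited;
        retry([] { /* repair write would go here */ });
    }
    return visited;
}
```

      With the flag off, toyValidate visits all three records. With the flag on, the ordering check throws on the second read, which is the toy counterpart of the out-of-order read invariant failure.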

            Assignee:
            Thomas Goyne
            Reporter:
            Thomas Goyne