  1. Core Server
  2. SERVER-81493

Handle StorageUnavailableException when resetting WiredTiger cursors

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major - P3
    • 7.2.0-rc0
    • None
    • None
    • None
    • Storage Execution NAMER
    • Fully Compatible
    • ALL
    • v7.0
    • Execution NAMR Team 2023-10-16
    • 120

    Description

      A call to WT_CURSOR::reset can roll back due to cache pressure. There's a long-standing assumption (dating back to at least 3.6) that ignoring these rollbacks is safe because the transaction is getting killed anyway. This assumption is incorrect: query plans reset the cursor before performing the write (e.g. in the update stage). When the exception is swallowed, the write proceeds and eventually fails to commit because the transaction requires rollback.

      I've linked a build failure in which a replica set reconfig raced with a test designed to generate a transaction too large to fit in cache. WiredTiger rolled back the oldest transaction to ease the cache pressure, and that transaction happened to belong to the reconfig thread persisting the new configuration. Because the exception was not handled, an invariant eventually failed when the thread tried to commit the transaction.

      We should stop swallowing StorageUnavailableExceptions in WiredTigerRecordStoreCursorBase::save() and WiredTigerRecordStore::RandomCursor::save(), and instead handle the exception appropriately further up the call chain.
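
      The difference between swallowing and propagating the exception can be illustrated with a toy model. This is only a sketch under stated assumptions: Txn, Cursor, StorageUnavailable, saveSwallowing, savePropagating, and updateStage are hypothetical stand-ins invented for illustration, not the server's actual classes or the WiredTiger API.

```cpp
#include <stdexcept>

// Hypothetical stand-in for a storage-engine rollback error.
struct StorageUnavailable : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Hypothetical transaction state: once set, the transaction can no longer commit.
struct Txn {
    bool mustRollBack = false;
};

// Toy cursor whose reset() fails under simulated cache pressure.
struct Cursor {
    Txn& txn;
    bool underCachePressure;

    void reset() {
        if (underCachePressure) {
            txn.mustRollBack = true;
            throw StorageUnavailable("reset rolled back due to cache pressure");
        }
    }
};

// Behaviour described in the ticket: the exception is swallowed in save().
void saveSwallowing(Cursor& c) {
    try {
        c.reset();
    } catch (const StorageUnavailable&) {
        // Ignored on the assumption that the transaction is being killed anyway.
    }
}

// Proposed behaviour: let the exception reach the caller.
void savePropagating(Cursor& c) {
    c.reset();
}

enum class Outcome { Committed, FailedAtCommit, AbortedBeforeWrite };

// Toy "update stage": save the cursor, perform the write, then commit.
template <typename SaveFn>
Outcome updateStage(Cursor& c, SaveFn save) {
    try {
        save(c);
    } catch (const StorageUnavailable&) {
        // The caller sees the failure before writing and can abort or retry.
        return Outcome::AbortedBeforeWrite;
    }
    // ... the write would happen here ...
    if (c.txn.mustRollBack) {
        // The write went ahead on a doomed transaction; the commit cannot succeed.
        return Outcome::FailedAtCommit;
    }
    return Outcome::Committed;
}
```

      In this model, calling updateStage with saveSwallowing under cache pressure reaches the commit step with a transaction that must roll back, while savePropagating surfaces the failure before the write is attempted.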

      We should also investigate whether callers of WiredTigerIndexCursorGeneric::resetCursor() and PlanYieldPolicy::yieldOrInterrupt() are similarly affected.

          People

            louis.williams@mongodb.com Louis Williams
            josef.ahmad@mongodb.com Josef Ahmad