  1. WiredTiger
  2. WT-11746

Investigate and tighten API requirements following a rollback error

    • Type: Improvement
    • Resolution: Unresolved
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: None
    • Labels: None
    • StorEng - Refinement Pipeline

      We recently saw a MongoDB bug in which a transaction was asked to roll back because of cache pressure. Ideally, the application should call rollback and clean up its resources. In this case, however, MongoDB possibly proceeded to perform more operations before calling commit. The commit call then failed because WT_TXN_ERROR had been set on the transaction when the initial rollback error was returned.
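
      The recommended error-handling path can be illustrated with a minimal sketch. It does not use the WiredTiger API: ToySession, RollbackError and run_with_retry are hypothetical stand-ins, and the simulated failure only imitates an operation returning WT_ROLLBACK. The point is that after a rollback error the caller does no further work in that transaction, rolls back, and retries.

```python
class RollbackError(Exception):
    """Hypothetical stand-in for an operation failing with WT_ROLLBACK."""


class ToySession:
    """Hypothetical in-memory session; not the WiredTiger API."""

    def __init__(self, failures):
        self.failures = failures  # inserts that fail before one succeeds
        self.pending = {}
        self.committed = {}
        self.log = []

    def begin_transaction(self):
        self.pending = {}
        self.log.append("begin")

    def insert(self, key, value):
        if self.failures > 0:
            self.failures -= 1
            raise RollbackError("simulated cache pressure")
        self.pending[key] = value

    def rollback_transaction(self):
        self.pending = {}
        self.log.append("rollback")

    def commit_transaction(self):
        self.committed.update(self.pending)
        self.log.append("commit")


def run_with_retry(session, work, max_attempts=5):
    """Run work(session) in a transaction, retrying after a rollback error."""
    for _ in range(max_attempts):
        session.begin_transaction()
        try:
            work(session)
        except RollbackError:
            # Recommended path: stop working, roll back, then retry.
            session.rollback_transaction()
            continue
        session.commit_transaction()
        return True
    return False
```

      The failure mode described above corresponds to catching the error, continuing to call work-like operations, and only then calling commit_transaction.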

      Ideally, our API entry points should have stronger safeguards that prevent the application from doing more work after receiving a rollback error. This ticket will tighten the API usage and keep the application from diverging from the recommended error-handling path. The following needs to be accounted for:

      • Create a reproducer
        Understand the attached BF ticket and create a reproducer in which an operation fails with WT_ROLLBACK (ideally because of an operation timeout due to cache pressure). The API_END exit point calls __wt_txn_err_set() to set WT_TXN_ERROR on the running transaction. The reproducer should then issue several more WiredTiger operations before calling commit. The commit will notice WT_TXN_ERROR and fail. Looking at the code, I do not see any place where we check WT_TXN_ERROR to prevent further operations.
      • Come up with API requirements that prevent most WiredTiger API calls once a rollback error has been received. Ensure that rollback_transaction is still allowed as expected.
      • Confirm with patch testing that MongoDB doesn't violate the tighter API requirements. Work with the server team to fix any violations before checking the code into WiredTiger.
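
      As a rough illustration of the items above, the following toy model sketches one possible shape for the tightened requirement. It is not WiredTiger code: GatedSession, TxnErrorState, fail_next and reproduce are hypothetical names, the txn_error flag only plays the role of WT_TXN_ERROR, and rejecting calls with a dedicated error is an assumption, not a decided design. Every entry point except rollback_transaction checks the flag first.

```python
class RollbackError(Exception):
    """Hypothetical stand-in for an operation failing with WT_ROLLBACK."""


class TxnErrorState(Exception):
    """Hypothetical error for calls rejected after the transaction has failed."""


class GatedSession:
    """Toy model of an entry-point check; not the WiredTiger API."""

    def __init__(self):
        self.in_txn = False
        self.txn_error = False  # plays the role of WT_TXN_ERROR
        self.fail_next = False  # test hook: force the next operation to fail
        self.pending = {}
        self.data = {}

    def _check(self):
        # Assumed gate: refuse work on a transaction that must be rolled back.
        if self.in_txn and self.txn_error:
            raise TxnErrorState("transaction must be rolled back")

    def begin_transaction(self):
        self.in_txn, self.txn_error, self.pending = True, False, {}

    def insert(self, key, value):
        self._check()
        if self.fail_next:
            self.fail_next = False
            self.txn_error = True  # models the flag being set on API exit
            raise RollbackError("simulated cache pressure")
        self.pending[key] = value

    def commit_transaction(self):
        self._check()
        self.data.update(self.pending)
        self.in_txn = False

    def rollback_transaction(self):
        # Deliberately not gated: rollback must always be allowed.
        self.pending = {}
        self.txn_error = False
        self.in_txn = False


def attempt(outcomes, name, call):
    """Record whether a call succeeded or which error it raised."""
    try:
        call()
        outcomes.append((name, "ok"))
    except (RollbackError, TxnErrorState) as exc:
        outcomes.append((name, type(exc).__name__))


def reproduce(session):
    """Fail one operation, keep issuing calls, then commit and roll back."""
    outcomes = []
    session.begin_transaction()
    session.fail_next = True
    attempt(outcomes, "insert a", lambda: session.insert("a", 1))
    attempt(outcomes, "insert b", lambda: session.insert("b", 2))
    attempt(outcomes, "commit", session.commit_transaction)
    attempt(outcomes, "rollback", session.rollback_transaction)
    return outcomes
```

      In this model the second insert is rejected immediately instead of the problem surfacing only at commit, which is the behavior change the patch testing would need to validate against MongoDB.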

            Assignee:
            backlog-server-storage-engines [DO NOT USE] Backlog - Storage Engines Team
            Reporter:
            sulabh.mahajan@mongodb.com Sulabh Mahajan