[SERVER-81493] Handle StorageUnavailableException when resetting WiredTiger cursors Created: 27/Sep/23  Updated: 29/Oct/23  Resolved: 12/Oct/23

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 7.2.0-rc0

Type: Bug Priority: Major - P3
Reporter: Josef Ahmad Assignee: Louis Williams
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Depends
is depended on by SERVER-78298 Add Support for RHEL 7.9 x86 (and rem... Closed
Related
is related to WT-11871 Define whether or not cursor reset ca... Open
Assigned Teams:
Storage Execution NAMER
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v7.0
Sprint: Execution NAMR Team 2023-10-16
Participants:
Linked BF Score: 120

 Description   

A call to WT_CURSOR::reset can roll back due to cache pressure. There's a long-standing assumption (since at least 3.6) that ignoring these rollbacks is safe because the transaction is getting killed anyway. This assumption is incorrect: query plans reset the cursor before performing the write (e.g. in the update stage). When the exception is swallowed, the write proceeds and only fails later, at commit time, because the transaction requires rollback.
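
The failure mode described above can be illustrated with a minimal, self-contained sketch. Nothing here is MongoDB or WiredTiger code: ToyTransaction, StorageUnavailable, saveSwallowing and runUpdate are hypothetical stand-ins, and the toy commit() simply returns false where the real server would fail an invariant.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-in for a storage-layer "transaction rolled back" error.
struct StorageUnavailable : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Toy transaction: once marked for rollback, it can never commit.
struct ToyTransaction {
    bool mustRollBack = false;
    bool wroteSomething = false;

    // Simulates a cursor reset that gets rolled back under cache pressure.
    void resetCursor(bool cachePressure) {
        if (cachePressure) {
            mustRollBack = true;
            throw StorageUnavailable("reset rolled back due to cache pressure");
        }
    }

    void write() { wroteSomething = true; }

    // Returns false where the real server would fail an invariant.
    bool commit() const { return !mustRollBack; }
};

// The problematic pattern: the reset error is caught and ignored.
void saveSwallowing(ToyTransaction& txn, bool cachePressure) {
    try {
        txn.resetCursor(cachePressure);
    } catch (const StorageUnavailable&) {
        // Ignored on the assumption that the transaction is being killed anyway.
    }
}

struct Outcome {
    bool wrote;
    bool committed;
};

// A plan-like sequence: save (reset) the cursor, then write, then commit.
Outcome runUpdate(bool cachePressure) {
    ToyTransaction txn;
    saveSwallowing(txn, cachePressure);
    txn.write();  // Still runs: nothing signalled the rollback to this caller.
    assert(txn.wroteSomething);
    return {txn.wroteSomething, txn.commit()};
}
```

In this sketch, runUpdate(true) performs the write and then fails to commit, while runUpdate(false) writes and commits normally.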

I've linked a build failure where a replica set reconfig raced with a test designed to generate a transaction too large to fit in cache. WiredTiger rolled back the oldest transaction to ease the cache pressure, and the oldest transaction happened to belong to the reconfig thread persisting the new configuration. Because that exception was not handled, the thread later failed an invariant when trying to commit the transaction.

We should stop swallowing StorageUnavailableExceptions in WiredTigerRecordStoreCursorBase::save() and WiredTigerRecordStore::RandomCursor::save(), and instead handle the exception appropriately further up the call chain.
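
As a rough sketch of what propagating the error could look like, the toy code below lets the reset failure escape save() and handles it in the caller. The ticket does not say how callers should handle the exception; retrying the whole unit of work is only one possibility, assumed here for illustration. ToyCursor, savePropagating, runWithRetry and updateWithRetry are hypothetical names, not server APIs.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>

// Hypothetical stand-in for a storage-layer "transaction rolled back" error.
struct StorageUnavailable : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Toy cursor whose reset fails for the first `failuresLeft` calls.
struct ToyCursor {
    int failuresLeft;

    void reset() {
        if (failuresLeft > 0) {
            --failuresLeft;
            throw StorageUnavailable("reset rolled back due to cache pressure");
        }
    }
};

// save() no longer catches anything: a failed reset reaches the caller.
void savePropagating(ToyCursor& cursor) {
    cursor.reset();
}

// Assumed caller-side handling: abandon the attempt and retry the whole
// unit of work. Returns the number of attempts used, or rethrows the last
// error once maxAttempts is exhausted.
int runWithRetry(const std::function<void()>& unitOfWork, int maxAttempts) {
    assert(maxAttempts > 0);
    for (int attempt = 1; attempt <= maxAttempts; ++attempt) {
        try {
            unitOfWork();
            return attempt;
        } catch (const StorageUnavailable&) {
            if (attempt == maxAttempts) {
                throw;
            }
            // A real caller would abort the storage transaction before retrying.
        }
    }
    throw std::logic_error("unreachable when maxAttempts > 0");
}

// Save the cursor, then write. The write is reached only if the reset succeeded.
int updateWithRetry(int resetFailures, int maxAttempts, int& writes) {
    ToyCursor cursor{resetFailures};
    return runWithRetry(
        [&] {
            savePropagating(cursor);
            ++writes;
        },
        maxAttempts);
}
```

With two simulated reset failures and a budget of five attempts, the write runs exactly once, on the third attempt. If the budget runs out first, the error surfaces to the caller and no write is performed.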

We should also investigate whether callers of WiredTigerIndexCursorGeneric::resetCursor() and PlanYieldPolicy::yieldOrInterrupt() are similarly affected.



 Comments   
Comment by Githook User [ 11/Oct/23 ]

Author: Louis Williams (louiswilliams) <louis.williams@mongodb.com>

Message: SERVER-81493 Don't swallow StorageUnavailable errors in cursor reset
Branch: master
https://github.com/mongodb/mongo/commit/9cf58f13a9d4bc1f9e525a31cefd1805e88c8a58

Generated at Thu Feb 08 06:46:39 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.