[SERVER-57476] Operation may block on prepare conflict while holding oplog slot, stalling replication indefinitely Created: 05/Jun/21  Updated: 29/Oct/23  Resolved: 10/Jun/21

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: 4.2.0, 4.4.0, 5.0.0-rc0
Fix Version/s: 4.2.15, 4.4.7, 5.0.0-rc2, 5.1.0-rc0

Type: Bug Priority: Major - P3
Reporter: Max Hirschhorn Assignee: Daniel Gottlieb (Inactive)
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Related
related to SERVER-60682 TransactionCoordinator may block acqu... Closed
related to SERVER-65821 Deadlock during setFCV when there are... Closed
related to SERVER-57506 Eagerly fail when a timestamped trans... Closed
is related to SERVER-54284 ExceptionFor<ErrorCodes::WriteConflic... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v5.0, v4.4, v4.2
Sprint: Repl 2021-06-14
Participants:
Case:

 Description   

wiredTigerPrepareConflictRetry() doesn't release any storage resources when WiredTiger returns WT_PREPARE_CONFLICT. This is because prepare conflicts are expected to resolve quickly and the storage transaction may already have a snapshot compatible with the commitTimestamp of the prepared transaction. However, committing or aborting a prepared transaction depends on the ability of the replica set to replicate and confirm majority-committed writes. If the operation blocked in wiredTigerPrepareConflictRetry() has reserved an oplog slot, then oplog readers will block until the oplog hole is resolved. But oplog readers being blocked prevents new incoming writes from becoming majority-committed, which can prevent a prepared transaction from exiting the kPrepared state.
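For illustration, a minimal sketch of the shape of this retry loop (not the actual server code; the SessionCache stand-in, the constant value, and the function names are assumptions for a self-contained example). The key point is that the wait happens without unwinding the caller's storage transaction or releasing its reserved oplog slot:

{code:cpp}
#include <condition_variable>
#include <cstdint>
#include <functional>
#include <mutex>

// Stand-in for WT_PREPARE_CONFLICT from wiredtiger.h.
constexpr int kPrepareConflict = -31808;

// Hypothetical stand-in for WiredTigerSessionCache: counts how many prepared
// transactions have committed or aborted and lets waiters block on that count.
struct SessionCache {
    std::mutex mtx;
    std::condition_variable cv;
    std::uint64_t prepareCommitOrAbortCount = 0;

    void waitUntilPreparedUnitOfWorkCommitsOrAborts(std::uint64_t lastCount) {
        std::unique_lock<std::mutex> lk(mtx);
        // Blocks until some prepared transaction resolves. Nothing here
        // releases the oplog slot (or any other storage resource) the caller
        // may already hold.
        cv.wait(lk, [&] { return prepareCommitOrAbortCount > lastCount; });
    }
};

// Simplified shape of wiredTigerPrepareConflictRetry(): re-run the cursor
// operation, pausing on a prepare conflict until a prepared transaction
// resolves. If the caller has reserved an oplog slot, that slot stays
// reserved for the entire wait; this is the oplog hole readers stall on.
int prepareConflictRetry(SessionCache& cache, const std::function<int()>& op) {
    while (true) {
        std::uint64_t lastCount;
        {
            std::lock_guard<std::mutex> lk(cache.mtx);
            lastCount = cache.prepareCommitOrAbortCount;
        }
        int ret = op();
        if (ret != kPrepareConflict)
            return ret;
        cache.waitUntilPreparedUnitOfWorkCommitsOrAborts(lastCount);
    }
}
{code}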


Thread 56: "conn2" (Thread 0x7f4ee3799700 (LWP 15315))
...
#11 mongo::Interruptible::waitForConditionOrInterrupt<std::unique_lock<mongo::latch_detail::Latch>, mongo::WiredTigerSessionCache::waitUntilPreparedUnitOfWorkCommitsOrAborts(mongo::OperationContext*, uint64_t)::<lambda()> > (waitTimer=0x0, pred=..., m=..., cv=..., this=0x55802d3d26c0) at src/mongo/util/interruptible.h:466
#12 mongo::WiredTigerSessionCache::waitUntilPreparedUnitOfWorkCommitsOrAborts (this=<optimized out>, opCtx=opCtx@entry=0x55802d3d26c0, lastCount=<optimized out>, lastCount@entry=0) at src/mongo/db/storage/wiredtiger/wiredtiger_session_cache.cpp:369
#13 0x00007f4f0c17c5e1 in mongo::wiredTigerPrepareConflictRetry<mongo::WiredTigerIndexUnique::_keyExists(mongo::OperationContext*, WT_CURSOR*, char const*, size_t)::<lambda()> >(mongo::OperationContext *, mongo::WiredTigerIndexUnique::<lambda()> &&) (opCtx=0x55802d3d26c0, f=...) at src/mongo/db/storage/wiredtiger/wiredtiger_recovery_unit.h:192
#14 0x00007f4f0c17ce24 in mongo::WiredTigerIndexUnique::_keyExists (this=this@entry=0x55802cba1380, opCtx=opCtx@entry=0x55802d3d26c0, c=<optimized out>, c@entry=0x55802d8d1400, buffer=0x55802d43bc38 "+\002\004", size=size@entry=3) at src/mongo/db/storage/wiredtiger/wiredtiger_index.cpp:1400
#15 0x00007f4f0c17d0da in mongo::WiredTigerIndexUnique::_insertTimestampSafe (this=0x55802cba1380, opCtx=0x55802d3d26c0, c=0x55802d8d1400, keyString=..., dupsAllowed=<optimized out>) at src/mongo/util/shared_buffer.h:204
#16 0x00007f4f0c17e2dc in mongo::WiredTigerIndexUnique::_insert (this=0x55802cba1380, opCtx=0x55802d3d26c0, c=0x55802d8d1400, keyString=..., dupsAllowed=<optimized out>) at src/mongo/db/storage/wiredtiger/wiredtiger_index.cpp:1474
#17 0x00007f4f0c178b50 in mongo::WiredTigerIndex::insert (this=0x55802cba1380, opCtx=0x55802d3d26c0, keyString=..., dupsAllowed=<optimized out>) at src/mongo/db/storage/wiredtiger/wiredtiger_index.cpp:267
#18 0x00007f4f096657d0 in mongo::AbstractIndexAccessMethod::insertKeys(mongo::OperationContext*, std::set<mongo::KeyString::Value, std::less<mongo::KeyString::Value>, std::allocator<mongo::KeyString::Value> > const&, mongo::RecordId const&, mongo::InsertDeleteOptions const&, std::function<mongo::Status (mongo::KeyString::Value const&)>&&, long*) (this=0x55802cc25440, opCtx=0x55802d3d26c0, keys=std::set with 1 element = {...}, loc=..., options=..., onDuplicateKey=..., numInserted=0x7f4ee3795510) at /opt/mongodbtoolchain/revisions/e5348beb43e147b74a40f4ca5fb05a330ea646cf/stow/gcc-v3.U0D/include/c++/8.2.0/bits/unique_ptr.h:342
#19 0x00007f4f0966596e in mongo::AbstractIndexAccessMethod::insertKeysAndUpdateMultikeyPaths(mongo::OperationContext*, std::set<mongo::KeyString::Value, std::less<mongo::KeyString::Value>, std::allocator<mongo::KeyString::Value> > const&, std::set<mongo::KeyString::Value, std::less<mongo::KeyString::Value>, std::allocator<mongo::KeyString::Value> > const&, std::vector<std::set<unsigned long, std::less<unsigned long>, std::allocator<unsigned long> >, std::allocator<std::set<unsigned long, std::less<unsigned long>, std::allocator<unsigned long> > > > const&, mongo::RecordId const&, mongo::InsertDeleteOptions const&, std::function<mongo::Status (mongo::KeyString::Value const&)>&&, long*) (this=0x55802cc25440, opCtx=0x55802d3d26c0, keys=std::set with 1 element = {...}, multikeyMetadataKeys=std::set with 0 elements, multikeyPaths=std::vector of length 1, capacity 1 = {...}, loc=..., options=..., onDuplicateKey=..., numInserted=0x7f4ee3795510) at src/mongo/db/index/index_access_method.cpp:164
#20 0x00007f4f0a2bae14 in mongo::IndexCatalogImpl::_indexKeys (this=<optimized out>, opCtx=0x55802d3d26c0, index=<optimized out>, keys=std::set with 1 element = {...}, multikeyMetadataKeys=..., multikeyPaths=std::vector of length 1, capacity 1 = {...}, obj=owned BSONObj 33 bytes @ 0x55802d35bf2b = {...}, loc=..., options=..., keysInsertedOut=0x7f4ee37958e8) at /opt/mongodbtoolchain/revisions/e5348beb43e147b74a40f4ca5fb05a330ea646cf/stow/gcc-v3.U0D/include/c++/8.2.0/bits/std_function.h:402
#21 0x00007f4f0a2c2bce in mongo::IndexCatalogImpl::_indexFilteredRecords (this=0x55802cc26340, opCtx=0x55802d3d26c0, index=0x55802d35b810, bsonRecords=..., keysInsertedOut=0x7f4ee37958e8) at src/mongo/db/catalog/index_catalog_impl.cpp:1408
#22 0x00007f4f0a2c2f34 in mongo::IndexCatalogImpl::_indexRecords (this=0x55802cc26340, opCtx=0x55802d3d26c0, index=0x55802d35b810, bsonRecords=std::vector of length 3, capacity 3 = {...}, keysInsertedOut=0x7f4ee37958e8) at src/mongo/db/catalog/index_catalog_impl.cpp:1435
#23 0x00007f4f0a2c3032 in mongo::IndexCatalogImpl::indexRecords (this=0x55802cc26340, opCtx=0x55802d3d26c0, bsonRecords=std::vector of length 3, capacity 3 = {...}, keysInsertedOut=0x7f4ee37958e8) at src/mongo/db/catalog/index_catalog_impl.cpp:1591
#24 0x00007f4f0a29baab in mongo::CollectionImpl::_insertDocuments (this=0x55802d35b100, opCtx=0x55802d3d26c0, begin={stmtId = 0, oplogSlot = {_timestamp = {i = 1, secs = 536870912}, _term = 1}, doc = <error reading variable: Cannot access memory at address 0x4000000000000001>{_objdata = 0x4000000000000001 <error: Cannot access memory at address 0x4000000000000001>, _ownedBuffer = {_buffer = {_holder = {px = 0x2}}}}}, end={stmtId = 0, oplogSlot = {_timestamp = {i = 1, secs = 536870912}, _term = 1}, doc = <error reading variable: Cannot access memory at address 0x4000000000000001>{_objdata = 0x4000000000000001 <error: Cannot access memory at address 0x4000000000000001>, _ownedBuffer = {_buffer = {_holder = {px = 0x2}}}}}, opDebug=0x55802d359450) at /opt/mongodbtoolchain/revisions/e5348beb43e147b74a40f4ca5fb05a330ea646cf/stow/gcc-v3.U0D/include/c++/8.2.0/bits/unique_ptr.h:342
#25 0x00007f4f0a2a034e in mongo::CollectionImpl::insertDocuments (this=this@entry=0x55802d35b100, opCtx=opCtx@entry=0x55802d3d26c0, begin=begin@entry={stmtId = -1, oplogSlot = {_timestamp = {i = 5, secs = 1622990608}, _term = 1}, doc = owned BSONObj 33 bytes @ 0x55802d35bf2b = {[_id] = {"$oid": "60bcdf109bb2288d9de42896"}, [a] = 1.0}}, end=end@entry={stmtId = 0, oplogSlot = {_timestamp = {i = 1, secs = 536870912}, _term = 1}, doc = <error reading variable: Cannot access memory at address 0x4000000000000001>{_objdata = 0x4000000000000001 <error: Cannot access memory at address 0x4000000000000001>, _ownedBuffer = {_buffer = {_holder = {px = 0x2}}}}}, opDebug=0x55802d359450, fromMigrate=false) at src/mongo/db/catalog/collection_impl.cpp:505
#26 0x00007f4f0aca3963 in mongo::(anonymous namespace)::insertDocuments (opCtx=0x55802d3d26c0, collection=<optimized out>, begin=..., end={stmtId = 0, oplogSlot = {_timestamp = {i = 1, secs = 536870912}, _term = 1}, doc = <error reading variable: Cannot access memory at address 0x4000000000000001>{_objdata = 0x4000000000000001 <error: Cannot access memory at address 0x4000000000000001>, _ownedBuffer = {_buffer = {_holder = {px = 0x2}}}}}, fromMigrate=<optimized out>) at /opt/mongodbtoolchain/revisions/e5348beb43e147b74a40f4ca5fb05a330ea646cf/stow/gcc-v3.U0D/include/c++/8.2.0/bits/stl_iterator.h:845
#27 0x00007f4f0aca45a1 in mongo::(anonymous namespace)::insertBatchAndHandleErrors (fromMigrate=<optimized out>, out=<optimized out>, lastOpFixer=<optimized out>, batch=std::vector of length 3, capacity 3 = {...}, wholeOp=..., opCtx=<optimized out>) at src/mongo/db/catalog_raii.h:131
#28 mongo::performInserts (opCtx=<optimized out>, opCtx@entry=0x55802d3d26c0, wholeOp=..., fromMigrate=fromMigrate@entry=false) at src/mongo/db/ops/write_ops_exec.cpp:621
#29 0x00007f4f04442cef in mongo::(anonymous namespace)::CmdInsert::Invocation::runImpl (this=0x55802d4090c0, opCtx=0x55802d3d26c0, result=...) at src/mongo/db/commands/write_commands/write_commands.cpp:299
#30 0x00007f4f044403dd in mongo::(anonymous namespace)::WriteCommand::InvocationBase::run (this=0x55802d4090c0, opCtx=0x55802d3d26c0, result=0x55802cf10cc0) at src/mongo/db/commands/write_commands/write_commands.cpp:241
#31 0x00007f4f08db14ff in mongo::CommandHelpers::runCommandInvocation (opCtx=0x55802d3d26c0, request=..., invocation=0x55802d4090c0, response=0x55802cf10cc0) at src/mongo/db/commands.cpp:184
#32 0x00007f4f0469364f in mongo::(anonymous namespace)::runCommandImpl (opCtx=<optimized out>, invocation=<optimized out>, request=..., replyBuilder=0x55802cf10cc0, startOperationTime=..., behaviors=..., extraFieldsBuilder=0x7f4ee37968e0, sessionOptions=...) at src/mongo/db/service_entry_point_common.cpp:828
#33 0x00007f4f04695cf9 in mongo::(anonymous namespace)::execCommandDatabase (opCtx=<optimized out>, command=<optimized out>, request=..., replyBuilder=<optimized out>, behaviors=...) at src/mongo/db/service_entry_point_common.cpp:1226
#34 0x00007f4f04697070 in mongo::(anonymous namespace)::<lambda()>::operator() (__closure=<optimized out>) at /opt/mongodbtoolchain/revisions/e5348beb43e147b74a40f4ca5fb05a330ea646cf/stow/gcc-v3.U0D/include/c++/8.2.0/bits/unique_ptr.h:342
#35 mongo::(anonymous namespace)::receivedCommands (opCtx=0x55802d3d26c0, message=..., behaviors=...) at src/mongo/db/service_entry_point_common.cpp:1440
#36 0x00007f4f04697cfd in mongo::ServiceEntryPointCommon::handleRequest (opCtx=0x55802d3d26c0, m=..., behaviors=...) at src/mongo/db/service_entry_point_common.cpp:1745

Thread 43: "conn1" (Thread 0x7f4ee9cbf700 (LWP 15300))
...
#11 mongo::Interruptible::waitForConditionOrInterrupt<std::unique_lock<mongo::latch_detail::Latch>, mongo::WiredTigerOplogManager::waitForAllEarlierOplogWritesToBeVisible(const mongo::WiredTigerRecordStore*, mongo::OperationContext*)::<lambda()> > (waitTimer=0x0, pred=..., m=..., cv=..., this=0x55802d3d18c0) at src/mongo/util/interruptible.h:466
#12 mongo::WiredTigerOplogManager::waitForAllEarlierOplogWritesToBeVisible (this=<optimized out>, oplogRecordStore=<optimized out>, opCtx=0x55802d3d18c0) at src/mongo/db/storage/wiredtiger/wiredtiger_oplog_manager.cpp:147
#13 0x00007f4f09ea2244 in mongo::CollectionScan::doWork (this=0x55802cb75c00, out=0x7f4ee9cbb8c8) at src/mongo/db/exec/plan_stage.h:383
#14 0x00007f4f09ec3316 in mongo::PlanStage::work (this=0x55802cb75c00, out=out@entry=0x7f4ee9cbb8c8) at src/mongo/db/exec/plan_stage.cpp:47
#15 0x00007f4f09f20f7a in mongo::PlanExecutorImpl::_getNextImpl (this=0x55802d379960, objOut=0x7f4ee9cbba30, dlOut=0x0) at /opt/mongodbtoolchain/revisions/e5348beb43e147b74a40f4ca5fb05a330ea646cf/stow/gcc-v3.U0D/include/c++/8.2.0/bits/unique_ptr.h:342
#16 0x00007f4f09f219bb in mongo::PlanExecutorImpl::getNext (this=<optimized out>, objOut=0x7f4ee9cbbb28, dlOut=<optimized out>) at src/mongo/db/query/plan_executor_impl.cpp:413
#17 0x00007f4f0440f753 in mongo::(anonymous namespace)::FindCmd::Invocation::run (this=0x55802cf10680, opCtx=0x55802d3d18c0, result=<optimized out>) at src/mongo/db/commands/find_cmd.cpp:516
#18 0x00007f4f08db14ff in mongo::CommandHelpers::runCommandInvocation (opCtx=0x55802d3d18c0, request=..., invocation=0x55802cf10680, response=0x55802cf104a0) at src/mongo/db/commands.cpp:184
#19 0x00007f4f04692d72 in mongo::(anonymous namespace)::runCommandImpl (opCtx=<optimized out>, invocation=<optimized out>, request=..., replyBuilder=0x55802cf104a0, startOperationTime=..., behaviors=..., extraFieldsBuilder=0x7f4ee9cbc8e0, sessionOptions=...) at src/mongo/db/service_entry_point_common.cpp:859
#20 0x00007f4f04695cf9 in mongo::(anonymous namespace)::execCommandDatabase (opCtx=<optimized out>, command=<optimized out>, request=..., replyBuilder=<optimized out>, behaviors=...) at src/mongo/db/service_entry_point_common.cpp:1226
#21 0x00007f4f04697070 in mongo::(anonymous namespace)::<lambda()>::operator() (__closure=<optimized out>) at /opt/mongodbtoolchain/revisions/e5348beb43e147b74a40f4ca5fb05a330ea646cf/stow/gcc-v3.U0D/include/c++/8.2.0/bits/unique_ptr.h:342
#22 mongo::(anonymous namespace)::receivedCommands (opCtx=0x55802d3d18c0, message=..., behaviors=...) at src/mongo/db/service_entry_point_common.cpp:1440
#23 0x00007f4f04697cfd in mongo::ServiceEntryPointCommon::handleRequest (opCtx=0x55802d3d18c0, m=..., behaviors=...) at src/mongo/db/service_entry_point_common.cpp:1745



 Comments   
Comment by Githook User [ 11/Jun/21 ]

Author:

{'name': 'Daniel Gottlieb', 'email': 'daniel.gottlieb@mongodb.com', 'username': 'dgottlieb'}

Message: SERVER-57476: Return a WriteConflict when a timestamped transaction hits a prepare conflict.

(cherry picked from commit 1e7e343fb6c90fbf0c62deabf61630353e2e5e29)
Branch: v4.2
https://github.com/mongodb/mongo/commit/379db12e4d046c82f60732bcf11c98019f04bbf5

Comment by Githook User [ 10/Jun/21 ]

Author:

{'name': 'Daniel Gottlieb', 'email': 'daniel.gottlieb@mongodb.com', 'username': 'dgottlieb'}

Message: SERVER-57476: Return a WriteConflict when a timestamped transaction hits a prepare conflict.

(cherry picked from commit 1e7e343fb6c90fbf0c62deabf61630353e2e5e29)
Branch: v4.4
https://github.com/mongodb/mongo/commit/cfffcff7de7417c29a1d95d7a856c7874b9eebd1

Comment by Githook User [ 10/Jun/21 ]

Author:

{'name': 'Daniel Gottlieb', 'email': 'daniel.gottlieb@mongodb.com', 'username': 'dgottlieb'}

Message: SERVER-57476: Return a WriteConflict when a timestamped transaction hits a prepare conflict.

(cherry picked from commit 1e7e343fb6c90fbf0c62deabf61630353e2e5e29)
Branch: v5.0
https://github.com/mongodb/mongo/commit/2b62a316d722b406fbbc48bac492427844f244a3

Comment by Githook User [ 10/Jun/21 ]

Author:

{'name': 'Daniel Gottlieb', 'email': 'daniel.gottlieb@mongodb.com', 'username': 'dgottlieb'}

Message: SERVER-57476: Return a WriteConflict when a timestamped transaction hits a prepare conflict.
Branch: master
https://github.com/mongodb/mongo/commit/1e7e343fb6c90fbf0c62deabf61630353e2e5e29

Comment by Daniel Gottlieb (Inactive) [ 09/Jun/21 ]

Daniel Gottlieb, does this have the possibility of user operations that didn't previously encounter write conflict errors, or were unlikely to, to now possibly see a write conflict?

Changing my answer: no. This should only create {{WriteConflictException}}s that are retried internally and never returned to a user. This is based on the assumption that (MDB) transactions don't generate their oplog timestamps until the very end.
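For context, a rough sketch of the internal retry behavior referenced here (the names below are hypothetical stand-ins, not the server's actual helpers): a non-transactional write that hits a write conflict is retried in place and the error never reaches the client, whereas a write inside an MDB transaction surfaces the error and relies on the application to retry the transaction.

{code:cpp}
#include <exception>

// Hypothetical stand-in for the server's WriteConflictException.
struct WriteConflictError : std::exception {};

// Rough shape of the server-internal retry: a non-transactional write that
// hits a write conflict is rolled back and re-attempted in place, so the
// error never reaches the client. A write inside a multi-document (MDB)
// transaction instead aborts the transaction and relies on the
// application/driver to retry the whole transaction.
template <typename F>
auto retryOnWriteConflict(F&& attempt) {
    while (true) {
        try {
            return attempt();
        } catch (const WriteConflictError&) {
            // In the real server this would abandon the snapshot and back off
            // before retrying.
        }
    }
}
{code}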

Comment by Githook User [ 09/Jun/21 ]

Author:

{'name': 'Daniel Gottlieb', 'email': 'daniel.gottlieb@mongodb.com', 'username': 'dgottlieb'}

Message: Revert "SERVER-57476: Return a WriteConflict when a timestamped transaction hits a prepare conflict."

This reverts commit 44419183dfabb246c0a112f6060a372e90ee0d44.
Branch: master
https://github.com/mongodb/mongo/commit/990f8ede6f89e44315688d4f679c28ff53b672d1

Comment by Githook User [ 09/Jun/21 ]

Author:

{'name': 'Daniel Gottlieb', 'email': 'daniel.gottlieb@mongodb.com', 'username': 'dgottlieb'}

Message: SERVER-57476: Return a WriteConflict when a timestamped transaction hits a prepare conflict.
Branch: master
https://github.com/mongodb/mongo/commit/44419183dfabb246c0a112f6060a372e90ee0d44

Comment by Daniel Gottlieb (Inactive) [ 08/Jun/21 ]

Thanks for the thoughtful questions, bruce.lucas. Given the interest in this scenario, I appreciate the timely feedback that helps us quickly gain confidence that the described change is correct and complete and minimizes the impact on users.

Daniel Gottlieb, does this have the possibility of user operations that didn't previously encounter write conflict errors, or were unlikely to, to now possibly see a write conflict?

Operations that were unlikely to encounter a write conflict are now more likely to observe one[1]. But I don't believe we return a write conflict for anything that couldn't do so previously.

I've been using this chart to describe the interleaving that leads to a stall:

| Preparer                              | VectoredInsert            | OplogVisibility Ts |
|---------------------------------------+---------------------------+--------------------|
| BeginTxn                              |                           |                    |
| Write A                               |                           |                    |
|                                       | BeginTxn                  |                    |
|                                       | Preallocates TS(10, 11)   |                  9 |
| (side txn commits prepare oplog @ 12) |                           |                    |
| Prepare 12                            |                           |                    |
|                                       | Write A (PrepareConflict) |                    |

But if we reorder who wins the race to allocate a timestamp, we'd now create an "unnecessary" write conflict (the prepare conflict would resolve, allowing the "vectored insert" to make progress without retrying the whole operation):

| Preparer                              | VectoredInsert                             | OplogVisibility Ts |
|---------------------------------------+--------------------------------------------+--------------------|
| BeginTxn                              |                                            |                    |
| Write A                               |                                            |                    |
| (side txn commits prepare oplog @ 10) |                                            |                 10 |
| Prepare 10                            |                                            |                    |
|                                       | BeginTxn                                   |                    |
|                                       | Preallocates TS(11, 12)                    |                 10 |
|                                       | Write A (PrepareConflict -> WriteConflict) |                    |
| TS(10) becomes majority committed     |                                            |                    |

That's not necessarily a problem (unless it makes operations likely to see write conflicts at a high rate), but it may be something we need to be aware of in case customers have questions.

I don't think we'll see operations write conflict at a high rate, but it's hard to say for certain. The above diagram demonstrates how the benign case can result in a write conflict while we wait for a majority write. The window can be large (network timescale versus CPU timescale for healthy systems). But to be clear, I claim:
1) For an application retrying quickly, the overall latency would be similar between the two behaviors. Obviously retrying the write conflict wastes more CPU and/or network.
2) Any newly returned write conflict corresponds to a case that, with unlucky timestamp allocation, could have resulted in a stall. We won't introduce any new write conflicts on a code path that's inherently safe.

That said, we aren't fully aware of all the cases where we have this race. Of the stalls we've investigated, some don't seem to match this specific scenario of vectored inserts against a collection with a non-_id unique index (but we were unable to rule it out entirely – we didn't have visibility into unsharded collections and their index specifications).

Also, I wonder if there is any mechanism in the error returned to indicate to a user why they are seeing a write conflict?

I'm not familiar with any mechanism. I know we already struggle to return an appropriate error message to users when a unique constraint on index keys is violated in the presence of a non-default collation. This feels like a similar problem.

[1] All of these prepare conflicts being transformed into a write conflict could already be a write conflict today if, for example, the vectored insert tries to write to the document before the "prepared operation" goes into a prepared state. If the vectored insert is not part of an MDB transaction, the write conflict is retried internally. If it is part of an MDB transaction, the transaction is aborted and the application has to retry.

Applications that are already resilient to write conflicts in their MDB transactions today should continue to be so after this patch. As noted above, this patch widens the window in which applications can see write conflicts from a CPU timescale to a network timescale. Thus it's plausible this could expose retry bugs in applications that were (un?)lucky enough to not have run into them yet.
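For reference, the generic shape of the application-side retry loop being assumed here (hypothetical names; not tied to any particular driver API):

{code:cpp}
#include <functional>
#include <stdexcept>

// Hypothetical stand-in for a driver error labeled "TransientTransactionError"
// (write conflicts inside a transaction are reported this way).
struct TransientTransactionError : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Generic shape of the application-side retry loop recommended for
// multi-document transactions: on a transient error, restart the whole
// transaction from the beginning. Applications already written this way keep
// working after this patch; the loop may simply iterate a bit more often.
void runTransactionWithRetry(const std::function<void()>& txnBody) {
    while (true) {
        try {
            txnBody();  // start the transaction, perform writes, commit
            return;
        } catch (const TransientTransactionError&) {
            // Fall through and retry the whole transaction.
        }
    }
}
{code}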

Comment by Bruce Lucas (Inactive) [ 08/Jun/21 ]

daniel.gottlieb, does this have the possibility of user operations that didn't previously encounter write conflict errors, or were unlikely to, to now possibly see a write conflict? That's not necessarily a problem (unless it makes operations likely to see write conflicts at a high rate), but it may be something we need to be aware of in case customers have questions.

Also, I wonder if there is any mechanism in the error returned to indicate to a user why they are seeing a write conflict? We have seen cases where write conflicts can be generated that catch users by surprise (even though we say they should always be prepared to handle write conflicts).

Comment by Daniel Gottlieb (Inactive) [ 07/Jun/21 ]

For this ticket I plan on failing operations with a WriteConflictException when a timestamped transaction (i.e., one that has already reserved an oplog slot) hits a prepare conflict.

This resolution should eliminate the possibility that any deployment in the wild gets stuck in the way this ticket describes. However, this change won't discover existing potential problems that are not already being exercised. SERVER-57506 covers a change that can identify problematic transactions without requiring that they hit a prepare conflict.
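A minimal sketch of the kind of guard described by the commit message ("Return a WriteConflict when a timestamped transaction hits a prepare conflict"); the types and function names below are illustrative stand-ins, not the actual patch:

{code:cpp}
#include <cstdint>
#include <optional>
#include <stdexcept>

// Illustrative stand-ins; not the server's actual types.
struct WriteConflictError : std::runtime_error {
    WriteConflictError() : std::runtime_error("WriteConflict") {}
};

struct RecoveryUnitState {
    // Set once the operation has reserved an oplog slot / commit timestamp.
    std::optional<std::uint64_t> commitTimestamp;
};

// Sketch of the guard: if the operation already holds a commit timestamp
// (i.e. an oplog slot), waiting out the prepare conflict could stall oplog
// visibility and, transitively, majority replication. Fail fast with a write
// conflict instead and let the normal retry machinery re-run the operation.
void onPrepareConflict(const RecoveryUnitState& ru) {
    if (ru.commitTimestamp) {
        throw WriteConflictError();
    }
    // Otherwise it is safe to block until the prepared transaction resolves.
}
{code}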
