[SERVER-30530] applyOps triggers invariant on WCE while applying upsert operations atomically
Created: 07/Aug/17  Updated: 30/Oct/23  Resolved: 20/Sep/17

Status: Closed
Project: Core Server
Component/s: Querying
Affects Version/s: 3.5.11
Fix Version/s: 3.6.0-rc0

Type: Bug
Priority: Major - P3
Reporter: Benety Goh
Assignee: Benety Goh
Resolution: Fixed
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Related
is related to SERVER-30049 applyOperation_inlock() allows except... Closed
is related to SERVER-31087 adorn secondary updates with timestamps Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Sprint: Query 2017-09-11, Storage 2017-10-02
Participants:

 Description   

The update query executor runs the document insert inside a WriteUnitOfWork. When run atomically, the applyOps command also wraps lower-level function calls (such as upsert) in a WriteUnitOfWork. This nesting may trigger an invariant failure when a WriteConflictException (WCE) is thrown while inserting the document as part of an upsert.

To reproduce this issue, run the following JS test code using resmoke.py with --repeat=100:

    load("jstests/replsets/libs/apply_ops_insert_write_conflict.js");
 
    new ApplyOpsInsertWriteConflictTest({
        testName: 'apply_ops_insert_write_conflict_atomic',
        atomic: true
    }).run();



 Comments   
Comment by Githook User [ 20/Sep/17 ]

Author:

{'email': 'benety@mongodb.com', 'name': 'Benety Goh', 'username': 'benety'}

Message: SERVER-30530 add js test for write conflict handling in applyOps for single inserts (applied atomically)
Branch: master
https://github.com/mongodb/mongo/commit/aa3b85fd363e77f7fc1f1c3623422a61d7f70ed7

Comment by Githook User [ 19/Sep/17 ]

Author:

{'username': 'benety', 'name': 'Benety Goh', 'email': 'benety@mongodb.com'}

Message: SERVER-30530 WCE retry loop runs argument function without retry logic if in a WUOW

If there's already a WUOW, we assume that we are already running in a WCE retry up the
call stack.
Branch: master
https://github.com/mongodb/mongo/commit/0e2b48d8dd77f972449b89869cdbfabdbc074845
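
The behaviour described in this commit message can be illustrated with a minimal, self-contained sketch. It is not the server's code: ToyOperationContext, ToyWriteConflict, toyWriteConflictRetry and FlakyInsert are hypothetical stand-ins, and the maxAttempts bound exists only to keep the toy loop finite. The assumption is simply that a retry wrapper which finds a unit of work already open should run its argument once and let any conflict propagate to the caller.

```cpp
#include <cassert>
#include <stdexcept>

// Toy stand-in for a write conflict; NOT the server's WriteConflictException.
struct ToyWriteConflict : std::runtime_error {
    ToyWriteConflict() : std::runtime_error("write conflict") {}
};

// Toy stand-in for an operation context that knows whether a unit of work is open.
struct ToyOperationContext {
    int wuowDepth = 0;  // > 0 means some caller already opened a unit of work
    bool inWriteUnitOfWork() const { return wuowDepth > 0; }
};

// Runs `body`, retrying on ToyWriteConflict only when no unit of work is open.
// When one is open, `body` runs exactly once and any conflict propagates to the
// caller, which is assumed to own the real retry loop.
template <typename F>
auto toyWriteConflictRetry(ToyOperationContext& opCtx, int maxAttempts, F&& body)
    -> decltype(body()) {
    if (opCtx.inWriteUnitOfWork()) {
        return body();  // nested: no retry logic here
    }
    for (int attempt = 1;; ++attempt) {
        try {
            return body();
        } catch (const ToyWriteConflict&) {
            if (attempt >= maxAttempts) {
                throw;  // give up after the hypothetical attempt limit
            }
        }
    }
}

// Hypothetical write: conflicts on the first `failures` calls, then succeeds
// and returns the number of calls made so far.
struct FlakyInsert {
    explicit FlakyInsert(int f) : failures(f) {}
    int failures;
    int calls = 0;
    int operator()() {
        ++calls;
        if (calls <= failures) {
            throw ToyWriteConflict();
        }
        return calls;
    }
};
```

With no unit of work open, a FlakyInsert that conflicts twice is called three times and succeeds. With wuowDepth set to 1, the same wrapper calls it once and the conflict escapes to the caller.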

Comment by Eric Milkie [ 18/Sep/17 ]

I believe the best way forward here is to do option 3, to modify the writeConflictRetry wrapper to re-throw WCE if it detects that it is nested in a WUOW.

Comment by David Storch [ 13/Sep/17 ]

Notes from my conversation about this problem and potential fixes with milkie and daniel.gottlieb. As far as we know, this problem is unique to the applyOps command with the allowAtomic flag set to true. This causes the applyOps code path to wrap the ops in a WriteUnitOfWork. When it elects to use the query system in order to actually perform these writes, the query system creates a nested WriteUnitOfWork, with a write conflict retry loop. Nested WriteUnitOfWorks are allowed, but the inner one doesn't actually do anything; the transaction won't roll back or commit until the outer WriteUnitOfWork rolls back or commits. A write conflict retry loop on an inner WUOW is therefore an error, since write conflict exceptions need to be thrown up to the outer WUOW.
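
To make the nesting semantics concrete, here is a small toy model, assuming only what the paragraph above states: an inner unit of work neither commits nor rolls back on its own, and the outcome is decided by the outermost one. ToyStorage, ToyWriteUnitOfWork and durableAfterNestedInsert are hypothetical names, not the server's classes, and an inner unit of work that is abandoned without committing is deliberately not modelled.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy storage: writes sit in `pending` until the outermost unit of work commits.
struct ToyStorage {
    int depth = 0;                       // number of open units of work
    std::vector<std::string> pending;    // buffered, not yet durable
    std::vector<std::string> committed;  // durable
};

// Toy RAII unit of work; only the outermost instance commits or rolls back.
class ToyWriteUnitOfWork {
public:
    explicit ToyWriteUnitOfWork(ToyStorage& storage)
        : _storage(storage), _outermost(storage.depth == 0) {
        ++_storage.depth;
    }

    ~ToyWriteUnitOfWork() {
        --_storage.depth;
        if (_outermost && !_committed) {
            _storage.pending.clear();  // roll back everything, including inner writes
        }
    }

    void commit() {
        _committed = true;
        if (_outermost) {
            _storage.committed.insert(_storage.committed.end(),
                                      _storage.pending.begin(),
                                      _storage.pending.end());
            _storage.pending.clear();
        }
        // Inner commit: intentionally a no-op in this model.
    }

private:
    ToyStorage& _storage;
    bool _outermost;
    bool _committed = false;
};

// Performs one insert inside an inner unit of work that always commits, then
// commits or abandons the outer one. Returns the number of durable documents.
std::size_t durableAfterNestedInsert(bool commitOuter) {
    ToyStorage storage;
    {
        ToyWriteUnitOfWork outer(storage);
        {
            ToyWriteUnitOfWork inner(storage);
            storage.pending.push_back("{_id: 1}");
            inner.commit();  // has no durable effect on its own
        }
        if (commitOuter) {
            outer.commit();
        }
    }
    return storage.committed.size();
}
```

In this model the inner commit alone leaves nothing durable, which is why retrying a conflict at the inner level cannot help: the work can only be retried by whoever owns the outer unit of work.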

There are a number of ways that we might approach fixing this problem:

  • Something like SERVER-28910, where we avoid using the query system entirely in the apply ops path. This would sidestep the issue by avoiding a nested WUOW.
  • Modify the code in update.cpp to avoid the write-conflict retry loop when it is already inside an outer WUOW.
  • Modify writeConflictRetry() to re-throw the WriteConflictException if there is an outer WUOW.
  • Remove support for atomic mode in applyOps, thereby eliminating the outer WUOW. This was once needed for sharding, but may no longer be necessary.

Comment by Ian Whalen (Inactive) [ 05/Sep/17 ]

Currently assigned to Dave to look into what's going on.
