[SERVER-34919] Write conflict between batched inserts within transactions incorrectly throws DuplicateKey error
Created: 09/May/18  Updated: 29/Oct/23  Resolved: 29/May/18

Status: Closed
Project: Core Server
Component/s: Storage
Affects Version/s: None
Fix Version/s: 4.0.0-rc1, 4.1.1

Type: Bug
Priority: Major - P3
Reporter: William Schultz (Inactive)
Assignee: Eric Milkie
Resolution: Fixed
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: File batch_insert_write_conflict.js    
Issue Links:
Backports
Related
is related to SERVER-34648 Write test for all types of transacti... Closed
Backwards Compatibility: Fully Compatible
Backport Requested:
v4.0
Sprint: Storage NYC 2018-06-04
Participants:

 Description   

When running two concurrent multi-document transactions that try to insert document sets with a non-empty intersection, one of the transactions should fail with a WriteConflict, since both transactions try to write to the same document. Instead, a DuplicateKey error appears to be thrown. See the attached repro.

This is the transaction history being tested in the repro script:

Start(T1), Start(T2), Insert(T2, [d1, d2]), Commit(T2), Insert(T1, [d2, d3]), Commit(T1)

The first insert by T1 throws a DuplicateKey error. What is additionally odd is that the DuplicateKey error appears to be thrown on _id: 1, whereas we would expect a WriteConflict error on _id: 2.

e.g.

"errmsg" : "E11000 duplicate key error collection: test.transactions_write_conflicts index: _id_ dup key: { : 1.0 }",
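The expected outcome for the history above can be illustrated with a small in-memory model of snapshot isolation. This is only a sketch: ToyStore and its begin, insert, and commit methods are hypothetical names invented for illustration, not MongoDB APIs, and the model only covers conflicts with transactions that have already committed.

```javascript
// Toy model of snapshot isolation on one collection keyed by _id.
// NOT MongoDB code: ToyStore, begin, insert and commit are hypothetical.
class ToyStore {
    constructor() {
        this.clock = 0;              // logical commit timestamp
        this.committed = new Map();  // _id -> timestamp at which it was committed
    }

    // A transaction reads from the snapshot taken when it starts.
    begin() {
        return {readTs: this.clock, writes: new Set()};
    }

    insert(txn, ids) {
        for (const id of ids) {
            const commitTs = this.committed.get(id);
            if (commitTs !== undefined && commitTs > txn.readTs) {
                // The document was committed after our snapshot: a conflict,
                // not a duplicate, because we cannot see it at our read timestamp.
                throw Object.assign(new Error("WriteConflict on _id " + id),
                                    {codeName: "WriteConflict", id: id});
            }
            if (commitTs !== undefined || txn.writes.has(id)) {
                // The document is visible in our snapshot or our own writes.
                throw Object.assign(new Error("DuplicateKey on _id " + id),
                                    {codeName: "DuplicateKey", id: id});
            }
            txn.writes.add(id);
        }
    }

    commit(txn) {
        this.clock += 1;
        for (const id of txn.writes) {
            this.committed.set(id, this.clock);
        }
    }
}

// Start(T1), Start(T2), Insert(T2, [d1, d2]), Commit(T2), Insert(T1, [d2, d3]), Commit(T1)
function runHistory() {
    const store = new ToyStore();
    const t1 = store.begin();
    const t2 = store.begin();
    store.insert(t2, [1, 2]);
    store.commit(t2);
    try {
        store.insert(t1, [2, 3]);
        store.commit(t1);
        return {ok: 1};
    } catch (e) {
        return {ok: 0, codeName: e.codeName, id: e.id};
    }
}

console.log(runHistory());
```

In this model T1 fails with a WriteConflict on _id: 2, which is the behavior the description expects from the server.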



 Comments   
Comment by Githook User [ 29/May/18 ]

Author: Eric Milkie (milkie) <milkie@10gen.com>

Message: SERVER-34919 do not retry vector insert on write conflict exception, for read concern snapshot

(cherry picked from commit 3312ff09502ceb92d93f65f92d4e823df993a927)
Branch: v4.0
https://github.com/mongodb/mongo/commit/327d3efeda9837bad91e8920a8b35e9bb31c7598

Comment by Githook User [ 29/May/18 ]

Author: Eric Milkie (milkie) <milkie@10gen.com>

Message: SERVER-34919 do not retry vector insert on write conflict exception, for read concern snapshot
Branch: master
https://github.com/mongodb/mongo/commit/3312ff09502ceb92d93f65f92d4e823df993a927

Comment by Eric Milkie [ 24/May/18 ]

I figured this out. It has to do with the way we handle a WCE (WriteConflictException) when doing a vectored insert. On WCE, instead of retrying the entire vectored insert, we devolve into doing individual inserts, and retry individual inserts if we get subsequent WCEs.
This logic doesn't work in a multi-document transaction, because the code assumes the vectored insert's storage transaction is aborted on WCE, when in fact it is not. I'll work on fixing this.
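A toy model of this fallback may help show why the error lands on the wrong document. It is one reading that is consistent with the explanation above and with Case 1 and Case 2 from the 11/May/18 comment; the actual server code is not shown in this ticket, and every name here (makeTxn, insertOne, insertVectorWithFallback, insertVectorNoRetry) is hypothetical.

```javascript
// Toy model of the vectored-insert fallback; NOT the server's code.
// All function names and the data layout are hypothetical.
function makeError(codeName, id) {
    return Object.assign(new Error(codeName + " on _id " + id), {codeName: codeName, id: id});
}

// conflictingIds: _ids committed by a concurrent transaction after our snapshot.
function makeTxn(conflictingIds) {
    return {conflicting: new Set(conflictingIds), ownWrites: new Set()};
}

function insertOne(txn, id) {
    if (txn.conflicting.has(id)) throw makeError("WriteConflict", id);
    if (txn.ownWrites.has(id)) throw makeError("DuplicateKey", id);
    txn.ownWrites.add(id);
}

// On WriteConflict, redo the batch one document at a time. This is only safe
// if the failed batch's writes were rolled back. Inside a multi-document
// transaction this model keeps them (rollbackOnConflict = false).
function insertVectorWithFallback(txn, ids, rollbackOnConflict) {
    const before = new Set(txn.ownWrites);
    try {
        for (const id of ids) insertOne(txn, id);
    } catch (e) {
        if (e.codeName !== "WriteConflict") throw e;
        if (rollbackOnConflict) txn.ownWrites = before;
        for (const id of ids) insertOne(txn, id);  // per-document retry
    }
}

// Direction of the fix per the commit message: do not retry, let the conflict propagate.
function insertVectorNoRetry(txn, ids) {
    for (const id of ids) insertOne(txn, id);
}

function outcome(fn) {
    try {
        fn();
        return "ok";
    } catch (e) {
        return e.codeName + ":" + e.id;
    }
}

// Case 1: documents [1, 2], concurrent transaction committed _id 2.
console.log(outcome(() => insertVectorWithFallback(makeTxn([2]), [1, 2], false)));
// Case 2: documents [2, 1], same conflict.
console.log(outcome(() => insertVectorWithFallback(makeTxn([2]), [2, 1], false)));
// Case 1 again without the retry.
console.log(outcome(() => insertVectorNoRetry(makeTxn([2]), [1, 2])));
```

In the model, Case 1 inserts _id: 1, hits the conflict on _id: 2, and then trips over its own earlier write to _id: 1 during the per-document retry, so the reported error is a DuplicateKey on _id: 1. In Case 2 the conflict is on the first document, nothing has been written yet, and the retry surfaces the WriteConflict as expected.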

Comment by William Schultz (Inactive) [ 11/May/18 ]

Here are a few additional data points, including a more minimal test case. If we replace t1Op and t2Op in the repro we get the following results:

Case 1

t1Op = {insert: collName, documents: [{_id: 1}, {_id: 2}]};
t2Op = {insert: collName, documents: [{_id: 2}]};

Result:

"ok" : 0,
"errmsg" : "E11000 duplicate key error collection: test.transactions_write_conflicts index: _id_ dup key: { : 1.0 }",
"code" : 11000,
"codeName" : "DuplicateKey",

Case 2

t1Op = {insert: collName, documents: [{_id: 2}, {_id: 1}]};
t2Op = {insert: collName, documents: [{_id: 2}]};

Produces expected WriteConflict correctly.

Comment by William Schultz (Inactive) [ 11/May/18 ]

Ah, yes, that's definitely very odd! Sorry I sort of missed that in the diagnosis. Added it to the description.

Comment by Eric Milkie [ 11/May/18 ]

What's even weirder is the key value the conflict is on (this wasn't shown in the description). When I run Will's repro, the Duplicate Key error indicates the duplicate key is from document d1, not d2!

Comment by Spencer Brody (Inactive) [ 09/May/18 ]

Ah, I see, thanks for clarifying.  That does sound like a bug then, especially if the behavior is different between batch and single-inserts.  This also seems like a storage (or maybe query?) bug, so I'm going to leave this assigned to the storage backlog for now.  milkie, let me know if you think this should be picked up by the repl transactions team.

Comment by William Schultz (Inactive) [ 09/May/18 ]

milkie No. The problem goes away (WriteConflict is thrown instead of a DuplicateKey error) when doing single document inserts, like the example you gave.

Comment by Eric Milkie [ 09/May/18 ]

Does the same problem occur without using vectored inserts? That is, doing this:

Start(T1), Start(T2), Insert(T2, [d2]), Commit(T2), Insert(T1, [d2]), Commit(T1)

I'm curious to know how vectored inserts are involved here.

Comment by William Schultz (Inactive) [ 09/May/18 ]

spencer By "second transaction" I assume you mean T1. Since T1 starts before T2 commits, it should execute against a snapshot that doesn't see any of T2's writes. So why would you expect it to produce a DuplicateKey error when it writes to d2, if d2 doesn't exist at its read timestamp? T1 and T2 are concurrent transactions that both try to write to the same document (d2), which should produce a WriteConflict, as far as I understand it. For reference, this case does produce a WriteConflict when each transaction inserts only a single (conflicting) document. So I don't see why it shouldn't when they both do batch inserts.

Comment by Spencer Brody (Inactive) [ 09/May/18 ]

william.schultz I don't follow why this is wrong.  The second transaction gets a duplicate key error from the duplicate _id value.  That seems reasonable to me.
