[SERVER-77506] Sharded multi-document transactions can mismatch data and ShardVersion Created: 26/May/23  Updated: 16/Jan/24  Resolved: 05/Sep/23

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: 4.2.0, 4.4.0, 5.0.0, 6.0.0, 7.0.0-rc0
Fix Version/s: 7.2.0-rc0, 7.0.3, 6.0.13, 5.0.24, 4.4.28

Type: Bug
Priority: Critical - P2
Reporter: Jordi Serra Torrens
Assignee: Randolph Tan
Resolution: Fixed
Votes: 0
Labels: not-7.0-blocker, shardingemea-qw
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: repro.patch
Issue Links:
Backports
Depends
is depended on by SERVER-78952 Revert SERVER-78855 after SERVER-77506 Closed
Problem/Incident
causes SERVER-80279 Commit on non-existing transaction th... Closed
Related
related to SERVER-79193 Investigate if workload added in TIG-... Backlog
related to SERVER-80525 Investigate ClusterBulkWriteCmd not c... Backlog
related to SERVER-80523 Expose OperationSessionInfoFromClient... Backlog
related to SERVER-80524 Move OperationSessionInfo to oplog id... Backlog
related to SERVER-80526 Refactor appendFieldsForStartTransaction Backlog
is related to SERVER-84723 Sharded multi-document transactions c... Closed
is related to SERVER-82353 Multi-document transactions can miss ... Closed
is related to SERVER-78855 Use snapshot isolation for Queryable ... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v7.1, v7.0, v6.0, v5.0, v4.4
Steps To Reproduce:

repro.patch

Sprint: Sharding EMEA 2023-06-12, Sharding EMEA 2023-06-26, Sharding EMEA 2023-07-10, Sharding EMEA 2023-07-24
Participants:
Linked BF Score: 135
Story Points: 2

 Description   

The second and subsequent statements of a multi-document transaction can operate on an already-open storage engine snapshot whose data doesn't match the ShardVersion attached by the router.

For example:

Consider a sharded multi-document transaction with read concern 'local' or 'majority'. The first statement targets collectionA, which exists on shard0. This opens a storage engine snapshot on shard0 at T100.

At T100, shard0 owns only half the range of collectionB. After T100, a chunk migration commits: shard0 becomes the owner of the whole range and its ShardVersion becomes SV2.

The second statement of the transaction targets collectionB. The router routes according to the post-migration placement (shard0 owns the whole range), so it targets only shard0 and attaches SV2.

The shard checks the ShardVersion and sees that it matches. However, the storage snapshot the transaction is operating on was opened at T100 and does not include the documents of the newly received chunk, so the query misses documents.
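
The sequence above can be modeled with a small simulation. This is a toy sketch in Python, not server code: the `Shard` class, its fields, the document keys, and the `pessimistic` flag are hypothetical names invented for illustration. The extra check is loosely modeled on the fix commit message in this ticket ("verify that no chunk has moved for the collection being referenced since transaction started"); the server's actual implementation may differ.

```python
from dataclasses import dataclass, field


@dataclass
class Shard:
    """Toy model of one shard (hypothetical, for illustration only)."""
    # (collection, key) -> timestamp at which the document became visible here
    docs: dict = field(default_factory=dict)
    # collection -> current ShardVersion (an opaque label in this model)
    shard_version: dict = field(default_factory=dict)
    # collection -> timestamp of the latest placement change on this shard
    last_placement_change: dict = field(default_factory=dict)

    def receive_chunk(self, coll, keys, ts, new_version):
        """Commit an incoming chunk at timestamp ts and bump the version."""
        for key in keys:
            self.docs[(coll, key)] = ts
        self.shard_version[coll] = new_version
        self.last_placement_change[coll] = ts

    def find(self, coll, snapshot_ts, router_version, pessimistic=False):
        """Run a transaction statement against the snapshot at snapshot_ts."""
        # Ordinary check: the router's version must match the shard's version.
        if router_version != self.shard_version[coll]:
            raise RuntimeError("ShardVersion mismatch")
        # Assumed extra check: reject if placement changed after the snapshot.
        if pessimistic and self.last_placement_change.get(coll, 0) > snapshot_ts:
            raise RuntimeError("placement changed after the snapshot was opened")
        # Only documents visible at snapshot_ts are returned.
        return sorted(k for (c, k), ts in self.docs.items()
                      if c == coll and ts <= snapshot_ts)


shard0 = Shard()
# Before T100, shard0 owns half of collectionB (keys "a" and "b").
shard0.receive_chunk("collectionB", ["a", "b"], ts=50, new_version="SV1")

# The first statement (on collectionA) opens the snapshot at T100.
snapshot_ts = 100

# After T100, a migration gives shard0 the other half; its version becomes SV2.
shard0.receive_chunk("collectionB", ["c", "d"], ts=150, new_version="SV2")

# The version check passes, yet the T100 snapshot cannot see "c" and "d".
missed = shard0.find("collectionB", snapshot_ts, router_version="SV2")

# With the extra check, the same statement is rejected instead.
try:
    shard0.find("collectionB", snapshot_ts, router_version="SV2", pessimistic=True)
    conflict = False
except RuntimeError:
    conflict = True
```

In this model `missed` holds only the documents shard0 owned at T100, even though the router believed shard0 could answer for the whole range. With `pessimistic=True` the statement fails rather than silently returning partial results.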



 Comments   
Comment by Githook User [ 29/Dec/23 ]

Author:

{'name': 'Jason Zhang', 'email': 'jason.zhang@mongodb.com', 'username': 'jz1242'}

Message: SERVER-77506 Add pessimistic checking to verify that no chunk has moved for the collection being referenced since transaction started

(cherry picked from commit 412d6c94da465c5abea1e35febb256caa0f48b5f)
(cherry picked from commit 430a5258a1ba82ba046dd22f6710b8d9866f0d19)

GitOrigin-RevId: 0e665408cd6b6eddb1e2537ceb3f4bd866bca8ff
Branch: v4.4
https://github.com/mongodb/mongo/commit/59c13b2f33e5a508f0abcfc6eca2d750d06f85e4

Comment by Githook User [ 29/Dec/23 ]

Author:

{'name': 'Kaloian Manassiev', 'email': 'kaloian.manassiev@mongodb.com', 'username': 'kaloianm'}

Message: SERVER-77506 Expose the maxValidAfter timestamp alongside the shardVersion

(cherry picked from commit def5910da36a07e47e7a1c6387c7b7cc57ee5297)
(cherry picked from commit e0f52e1ee6a5525a6282f02d06675ba3b81c1048)

GitOrigin-RevId: 687a32b316c935e647591e08d359e77f362c1988
Branch: v4.4
https://github.com/mongodb/mongo/commit/a226641cb6812a5c7dae7d38cb6da08f8e33be85

Comment by Githook User [ 26/Dec/23 ]

Author:

{'name': 'Jason Zhang', 'email': 'jason.zhang@mongodb.com', 'username': 'jz1242'}

Message: SERVER-77506 Add pessimistic checking to verify that no chunk has moved for the collection being referenced since transaction started

(cherry picked from commit 412d6c94da465c5abea1e35febb256caa0f48b5f)

GitOrigin-RevId: 430a5258a1ba82ba046dd22f6710b8d9866f0d19
Branch: v5.0
https://github.com/mongodb/mongo/commit/995adeb9ad567c953064fe72eb79dd990915485c

Comment by Githook User [ 26/Dec/23 ]

Author:

{'name': 'Kaloian Manassiev', 'email': 'kaloian.manassiev@mongodb.com', 'username': 'kaloianm'}

Message: SERVER-77506 Expose the maxValidAfter timestamp alongside the shardVersion

(cherry picked from commit def5910da36a07e47e7a1c6387c7b7cc57ee5297)

GitOrigin-RevId: e0f52e1ee6a5525a6282f02d06675ba3b81c1048
Branch: v5.0
https://github.com/mongodb/mongo/commit/d50cceaa473029c375b9bf036d73c76ca0eeecee

Comment by Githook User [ 21/Dec/23 ]

Author:

{'name': 'Jason Zhang', 'email': 'jason.zhang@mongodb.com', 'username': 'jz1242'}

Message: SERVER-77506 Add pessimistic checking to verify that no chunk has moved for the collection being referenced since transaction started

GitOrigin-RevId: 412d6c94da465c5abea1e35febb256caa0f48b5f
Branch: v6.0
https://github.com/mongodb/mongo/commit/914844e7fd8622edf4dbdef71a7f232a0bcb4932

Comment by Githook User [ 21/Dec/23 ]

Author:

{'name': 'Kaloian Manassiev', 'email': 'kaloian.manassiev@mongodb.com', 'username': 'kaloianm'}

Message: SERVER-77506 Expose the maxValidAfter timestamp alongside the shardVersion

(cherry picked from commit da006e938e12bd81f1fc2a6a5a86b7b177dfe66f)

GitOrigin-RevId: def5910da36a07e47e7a1c6387c7b7cc57ee5297
Branch: v6.0
https://github.com/mongodb/mongo/commit/2c8d88578b4b4ea55ad1130182dca5e5bbddacea

Comment by Githook User [ 05/Oct/23 ]

Author:

{'name': 'Randolph Tan', 'email': 'randolph@10gen.com', 'username': 'renctan'}

Message: SERVER-77506 Add pessimistic checking to verify that no chunk has moved for the collection being referenced since transaction started

(cherry picked from commit 4086290c7cee228b9bf53ec0ecc6c7bc48f7e65b)

Co-authored-by: Kaloian Manassiev <kaloian.manassiev@mongodb.com>

This additionally includes the changes to transaction_router.h,
transaction_router.cpp, and cluster_commands_helpers.cpp from commit
21bebca30d1682c29e88cf8c520f0793b13e0431.
Branch: v7.0
https://github.com/mongodb/mongo/commit/f9886d6dbfec4bf149160d687f92272980bcd589

Comment by Githook User [ 05/Oct/23 ]

Author:

{'name': 'Kaloian Manassiev', 'email': 'kaloian.manassiev@mongodb.com', 'username': 'kaloianm'}

Message: SERVER-77506 Expose the maxValidAfter timestamp alongside the shardVersion

(cherry picked from commit da006e938e12bd81f1fc2a6a5a86b7b177dfe66f)
Branch: v7.0
https://github.com/mongodb/mongo/commit/4e3bdd5c03b0083df672f4dab6a7c05c4d26884b

Comment by Githook User [ 28/Sep/23 ]

Author:

{'name': 'Randolph Tan', 'email': 'randolph@10gen.com', 'username': 'renctan'}

Message: SERVER-77506 Remove idl changes that will no longer be needed
Branch: v7.0
https://github.com/mongodb/mongo/commit/b8b7da6e42e40ffd4f23e84c6d8c2c470a60fff9

Comment by Githook User [ 29/Aug/23 ]

Author:

{'name': 'Randolph Tan', 'email': 'randolph@10gen.com', 'username': 'renctan'}

Message: SERVER-77506 Replaced todo with new server ticket numbers
Branch: master
https://github.com/mongodb/mongo/commit/9b85dd2a7feac4e21f41ebbd5e556d3fcda8d707

Comment by Randolph Tan [ 16/Aug/23 ]

Assigning back to kaloian.manassiev@mongodb.com to clean up the TODOs.

Comment by Githook User [ 15/Aug/23 ]

Author:

{'name': 'Randolph Tan', 'email': 'randolph@10gen.com', 'username': 'renctan'}

Message: SERVER-77506 Add pessimistic checking to verify that no chunk has moved for the collection being referenced since transaction started
Branch: master
https://github.com/mongodb/mongo/commit/4086290c7cee228b9bf53ec0ecc6c7bc48f7e65b

Comment by Githook User [ 25/Jul/23 ]

Author:

{'name': 'Kshitij Gupta', 'email': 'kshitij.gupta@mongodb.com', 'username': 'kshitijng'}

Message: Revert "SERVER-77506 Expose the maxValidAfter timestamp alongside the shardVersion"

This reverts commit 268b1ed64cc650a43f321bd484c986808d6a2f7f.
Branch: v7.0
https://github.com/mongodb/mongo/commit/0370db4cce82297b0b442d370bb7c87c4c8a64ac

Comment by Githook User [ 25/Jul/23 ]

Author:

{'name': 'Kshitij Gupta', 'email': 'kshitij.gupta@mongodb.com', 'username': 'kshitijng'}

Message: Revert "SERVER-77506 Require OperationSessionInfoFromClient to be constructed with min required arguments"

This reverts commit bb4f2cb84103350963eb4f535e80bdbdbb435b04.
Branch: v7.0
https://github.com/mongodb/mongo/commit/7b26fd264d15318da20d3c1f0fa728307406019b

Comment by Githook User [ 25/Jul/23 ]

Author:

{'name': 'Kshitij Gupta', 'email': 'kshitij.gupta@mongodb.com', 'username': 'kshitijng'}

Message: Revert "SERVER-77506 Rewrite TransactionRouter::appendFieldsForStartTransaction in a more linear form"

This reverts commit a732e558898abdf042d73c8a59c2ca4e53779617.
Branch: v7.0
https://github.com/mongodb/mongo/commit/7b0f219d87c738b7286d17cba53ef1d418640551

Comment by Githook User [ 15/Jul/23 ]

Author:

{'name': 'Randolph Tan', 'email': 'randolph@10gen.com', 'username': 'renctan'}

Message: SERVER-77506 Move placementConflictTime from OperationSessionInfo to OperationSessionInfoFromClientBase

(cherry picked from commit dc5e082d50298c90f3cadffa73648ca104515e2b)
Branch: v7.0
https://github.com/mongodb/mongo/commit/6fdbe21f6eadd3ee2114fa600288e1c5287aa1f8

Comment by Githook User [ 15/Jul/23 ]

Author:

{'name': 'Randolph Tan', 'email': 'randolph@10gen.com', 'username': 'renctan'}

Message: SERVER-77506 Move placementConflictTime from OperationSessionInfo to OperationSessionInfoFromClientBase
Branch: master
https://github.com/mongodb/mongo/commit/dc5e082d50298c90f3cadffa73648ca104515e2b

Comment by Githook User [ 14/Jul/23 ]

Author:

{'name': 'Randolph Tan', 'email': 'randolph@10gen.com', 'username': 'renctan'}

Message: SERVER-77506 Add idl for placementConflictTime

(cherry picked from commit 33a995522e187082ae6359c3335a165049c40314)
Branch: v7.0
https://github.com/mongodb/mongo/commit/b78c61fedfd490edcd1cf839fcde777e396b819f

Comment by Githook User [ 14/Jul/23 ]

Author:

{'name': 'Randolph Tan', 'email': 'randolph@10gen.com', 'username': 'renctan'}

Message: SERVER-77506 Add idl for placementConflictTime
Branch: master
https://github.com/mongodb/mongo/commit/33a995522e187082ae6359c3335a165049c40314

Comment by Tyler Brock [ 11/Jul/23 ]

Removing the 7.0-required and rapid-response labels because we're able to mitigate the increased risk of customers encountering this issue in 7.0 with SERVER-78855

Comment by Githook User [ 08/Jul/23 ]

Author:

{'name': 'Kaloian Manassiev', 'email': 'kaloian.manassiev@mongodb.com', 'username': 'kaloianm'}

Message: SERVER-77506 Rewrite TransactionRouter::appendFieldsForStartTransaction in a more linear form

(cherry picked from commit 21bebca30d1682c29e88cf8c520f0793b13e0431)
Branch: v7.0
https://github.com/mongodb/mongo/commit/a732e558898abdf042d73c8a59c2ca4e53779617

Comment by Githook User [ 07/Jul/23 ]

Author:

{'name': 'Kaloian Manassiev', 'email': 'kaloian.manassiev@mongodb.com', 'username': 'kaloianm'}

Message: SERVER-77506 Require OperationSessionInfoFromClient to be constructed with min required arguments

(cherry picked from commit 1e0d479c37c6285cf7d1922413fbcd4e0b83acf8)
Branch: v7.0
https://github.com/mongodb/mongo/commit/bb4f2cb84103350963eb4f535e80bdbdbb435b04

Comment by Githook User [ 07/Jul/23 ]

Author:

{'name': 'Kaloian Manassiev', 'email': 'kaloian.manassiev@mongodb.com', 'username': 'kaloianm'}

Message: SERVER-77506 Rewrite TransactionRouter::appendFieldsForStartTransaction in a more linear form
Branch: master
https://github.com/mongodb/mongo/commit/21bebca30d1682c29e88cf8c520f0793b13e0431

Comment by Githook User [ 07/Jul/23 ]

Author:

{'name': 'Kaloian Manassiev', 'email': 'kaloian.manassiev@mongodb.com', 'username': 'kaloianm'}

Message: SERVER-77506 Expose the maxValidAfter timestamp alongside the shardVersion

(cherry picked from commit da006e938e12bd81f1fc2a6a5a86b7b177dfe66f)
Branch: v7.0
https://github.com/mongodb/mongo/commit/268b1ed64cc650a43f321bd484c986808d6a2f7f

Comment by Githook User [ 04/Jul/23 ]

Author:

{'name': 'Kaloian Manassiev', 'email': 'kaloian.manassiev@mongodb.com', 'username': 'kaloianm'}

Message: SERVER-77506 Require OperationSessionInfoFromClient to be constructed with min required arguments
Branch: master
https://github.com/mongodb/mongo/commit/1e0d479c37c6285cf7d1922413fbcd4e0b83acf8

Comment by Githook User [ 03/Jul/23 ]

Author:

{'name': 'Kaloian Manassiev', 'email': 'kaloian.manassiev@mongodb.com', 'username': 'kaloianm'}

Message: SERVER-77506 Expose the maxValidAfter timestamp alongside the shardVersion
Branch: master
https://github.com/mongodb/mongo/commit/da006e938e12bd81f1fc2a6a5a86b7b177dfe66f

Comment by PM Bot [ 13/Jun/23 ]

This issue has been flagged for rapid response!

Assignees of rapid response tickets are responsible for providing a daily update on this issue using the 'Server Rapid Response' canned comment template.

Any questions about this ticket can be directed to the #server-rapid-response Slack channel and more information on the Server Rapid Response process can be found on the Wiki 
