[SERVER-78050] Chunk Migration Can Lose Data If Processing Deferred Modifications Created: 13/Jun/23  Updated: 29/Oct/23  Resolved: 15/Jun/23

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: 4.4.19, 5.0.15, 6.3.0-rc0, 6.0.5, 7.1.0-rc0, 7.0.0-rc4
Fix Version/s: 7.1.0-rc0, 6.0.7, 5.0.19, 4.4.23, 7.0.0-rc4

Type: Bug Priority: Critical - P2
Reporter: Brett Nawrocki Assignee: Brett Nawrocki
Resolution: Fixed Votes: 0
Labels: sharding-nyc-subteam1
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Depends
Problem/Incident
is caused by SERVER-71219 Migration can miss writes from prepar... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v7.0
Sprint: Sharding NYC 2023-06-26
Participants:
Linked BF Score: 135

 Description   

If an update is performed in a transaction, we can't immediately read back the document if we don't already have the post-image, so we instead insert it into a list of deferred updates. These deferred updates will be processed in nextModsBatch by reading the latest version of the document when the recipient shard calls _transferMods.

Next, the ids of the documents that have changed will be pulled from the update list, and those documents will be read in order to get the latest state to transfer to the recipient.

However, if we have already read documents while processing the deferred updates, a snapshot will have been opened and pinned to the operation context, and it continues to be used when we later read the latest state of the documents in the update list.

This means that it is possible for a document to be read from a stale snapshot, and its update lost, during the following sequence of events:

  1. Deferred updates are processed, a snapshot is opened
  2. Another thread updates a document in the chunk being moved, adding its id to the update list
  3. The update list is spliced in nextModsBatch
  4. The state of the documents in the update list is read using the same snapshot as in step 1, prior to the update being made in step 2
  5. The update is lost

Calling abandonSnapshot on the operation context's recovery unit after splicing the update list should be sufficient to ensure that we will read from a snapshot at least as recent as the updates in the list, though it's not clear if this is the best long-term solution to the problem.
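
The sequence above, and the effect of abandoning the snapshot after the splice, can be illustrated with a small toy model. This is only a sketch and not the server code: ToyStore, ToyRecoveryUnit, next_mods_batch and the abandon_after_splice flag are hypothetical names invented for illustration, the concurrent writer is simulated by a callback rather than a real thread, and the storage engine is reduced to a list of immutable versions.

```python
class ToyStore:
    """Hypothetical versioned store: every write commits a new immutable version."""

    def __init__(self):
        self._versions = [{}]

    def write(self, doc_id, doc):
        latest = dict(self._versions[-1])  # copy so older snapshots stay unchanged
        latest[doc_id] = doc
        self._versions.append(latest)

    def open_snapshot(self):
        return self._versions[-1]


class ToyRecoveryUnit:
    """Hypothetical stand-in for a recovery unit: pins a snapshot on first read."""

    def __init__(self, store):
        self._store = store
        self._snapshot = None

    def read(self, doc_id):
        if self._snapshot is None:
            self._snapshot = self._store.open_snapshot()
        return self._snapshot.get(doc_id)

    def abandon_snapshot(self):
        # The next read will open a fresh snapshot.
        self._snapshot = None


def next_mods_batch(ru, deferred_ids, update_list, concurrent_write,
                    abandon_after_splice):
    batch = {}
    # Step 1: processing deferred updates reads documents and pins a snapshot.
    for doc_id in deferred_ids:
        batch[doc_id] = ru.read(doc_id)
    # Step 2: "another thread" updates a document and records its id.
    concurrent_write()
    # Step 3: splice the update list.
    spliced = list(update_list)
    update_list.clear()
    if abandon_after_splice:
        ru.abandon_snapshot()
    # Step 4: read the current state of the documents in the spliced list.
    for doc_id in spliced:
        batch[doc_id] = ru.read(doc_id)
    return batch


def run(abandon_after_splice):
    store = ToyStore()
    store.write("a", {"_id": "a", "v": 1})
    store.write("b", {"_id": "b", "v": 1})
    ru = ToyRecoveryUnit(store)
    update_list = []

    def concurrent_write():
        store.write("b", {"_id": "b", "v": 2})
        update_list.append("b")

    return next_mods_batch(ru, ["a"], update_list, concurrent_write,
                           abandon_after_splice)


stale = run(abandon_after_splice=False)
fresh = run(abandon_after_splice=True)
print(stale["b"]["v"])  # 1: read from the snapshot pinned in step 1
print(fresh["b"]["v"])  # 2: read from a snapshot opened after the splice
```

In this model, the run without the abandon step transfers the pre-update version of document "b" even though its id was in the spliced list, which corresponds to step 5 above.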



 Comments   
Comment by Githook User [ 16/Jun/23 ]

Author:

{'name': 'Brett Nawrocki', 'email': 'brett.nawrocki@mongodb.com', 'username': 'brettnawrocki'}

Message: SERVER-78050 Ensure _transferMods reads from latest snapshot

(cherry picked from commit e015af264b295b9266b0ddf0197cb4da3c3fa0fd)
Branch: v5.0
https://github.com/mongodb/mongo/commit/916bc197e430da7bc73fcfe20f9a712b253d7558

Comment by Githook User [ 16/Jun/23 ]

Author:

{'name': 'Brett Nawrocki', 'email': 'brett.nawrocki@mongodb.com', 'username': 'brettnawrocki'}

Message: SERVER-78050 Ensure _transferMods reads from latest snapshot

(cherry picked from commit e015af264b295b9266b0ddf0197cb4da3c3fa0fd)
Branch: v4.4
https://github.com/mongodb/mongo/commit/173959ef72626005fe213f606d9e6e8fa26843b2

Comment by Githook User [ 15/Jun/23 ]

Author:

{'name': 'Brett Nawrocki', 'email': 'brett.nawrocki@mongodb.com', 'username': 'brettnawrocki'}

Message: SERVER-78050 Ensure _transferMods reads from latest snapshot

(cherry picked from commit e015af264b295b9266b0ddf0197cb4da3c3fa0fd)
Branch: v6.0
https://github.com/mongodb/mongo/commit/9baeeaf3792c8d656cfab7df3035d467ecd2ebad

Comment by Githook User [ 15/Jun/23 ]

Author:

{'name': 'Brett Nawrocki', 'email': 'brett.nawrocki@mongodb.com', 'username': 'brettnawrocki'}

Message: SERVER-78050 Ensure _transferMods reads from latest snapshot

(cherry picked from commit e015af264b295b9266b0ddf0197cb4da3c3fa0fd)
Branch: v7.0
https://github.com/mongodb/mongo/commit/634d301f10ef860dc6de0dacb92a23751c3d6a12

Comment by Githook User [ 15/Jun/23 ]

Author:

{'name': 'Brett Nawrocki', 'email': 'brett.nawrocki@mongodb.com', 'username': 'brettnawrocki'}

Message: SERVER-78050 Ensure _transferMods reads from latest snapshot
Branch: master
https://github.com/mongodb/mongo/commit/e015af264b295b9266b0ddf0197cb4da3c3fa0fd

Generated at Thu Feb 08 06:37:20 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.