[SERVER-80236] Race in migration source registration and capturing writes for xferMods for deletes
Created: 18/Aug/23  Updated: 29/Oct/23  Resolved: 31/Aug/23

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: 4.2.0, 4.4.0, 5.0.0, 6.0.0, 7.0.0
Fix Version/s: 7.2.0-rc0, 7.0.2, 7.1.0-rc1, 5.0.22, 6.0.11, 4.4.26

Type: Bug
Priority: Critical - P2
Reporter: Randolph Tan
Assignee: Randolph Tan
Resolution: Fixed
Votes: 0
Labels: sharding-nyc-subteam1
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: File test.js    
Issue Links:
Backports
Depends
Related
related to SERVER-80680 remove no-op MigrationChunkClonerSour... Closed
is related to SERVER-38284 Remove donor collection X-lock acquis... Closed
Assigned Teams:
Sharding NYC
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v7.1, v7.0, v6.0, v5.0, v4.4
Participants:
Linked BF Score: 108

 Description   

The migration cloner installs itself on the CollectionShardingRuntime (csr) while holding the csr lock in exclusive mode. The op observers take the csr lock in shared mode to look up the cloner and capture the write. However, an op observer skips the capture entirely if it cannot find the cloner. Therefore the following scenario is possible:

  1. Doc A is inserted.
  2. ThreadA starts a WriteUnitOfWork (wuow) and performs a delete on doc A.
  3. The delete checks for the cloner, finds that none is attached to the csr, and decides not to capture the write.
  4. Migration starts and installs the cloner on the csr.
  5. The migration cloner scans the collection and clones doc A, which is sent to the recipient shard.
  6. ThreadA commits the wuow.
  7. Migration finishes, and the delete performed by ThreadA is never transferred, so doc A remains on the recipient shard.
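
The interleaving above can be replayed deterministically in a small model. This is only an illustrative sketch, not the server code: `Cloner`, `Donor`, and `run_scenario` are hypothetical names, the csr lock is not modelled (each step runs sequentially in the problematic order), and the "recipient" is a plain dictionary.

```python
class Cloner:
    """Hypothetical stand-in for the migration chunk cloner."""

    def __init__(self):
        self.cloned = {}     # documents copied by the initial clone scan
        self.xfer_mods = []  # writes captured for later transfer


class Donor:
    """Hypothetical stand-in for the donor shard's collection state."""

    def __init__(self):
        self.docs = {}
        self.cloner = None   # models the cloner slot on the csr


def run_scenario():
    donor = Donor()

    # 1. Doc A is inserted.
    donor.docs["A"] = {"_id": "A"}

    # 2-3. ThreadA starts its delete and checks for the cloner *now*.
    #      No cloner is attached yet, so it decides not to capture.
    capture = donor.cloner is not None
    pending_delete = "A"

    # 4. Migration starts and installs the cloner.
    donor.cloner = Cloner()

    # 5. The clone scan still sees doc A (the delete is uncommitted)
    #    and ships it to the recipient.
    donor.cloner.cloned = dict(donor.docs)
    recipient = dict(donor.cloner.cloned)

    # 6. ThreadA commits; the stale decision from step 3 is applied.
    del donor.docs[pending_delete]
    if capture:
        donor.cloner.xfer_mods.append(("d", pending_delete))

    # 7. Migration transfers the captured mods and finishes.
    for op, key in donor.cloner.xfer_mods:
        if op == "d":
            recipient.pop(key, None)

    return donor.docs, recipient


if __name__ == "__main__":
    donor_docs, recipient_docs = run_scenario()
    print("donor:", donor_docs)
    print("recipient:", recipient_docs)
```

In this model the donor ends up without doc A while the recipient still holds it, which is the divergence the ticket describes.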


 Comments   
Comment by Githook User [ 28/Sep/23 ]

Author:

{'name': 'Randolph Tan', 'email': 'randolph@10gen.com', 'username': 'renctan'}

Message: SERVER-80236 Move all xferMods logic to the recovery unit onCommit callback

(cherry picked from commit f343f8dd0efbd885aa1db8a26de7018a84345689)
Branch: v4.4
https://github.com/mongodb/mongo/commit/ae14c10e4bee34063c3211758ee2eb876fce9259

Comment by Githook User [ 14/Sep/23 ]

Author:

{'name': 'Randolph Tan', 'email': 'randolph@10gen.com', 'username': 'renctan'}

Message: SERVER-80236 Move all xferMods logic to the recovery unit onCommit callback

(cherry picked from commit c6961408c5fba8783acf6c5eb507ccc769c69c05)
Branch: v5.0
https://github.com/mongodb/mongo/commit/f343f8dd0efbd885aa1db8a26de7018a84345689

Comment by Githook User [ 14/Sep/23 ]

Author:

{'name': 'Randolph Tan', 'email': 'randolph@10gen.com', 'username': 'renctan'}

Message: SERVER-80236 Move all xferMods logic to the recovery unit onCommit callback

(cherry picked from commit dcac81ff8729972a057f42d2b074889524d62467)
Branch: v6.0
https://github.com/mongodb/mongo/commit/c6961408c5fba8783acf6c5eb507ccc769c69c05

Comment by Githook User [ 06/Sep/23 ]

Author:

{'name': 'Randolph Tan', 'email': 'randolph@10gen.com', 'username': 'renctan'}

Message: SERVER-80236 Move all xferMods logic to the recovery unit onCommit callback

(cherry picked from commit 1c690ead56668593cb741aba0a78ba212df74fd1)
Branch: v7.0
https://github.com/mongodb/mongo/commit/dcac81ff8729972a057f42d2b074889524d62467

Comment by Githook User [ 05/Sep/23 ]

Author:

{'name': 'Randolph Tan', 'email': 'randolph@10gen.com', 'username': 'renctan'}

Message: SERVER-80236 Move all xferMods logic to the recovery unit onCommit callback

(cherry picked from commit 1c690ead56668593cb741aba0a78ba212df74fd1)
Branch: v7.1
https://github.com/mongodb/mongo/commit/63397436024403c6aa46d2a122e80884244e8444

Comment by Githook User [ 30/Aug/23 ]

Author:

{'name': 'Randolph Tan', 'email': 'randolph@10gen.com', 'username': 'renctan'}

Message: SERVER-80236 Move all xferMods logic to the recovery unit onCommit callback
Branch: master
https://github.com/mongodb/mongo/commit/1c690ead56668593cb741aba0a78ba212df74fd1

Comment by Randolph Tan [ 18/Aug/23 ]

It looks like this only affects deletes. For updates and inserts, the migration op observer is called after the oplog time slots have already been reserved. So even if the write was not captured, the cloner will still see it, because the cloner waits for replication and therefore ends up waiting for the oplog hole to be filled. Deletes are different because of aboutToDelete, which saves the decision to skip capturing the write and can run before any opTime is reserved. As a result, the cloner will not wait for a delete that decided ahead of time to skip capturing the op.
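
The difference between deciding early and deciding at commit time, which is the direction the commit message ("Move all xferMods logic to the recovery unit onCommit callback") points to, can be sketched as follows. This is a toy model under stated assumptions, not the actual fix: `RecoveryUnit`, `Collection`, `delete_early_decision`, `delete_commit_time_decision`, and `run` are hypothetical names, and locking and oplog slot reservation are left out.

```python
class RecoveryUnit:
    """Toy recovery unit that runs registered callbacks on commit."""

    def __init__(self):
        self._callbacks = []

    def on_commit(self, fn):
        self._callbacks.append(fn)

    def commit(self):
        for fn in self._callbacks:
            fn()
        self._callbacks.clear()


class Collection:
    """Toy collection state; cloner_mods is None while no cloner is installed."""

    def __init__(self):
        self.cloner_mods = None


def delete_early_decision(coll, ru, doc_id):
    # Mirrors the problematic pattern: the capture decision is saved
    # before the write commits, as aboutToDelete is described as doing.
    should_capture = coll.cloner_mods is not None

    def callback():
        if should_capture and coll.cloner_mods is not None:
            coll.cloner_mods.append(("d", doc_id))

    ru.on_commit(callback)


def delete_commit_time_decision(coll, ru, doc_id):
    # The cloner lookup happens inside the commit callback, so a cloner
    # installed between the delete and the commit is still found.
    def callback():
        if coll.cloner_mods is not None:
            coll.cloner_mods.append(("d", doc_id))

    ru.on_commit(callback)


def run(delete_fn):
    coll = Collection()
    ru = RecoveryUnit()
    delete_fn(coll, ru, "A")  # delete performed, not yet committed
    coll.cloner_mods = []     # migration installs the cloner
    ru.commit()               # write commits afterwards
    return coll.cloner_mods


if __name__ == "__main__":
    print("early decision:", run(delete_early_decision))
    print("commit-time decision:", run(delete_commit_time_decision))
```

With the early decision the captured list stays empty, while the commit-time lookup records the delete for transfer.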
