[SERVER-76720] Chunk Migration migrates the session history for the migrating session leading to a deadlock Created: 01/May/23  Updated: 29/Oct/23  Resolved: 08/May/23

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 7.1.0-rc0, 7.0.0-rc1

Type: Bug Priority: Major - P3
Reporter: Rachita Dhawan Assignee: Randolph Tan
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Depends
Related
related to SERVER-77332 Reevaulate migration destination sess... Backlog
is related to SERVER-76836 setAllowMigrations is executing remot... Closed
Assigned Teams:
Sharding NYC
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v7.0
Sprint: Sharding NYC 2023-05-15
Participants:
Linked BF Score: 145

 Description   

The chunk migration protocol always migrates the $incompleteOplogHistory noop oplog entry, regardless of whether:
1. It is from the same namespace that is being migrated
2. It belongs to the chunk range being migrated

Steps taken while migrating oplog entries:
1. The SessionCatalogMigrationSource tries to traverse the oplog chain for every session that exists on the donor
2. The SessionCatalogMigrationSource generates a $incompleteOplogHistory noop oplog entry for every transaction that it cannot find the next oplog entry for (here), for example, because of oplog truncation.
3. The SessionCatalogMigrationSource doesn't filter out $incompleteOplogHistory noop oplog entries (i.e. the namespace and chunk range check doesn't apply).
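
The three steps above can be sketched as a small Python model. This is an illustrative simplification and not the server's C++ code: `OplogEntry`, `should_migrate`, and all field names are hypothetical; only the `$incompleteOplogHistory` marker comes from this ticket.

```python
from dataclasses import dataclass


# Hypothetical stand-in for an oplog entry; field names are illustrative only.
@dataclass
class OplogEntry:
    session_id: str
    ns: str
    shard_key: int
    incomplete_history: bool = False  # models the $incompleteOplogHistory noop


def should_migrate(entry, ns, chunk_min, chunk_max):
    """Model of the source-side filter described in the steps above."""
    if entry.incomplete_history:
        # Sentinel entries bypass the namespace and chunk range checks.
        return True
    return entry.ns == ns and chunk_min <= entry.shard_key < chunk_max


entries = [
    OplogEntry("s1", "db.coll", 5),                             # in namespace and range
    OplogEntry("s2", "db.other", 5),                            # wrong namespace, filtered out
    OplogEntry("s3", "db.other", 500, incomplete_history=True), # sentinel, always sent
]
migrated = [e.session_id for e in entries if should_migrate(e, "db.coll", 0, 10)]
print(migrated)  # ['s1', 's3']
```

In this model the sentinel for session `s3` is sent even though it matches neither the namespace nor the chunk range, which is the behavior the description reports.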

The receiving shard always checks out the session it is migrating.

Then, as part of migrating oplog entries, we also check out the session that is being migrated. Since that session is already checked out, this causes a server hang (here).
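
The hang can be illustrated with a toy model of an exclusive session check-out. This is only a sketch under assumptions: `SessionCatalog`, `check_out`, and `check_in` here are hypothetical Python stand-ins and not the server's API, and the timeout exists only so the example terminates instead of blocking forever.

```python
import threading


class SessionCatalog:
    """Toy catalog: each session can be checked out by one holder at a time."""

    def __init__(self):
        self._locks = {}
        self._guard = threading.Lock()

    def check_out(self, session_id, timeout=None):
        with self._guard:
            lock = self._locks.setdefault(session_id, threading.Lock())
        if timeout is None:
            return lock.acquire()  # blocks until the session is free
        # The timeout only lets this sketch report the would-be hang.
        return lock.acquire(timeout=timeout)

    def check_in(self, session_id):
        self._locks[session_id].release()


catalog = SessionCatalog()
# The recipient checks out the session used by the migration itself.
first = catalog.check_out("migration-session")
# Applying migrated history for that same session needs a second check-out.
second = catalog.check_out("migration-session", timeout=0.1)
print(first, second)  # True False
catalog.check_in("migration-session")
```

Without the timeout, the second check-out would wait on a session that its own caller already holds, so it could never make progress.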

This seems to violate the assumption that the chunk migration protocol can never migrate the session that the migration itself is using (due to how incomplete session history is handled).
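
The commit message on this ticket ("Don't send session tied to the current moveChunk during session migration") suggests restoring that assumption on the donor side. A minimal sketch of such a filter follows; the names `sessions_to_migrate` and `move_chunk_session_id` are hypothetical, and the sketch makes no claim to match the actual change.

```python
def sessions_to_migrate(donor_session_ids, move_chunk_session_id):
    """Return the donor sessions to send, skipping the moveChunk's own session."""
    return [sid for sid in donor_session_ids if sid != move_chunk_session_id]


result = sessions_to_migrate(["s1", "move-chunk", "s3"], "move-chunk")
print(result)  # ['s1', 's3']
```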



 Comments   
Comment by Githook User [ 08/May/23 ]

Author:

{'name': 'Randolph Tan', 'email': 'randolph@10gen.com', 'username': 'renctan'}

Message: SERVER-76720 Don't send session tied to the current moveChunk during session migration

(cherry picked from commit 22707c4f3a2c26529d5487e4d036aa3cbcb3ff2e)
Branch: v7.0
https://github.com/mongodb/mongo/commit/1c5aac135c035b7a7278a2dc49b44aa371e1cd1b

Comment by Githook User [ 05/May/23 ]

Author:

{'name': 'Randolph Tan', 'email': 'randolph@10gen.com', 'username': 'renctan'}

Message: SERVER-76720 Don't send session tied to the current moveChunk during session migration
Branch: master
https://github.com/mongodb/mongo/commit/22707c4f3a2c26529d5487e4d036aa3cbcb3ff2e
