[SERVER-84761] MigrationSourceManager may fail to emit the migrateChunkToNewShard due to stale ChunkManager info Created: 11/Jan/24  Updated: 06/Feb/24

Status: Open
Project: Core Server
Component/s: None
Affects Version/s: 5.3.0, 3.6.0, 4.0.0, 4.2.0, 4.4.0, 5.0.0, 5.2.0, 5.1.0, 6.0.0, 6.1.0, 6.2.0-rc0, 6.3.0, 7.0.0, 7.1.0, 7.3.0-rc0
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Paolo Polato Assignee: Paolo Polato
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
related to SERVER-74032 MigrationSource manager should emit o... In Progress
is related to SERVER-85914 Inspect all usages of CollectionMetad... Backlog
Assigned Teams:
Catalog and Routing
Operating System: ALL
Sprint: CAR Team 2024-02-05, CAR Team 2024-02-19
Participants:
Story Points: 2

 Description   

The scenario may be reproduced through the following sequence:

1. A migration with "nss: collName, from: shardId1, to: shardId2" starts. The MigrationSourceManager is instantiated on shardId1 and retrieves the placement information; at the time of the retrieval, shardId2 owns exactly one chunk of the collection
2. A concurrent migration with "nss: collName, from: shardId2, to: anotherShardId" gets committed; shardId2 loses its last chunk (and the related op entry gets emitted)
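3. The migration started in step 1 resumes. Since shardId2 is now reacquiring its first chunk of the collection, a migrateChunkToNewShard should be emitted, but the placement information evaluated at this point is the stale one retrieved in step 1 (which still shows shardId2 owning a chunk), so the entry may not be emitted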
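
The sequence above can be illustrated with a small toy model. This is only a sketch of the race, not the server code: the Placement class, the should_emit_new_shard_entry helper, and the chunk names chunkA and chunkB are hypothetical, and the check "the recipient owns zero chunks before the commit" is an assumption about how the decision is made.

```python
from dataclasses import dataclass, field


@dataclass
class Placement:
    """Hypothetical stand-in for placement info: chunk name -> owning shard."""
    owner_by_chunk: dict = field(default_factory=dict)

    def chunks_owned_by(self, shard):
        return sum(1 for owner in self.owner_by_chunk.values() if owner == shard)

    def snapshot(self):
        # Independent copy, like placement info cached when the migration starts.
        return Placement(dict(self.owner_by_chunk))


def should_emit_new_shard_entry(placement, recipient):
    # Assumption: the recipient is a "new shard" if it owns no chunk before the commit.
    return placement.chunks_owned_by(recipient) == 0


def run_scenario():
    live = Placement({"chunkA": "shardId1", "chunkB": "shardId2"})

    # Step 1: the migration shardId1 -> shardId2 starts and caches the placement.
    cached = live.snapshot()

    # Step 2: a concurrent migration moves shardId2's last chunk away.
    live.owner_by_chunk["chunkB"] = "anotherShardId"

    # Step 3: the first migration resumes and evaluates the emission condition.
    stale_decision = should_emit_new_shard_entry(cached, "shardId2")
    fresh_decision = should_emit_new_shard_entry(live, "shardId2")

    # The commit itself: chunkA moves to shardId2.
    live.owner_by_chunk["chunkA"] = "shardId2"
    return stale_decision, fresh_decision


if __name__ == "__main__":
    stale, fresh = run_scenario()
    print("decision from cached placement:", stale)
    print("decision from current placement:", fresh)
```

In this model the cached placement still shows shardId2 owning a chunk, so the stale decision is "do not emit", while the current placement would lead to "emit".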

