Type: Bug
Resolution: Fixed
Priority: Major - P3
Fix Version/s: 4.0.0-rc1, 4.1.1
Affects Version/s: None
Component/s: Sharding
Labels:
None

Backwards Compatibility:
Fully Compatible
Operating System:
ALL
Backport Requested:

v4.0
Sprint:
Sharding 2018-06-04
Linked BF Score:
30
Confidence Status:
None
Work Order:
3
CAR Domain/s:
None

Aha! Reference:
None
Tracking Level:
None
Risk Status:
None
Exec Notes:
None
Goal Name(s):
None
Goal Link:
None

Currently, the config server invalidates its cached metadata for a collection only after it receives the response to moveChunk from the source shard in a migration. However, the recipient shard in that migration is free to begin a new migration as soon as it completes _recvChunkCommit, which is called earlier, during the critical section. If a request to move a chunk away from the recipient shard reaches the config server before it invalidates its cache, the config server will send a stale shard version to the recipient. The recipient may then complete the forced refresh at the beginning of the migration without seeing the persisted metadata changes, so it will not know it has received a new chunk but will begin to drive a migration anyway.
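The ordering problem described above can be illustrated with a toy model. This is only a sketch and not server code: the ConfigServer class, its integer versions, and the run helper are hypothetical names introduced here, and "invalidating the cache" is simplified to re-reading the persisted version.

```python
from dataclasses import dataclass


@dataclass
class ConfigServer:
    """Toy model (hypothetical): persisted metadata plus a cache that may lag."""
    persisted_version: int = 1
    cached_version: int = 1

    def commit_migration(self) -> None:
        # The commit bumps the persisted version; the cache is not touched here.
        self.persisted_version += 1

    def invalidate_cache(self) -> None:
        # Simplification: invalidation is modeled as re-reading persisted state.
        self.cached_version = self.persisted_version

    def version_for_new_move_request(self) -> int:
        # The version attached to the next move request comes from the cache.
        return self.cached_version


def run(second_request_arrives_early: bool) -> bool:
    """Return True if the second move request carries a stale version."""
    config = ConfigServer()
    config.commit_migration()  # first migration commits in the critical section
    # At this point the recipient has finished _recvChunkCommit and can be
    # asked to donate a chunk, even though moveChunk has not yet returned.
    if second_request_arrives_early:
        sent = config.version_for_new_move_request()
        config.invalidate_cache()  # moveChunk response arrives afterwards
    else:
        config.invalidate_cache()  # moveChunk response arrives first
        sent = config.version_for_new_move_request()
    return sent < config.persisted_version
```

In this model, run(True) reports a stale version and run(False) does not, which is the window the description refers to.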

This is only a problem when the shard believes it owns only one chunk. When it goes to commit the migration, it will not send a control chunk, so its shard version will not change after the migration commits, because the version of the chunk it does not know it will still own will not be bumped. The shard will then be able to accept reads from a stale mongos looking for the chunk that was just moved, even after refreshing at the end of moveChunk, and will return wrong results.
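A small worked example may make the control chunk reasoning easier to follow. It is a sketch under stated assumptions, not the server's implementation: versions are modeled as (major, minor) pairs, a shard's version is taken to be the highest version among the chunks it owns, and a commit is assumed to give the moved chunk the next major version and, if a control chunk is sent, to bump that chunk as well. The chunk names, shard names, and helper functions are hypothetical.

```python
from typing import Dict, Optional, Tuple

Version = Tuple[int, int]  # (major, minor), compared lexicographically
Chunks = Dict[str, Tuple[str, Version]]  # chunk name -> (owning shard, version)


def shard_version(chunks: Chunks, shard: str) -> Version:
    # Assumption: a shard's version is the highest version of any chunk it owns.
    owned = [version for owner, version in chunks.values() if owner == shard]
    return max(owned, default=(0, 0))


def commit_move(chunks: Chunks, chunk: str, to_shard: str,
                control_chunk: Optional[str]) -> None:
    # Assumption: the moved chunk gets the next major version, and the control
    # chunk (if one is sent) is bumped to the same major version on the donor.
    next_major = max(version for _, version in chunks.values())[0] + 1
    chunks[chunk] = (to_shard, (next_major, 0))
    if control_chunk is not None:
        owner, _ = chunks[control_chunk]
        chunks[control_chunk] = (owner, (next_major, 1))


def donor_version_changes(send_control_chunk: bool) -> bool:
    """Return True if shard B's version changes when it donates c1."""
    chunks: Chunks = {"c1": ("B", (1, 0)), "c2": ("A", (1, 1)), "c3": ("A", (1, 2))}
    # First migration: A donates c2 to B, with c3 as A's control chunk.
    commit_move(chunks, "c2", "B", control_chunk="c3")
    before = shard_version(chunks, "B")
    # Second migration: B donates c1. If B does not know it owns c2, it
    # believes c1 is its only chunk and sends no control chunk.
    commit_move(chunks, "c1", "C",
                control_chunk="c2" if send_control_chunk else None)
    return shard_version(chunks, "B") != before
```

Under these assumptions, donor_version_changes(False) shows B's shard version staying the same after it donates c1, so a router holding that older version would not be rejected, while donor_version_changes(True) shows the version moving forward.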

I think a partial solution would be to have the config server invalidate its cached metadata during _configsvrCommitChunkMigration, which would decrease the likelihood of this happening. A full solution may require preventing the recipient shard in a migration from driving a new one until the migration it was part of has completed.

is related to: SERVER-35209 Remove unused controlChunk parameter in _configsvrCommitChunkMigration (Closed)

Assignee: Randolph Tan
Reporter: Jack Mulrow
Participants: Githook User, Gregory McKeon, Jack Mulrow, Kaloian Manassiev, Randolph Tan
Votes: 0
Watchers: 6

Created: May 08 2018 05:13:45 PM UTC
Updated: Oct 29 2023 10:31:59 PM UTC
Resolved: May 24 2018 06:20:01 PM UTC
