Type: Bug
Resolution: Done
Priority: Major - P3
Fix Version/s: 2.5.1
Affects Version/s: 2.4.1
Component/s: Sharding
Labels: None
Operating System: ALL

Note: this can happen only if more than one migration is in progress in the cluster at the same time (for example, when running moveChunk manually).

Setup:
3 shards, 2 sharded collections

Description of race:
1. Move 1 chunk from shard1 to shard0.
2. The migrate thread performing recvChunk on shard0 fails for some reason and terminates early, setting the incoming migration's active state to false.
3. Move 1 chunk (ideally empty, so it will be fast) from shard2 to shard0. This in effect starts a new migration and changes the state to 'done'.
4. shard1 calls _recvChunkStatus, completely misses the transition to the 'fail' state, and sees the 'done' state left by the migration in step 3. It then keeps looping until some other slow migration begins and changes the state to 'steady'.
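
The four steps above can be replayed with a small toy model. This is only a sketch, not the server code: the class RecipientShard, its methods, the 'ready' and 'clone' state names, and the bounded poll loop are all hypothetical and chosen for illustration. Only the 'fail', 'done' and 'steady' states and the idea of polling a status call come from the description. The model assumes the recipient keeps a single status slot that is not tied to any particular donor, and that the donor would stop polling on either 'steady' or 'fail'.

```python
class RecipientShard:
    """Toy stand-in for shard0: one shared slot describes the incoming migration."""

    def __init__(self):
        self.active = False
        self.state = "ready"   # hypothetical initial state name
        self.source = None     # donor that owns the current state

    def start_migration(self, source):
        # A new migration may start only when no other one is active.
        assert not self.active, "another incoming migration is active"
        self.active = True
        self.state = "clone"   # hypothetical in-progress state name
        self.source = source

    def fail_migration(self):
        self.state = "fail"
        self.active = False

    def finish_migration(self):
        self.state = "done"
        self.active = False

    def recv_chunk_status(self):
        # The reply carries only the state, not which donor it belongs to.
        return self.state


def donor_poll(recipient, max_polls=5):
    """Poll like the donor in step 4; capped at max_polls so the demo ends."""
    seen = []
    for _ in range(max_polls):
        state = recipient.recv_chunk_status()
        seen.append(state)
        if state in ("steady", "fail"):  # assumed exit conditions
            break
    return seen


def simulate_race():
    shard0 = RecipientShard()
    shard0.start_migration("shard1")   # step 1
    shard0.fail_migration()            # step 2: fails early, active -> False
    shard0.start_migration("shard2")   # step 3: new migration overwrites state
    shard0.finish_migration()          #         empty chunk, finishes at once
    return donor_poll(shard0)          # step 4: shard1 polls too late


if __name__ == "__main__":
    print(simulate_race())
```

In this model the donor never observes 'fail', because the only slot it can read has already been overwritten by the shard2 migration, so every poll returns 'done' until the cap is reached.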

Attaching patch that demonstrates this race.

Attachment: patch, 3 kB, Apr 11 2013 03:43:58 PM UTC, Randolph Tan

Assignee: Randolph Tan
Reporter: Randolph Tan
Participants: auto, Randolph Tan
Votes: 0
Watchers: 4

Created: Apr 11 2013 03:42:54 PM UTC
Updated: Jul 11 2016 05:57:22 PM UTC
Resolved: Jul 08 2013 09:32:55 PM UTC
