Type: Bug
Resolution: Fixed
Priority: Major - P3
Fix Version/s: 6.3.0-rc0
Affects Version/s: 4.4.0, 5.0.0, 6.0.0
Component/s: Sharding
Labels: None
Backwards Compatibility: Fully Compatible
Operating System: ALL
Sprint: Execution Team 2022-11-14, Execution Team 2022-12-12, Execution Team 2022-11-28, Execution Team 2022-12-26, Execution Team 2023-01-09
Linked BF Score: 35
Confidence Status: None
Work Order: 3
CAR Domain/s: None
Aha! Reference: None
Tracking Level: None
Risk Status: None
Exec Notes: None
Goal Name(s): None
Goal Link: None

Consider the following interleaving:

1. [th1] Starts a multi write operation on an unsharded collection and passes the dbVersion check successfully.
2. [th1] The multi write yields.
3. [th2] A movePrimary starts and sets the 'move primary in progress' flag on the DatabaseShardingState.
4. [th2] MovePrimary commits.
5. [th2] MovePrimary unsets the 'move primary in progress' flag.
6. [th2] MovePrimary hangs before dropping the old collection from the former primary.
7. [th1] The multi write resumes from the yield. The writes never recheck the dbVersion in the op_observers (as we do for the 'shardVersion'), so they do not fail. Instead they are lost, because they are applied on the old db-primary shard after the ownership change has already committed.
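
The interleaving above can be sketched with a toy model. This is only an illustration, not server code: `ToyOldPrimary`, `multi_write`, and `run_interleaving` are hypothetical names, a Python generator stands in for the yielding write, and the model assumes that the movePrimary commit changes the dbVersion.

```python
from dataclasses import dataclass, field


class StaleDbVersion(Exception):
    """The request's dbVersion does not match the shard's (toy model)."""


class MovePrimaryInProgress(Exception):
    """A write resumed while the movePrimary flag was still set (toy model)."""


@dataclass
class ToyOldPrimary:
    """Hypothetical stand-in for the former db-primary shard."""
    db_version: int = 1
    move_primary_in_progress: bool = False
    docs: list = field(default_factory=list)


def multi_write(shard, received_db_version, batches):
    """Check the dbVersion once up front, then yield before each batch."""
    if received_db_version != shard.db_version:
        raise StaleDbVersion()
    for batch in batches:
        yield  # step 2: the multi write yields
        # On resume only the in-progress flag is consulted;
        # the dbVersion is never rechecked.
        if shard.move_primary_in_progress:
            raise MovePrimaryInProgress()
        shard.docs.extend(batch)


def run_interleaving():
    shard = ToyOldPrimary()
    th1 = multi_write(shard, received_db_version=1, batches=[["a"], ["b"]])
    next(th1)                               # steps 1-2: check passes, write yields
    shard.move_primary_in_progress = True   # step 3: movePrimary starts
    shard.db_version = 2                    # step 4: commit (assumed version change)
    shard.move_primary_in_progress = False  # step 5: flag is unset
    # step 6: movePrimary hangs before dropping the old collection
    for _ in th1:                           # step 7: write resumes, nothing fails
        pass
    applied = list(shard.docs)
    shard.docs.clear()                      # the old collection is eventually dropped
    return applied, list(shard.docs)
```

In this sketch `run_interleaving()` reports both batches as applied without raising, and nothing survives once the old collection is dropped, which is the lost-write outcome the steps describe.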

Note that in the op_observer we fail writes if a movePrimary operation is in progress. This is what typically prevents lost writes. However, this check only triggers if the write resumes from yield while the movePrimary is still in progress, which does not happen in the interleaving above. I don't think this interleaving is very likely, since it requires the write to stay yielded for a long time (from the moment the cloning starts until after the commit).
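
A minimal sketch of why the in-progress check is not sufficient on its own, under the same toy assumptions (hypothetical names, and a commit that changes the dbVersion). The `recheck_db_version` option only illustrates the kind of recheck the description contrasts with the 'shardVersion' handling; it is not meant to represent how the issue was actually fixed.

```python
from dataclasses import dataclass


@dataclass
class ToyDbState:
    """Hypothetical snapshot of the old primary's state when a write resumes."""
    db_version: int
    move_primary_in_progress: bool


def on_write_resume(state, received_db_version, recheck_db_version=False):
    """Return the outcome of a write resuming from yield in this toy model."""
    if state.move_primary_in_progress:
        return "fail: movePrimary in progress"
    # Hypothetical extra check, analogous to rechecking the shardVersion.
    if recheck_db_version and received_db_version != state.db_version:
        return "fail: stale dbVersion"
    return "write applied"


# The write resumes while the movePrimary is still running: the flag catches it.
during_move = ToyDbState(db_version=1, move_primary_in_progress=True)

# The write resumes after commit and after the flag was unset: the flag misses it.
after_commit = ToyDbState(db_version=2, move_primary_in_progress=False)
```

With these two states, `on_write_resume(during_move, 1)` fails on the flag, `on_write_resume(after_commit, 1)` applies the write on the old primary, and only the variant with `recheck_db_version=True` rejects it.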

Attachment: 0001-Repro-SERVER-70437.patch (5 kB, Oct 11 2022 09:52:01 AM UTC)

Issue Links: causes SERVER-80463 MigrationChunkClonerSourceOpObserver::onInserts() written to look like it skips checking some documents for whether their chunk has moved (Closed)

Assignee: Daniel Gomez Ferro
Reporter: Jordi Serra Torrens
Participants: Daniel Gomez Ferro, Githook User, Jordi Serra Torrens
Votes: 0
Watchers: 3

Created: Oct 11 2022 09:50:29 AM UTC
Updated: Oct 29 2023 09:32:04 PM UTC
Resolved: Jan 03 2023 02:09:02 PM UTC
Confidence Status Last Update: 07/Nov/22 9:20 AM
