[SERVER-42783] Migrations don't wait for majority replication of cloned documents if there are no transfer mods Created: 12/Aug/19  Updated: 29/Oct/23  Resolved: 22/Aug/19

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: 3.4.15, 3.6.4, 4.0.0, 4.2.0
Fix Version/s: 3.6.15, 4.3.1

Type: Bug Priority: Major - P3
Reporter: Jack Mulrow Assignee: Janna Golden
Resolution: Fixed Votes: 0
Labels: sharding-wfbf-day
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Depends
Problem/Incident
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v4.2, v4.0, v3.6, v3.4
Steps To Reproduce:

Run the following test that shows majority committed documents may not be in the recipient shard's majority committed snapshot after a migration:

(function() {
"use strict";
 
// Set up a sharded cluster with two shards, two chunks, and one document in one of the chunks.
const st = new ShardingTest({shards: 2, rs: {nodes: 2}, config: 1});
const testDB = st.s.getDB("test");
 
assert.commandWorked(testDB.foo.insert({_id: 1}, {writeConcern: {w: "majority"}}));
 
st.ensurePrimaryShard("test", st.shard0.shardName);
assert.commandWorked(st.s.adminCommand({enableSharding: "test"}));
assert.commandWorked(st.s.adminCommand({shardCollection: "test.foo", key: {_id: 1}}));
assert.commandWorked(st.s.adminCommand({split: "test.foo", middle: {_id: 0}}));
 
// The document is in the majority committed snapshot.
assert.eq(1, testDB.foo.find().readConcern("majority").itcount());
 
// Advance a migration to the beginning of the cloning phase.
assert.commandWorked(st.rs1.getPrimary().adminCommand(
    {configureFailPoint: 'migrateThreadHangAtStep2', mode: "alwaysOn"}));
TestData.toShard = st.shard1.shardName;
const awaitMigration = startParallelShell(() => {
    jsTestLog("moving to " + TestData.toShard);
    assert.commandWorked(
        db.adminCommand({moveChunk: "test.foo", find: {_id: 1}, to: TestData.toShard}));
}, st.s.port);
 
// Sleep to let the migration reach the failpoint and allow any writes to become majority committed
// before pausing replication.
sleep(3000);
st.rs1.awaitLastOpCommitted();
 
// Disable replication on the recipient shard's secondary node, so the recipient shard's majority
// commit point cannot advance.
const destinationSec = st.rs1.getSecondary();
assert.commandWorked(
    destinationSec.adminCommand({configureFailPoint: "rsSyncApplyStop", mode: "alwaysOn"}),
    "failed to enable fail point on secondary");
 
// Allow the migration to begin cloning.
assert.commandWorked(st.rs1.getPrimary().adminCommand(
    {configureFailPoint: 'migrateThreadHangAtStep2', mode: "off"}));
 
// The migration should finish cloning and commit without being able to advance the majority commit
// point.
awaitMigration();
 
// The document is not in the recipient's majority committed snapshot, so the cloned document cannot
// be found by a majority read and this assertion fails.
assert.eq(
    1, testDB.foo.find().readConcern("majority").itcount(), "cloned doc not in majority snapshot!");
 
assert.commandWorked(
    destinationSec.adminCommand({configureFailPoint: "rsSyncApplyStop", mode: "off"}),
    "failed to disable fail point on secondary");
 
st.stop();
})();

Sprint: Sharding 2019-08-26
Participants:
Linked BF Score: 23

 Description   

Before a chunk migration can commit, the shard receiving the chunk is supposed to wait for every document in the chunk to be majority committed to ensure no data can be lost if there's a failover. Currently, the recipient shard will wait for the maximum of the last opTime on the client of the thread driving the migration on the recipient at the end of the cloning phase and the opTime of the latest transfer mod write to be majority committed.

SERVER-32885 changed migration recipients to insert documents received in the initial cloning phase on a separate thread, so the driving thread's last opTime at the end of the cloning phase will not be the opTime of the insert of the latest cloned document. If there are no transfer mod writes (the opTimes of which are correctly tracked and are guaranteed to be > the opTime of the inserts of any cloned documents), then the migration can commit without waiting for all cloned documents to be majority replicated.

This has at least the following implications for migrations that involve no transfer mods:

  • Majority reads without causal consistency may fail to read documents inserted on the same session with majority write concern even without failovers (like in the attached repro).
  • A recipient shard stepdown after the migration commits may lead to a rollback of previously majority committed documents.

SERVER-32885 was backported to 3.4, so this may be a problem from that release on, although I haven't manually verified this.



 Comments   
Comment by Githook User [ 08/Oct/19 ]

Author:

{'name': 'Janna Golden', 'username': 'jannaerin', 'email': 'janna.golden@mongodb.com'}

Message: SERVER-42783 Migrations should wait for majority replication of cloned docs when there are no xfer mods

(cherry picked from commit 2fb73bcd2515cd8d566fecc5b23ee9f6970b1716)
Branch: v3.6
https://github.com/mongodb/mongo/commit/8ac53a6a615f14c78cf007ae6a58688849b63f56

Comment by Githook User [ 07/Oct/19 ]

Author:

{'username': 'jannaerin', 'email': 'janna.golden@mongodb.com', 'name': 'Janna Golden'}

Message: SERVER-42783 Migrations should wait for majority replication of cloned docs when there are no xfer mods

(cherry picked from commit 2fb73bcd2515cd8d566fecc5b23ee9f6970b1716)
Branch: v4.0
https://github.com/mongodb/mongo/commit/675428d8399a7bb525ffcedf67e784603ae02aa2

Comment by Githook User [ 07/Oct/19 ]

Author:

{'username': 'jannaerin', 'email': 'janna.golden@mongodb.com', 'name': 'Janna Golden'}

Message: SERVER-42783 Migrations should wait for majority replication of cloned docs when there are no xfer mods

(cherry picked from commit 2fb73bcd2515cd8d566fecc5b23ee9f6970b1716)
Branch: v4.2
https://github.com/mongodb/mongo/commit/19007f085afb46b0c56329d2230667f789b3f13c

Comment by Githook User [ 22/Aug/19 ]

Author:

{'username': 'jannaerin', 'email': 'golden.janna@gmail.com', 'name': 'jannaerin'}

Message: SERVER-42783 Migrations should wait for majority replication of cloned docs when there are no xfer mods
Branch: master
https://github.com/mongodb/mongo/commit/2fb73bcd2515cd8d566fecc5b23ee9f6970b1716

Generated at Thu Feb 08 05:01:25 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.