[SERVER-44953] Secondaries should restart index builds when a commitIndexBuild oplog entry is processed but no index build is active Created: 04/Dec/19  Updated: 29/Oct/23  Resolved: 22/Jan/20

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 4.3.3

Type: Bug Priority: Major - P3
Reporter: Louis Williams Assignee: Louis Williams
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: File initsync_fuzzer-f628-1580268652754-5.js    
Issue Links:
Related
is related to SERVER-45347 2-phase index build on empty collecti... Closed
is related to SERVER-45382 indexbg_restart_secondary.js can race... Closed
is related to SERVER-45905 abortIndexBuild oplog entry should ig... Closed
is related to SERVER-45921 Index builder invariants on this chec... Closed
is related to SERVER-45933 2 phase index build running with maxT... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Steps To Reproduce:

Using this failpoint:

diff --git a/src/mongo/db/index_builds_coordinator.cpp b/src/mongo/db/index_builds_coordinator.cpp
index 50d5b1a721..f2ee3c6794 100644
--- a/src/mongo/db/index_builds_coordinator.cpp
+++ b/src/mongo/db/index_builds_coordinator.cpp
@@ -67,6 +67,7 @@ using namespace indexbuildentryhelpers;
 MONGO_FAIL_POINT_DEFINE(hangAfterIndexBuildFirstDrain);
 MONGO_FAIL_POINT_DEFINE(hangAfterIndexBuildSecondDrain);
 MONGO_FAIL_POINT_DEFINE(hangAfterIndexBuildDumpsInsertsFromBulk);
+MONGO_FAIL_POINT_DEFINE(hangBeforeIndexBuildCleanUp);
 
 namespace {
 
@@ -1344,6 +1345,8 @@ void IndexBuildsCoordinator::_runIndexBuildInner(OperationContext* opCtx,
                             << replState->buildUUID);
     NamespaceString nss = collection->ns();
 
+    hangBeforeIndexBuildCleanUp.pauseWhileSet();
+
     if (status.isOK()) {
         _indexBuildsManager.tearDownIndexBuild(
             opCtx, collection, replState->buildUUID, MultiIndexBlock::kNoopOnCleanUpFn);

This test will fail:

(function() {
"use strict";
 
load('jstests/noPassthrough/libs/index_build.js');
 
const rst = new ReplSetTest({
    nodes: [
        {},
        {},
        {arbiter: true},
    ]
});
const nodes = rst.startSet();
rst.initiate();
 
const primary = rst.getPrimary();
const testDB = primary.getDB('test');
const coll = testDB.getCollection('test');
 
assert.commandWorked(coll.insert({a: 1}));
 
let res = assert.commandWorked(primary.adminCommand(
    {configureFailPoint: 'hangAfterInitializingIndexBuild', mode: 'alwaysOn'}));
const hangAfterInitFailpointTimesEntered = res.count;
 
res = assert.commandWorked(
    primary.adminCommand({configureFailPoint: 'hangBeforeIndexBuildCleanUp', mode: 'alwaysOn'}));
const hangBeforeCleanUpFailpointTimesEntered = res.count;
 
const createIdx = IndexBuildTest.startIndexBuild(primary, coll.getFullName(), {a: 1});
 
try {
    assert.commandWorked(primary.adminCommand({
        waitForFailPoint: "hangAfterInitializingIndexBuild",
        timesEntered: hangAfterInitFailpointTimesEntered + 1,
        maxTimeMS: kDefaultWaitForFailPointTimeout
    }));
 
 
    // When the index build starts, find its op id. This will be the op id of the client
    // connection, not the thread pool task managed by IndexBuildsCoordinatorMongod.
    const filter = {"desc": {$regex: /conn.*/}};
    const opId = IndexBuildTest.waitForIndexBuildToStart(testDB, coll.getName(), 'a_1', filter);
 
    // Kill the index build.
    assert.commandWorked(testDB.killOp(opId));
 
    // Let the index build continue running and abort.
    assert.commandWorked(
        primary.adminCommand({configureFailPoint: 'hangAfterInitializingIndexBuild', mode: 'off'}));
 
    // Wait for the index build to start cleaning up.
    assert.commandWorked(primary.adminCommand({
        waitForFailPoint: "hangBeforeIndexBuildCleanUp",
        timesEntered: hangBeforeCleanUpFailpointTimesEntered + 1,
        maxTimeMS: kDefaultWaitForFailPointTimeout
    }));
 
    // Step down the primary, preventing the index build from generating an abort oplog entry.
    assert.commandWorked(primary.adminCommand({replSetStepDown: 30, force: true}));
} finally {
    assert.commandWorked(
        primary.adminCommand({configureFailPoint: 'hangAfterInitializingIndexBuild', mode: 'off'}));
    // Let the index build finish cleaning up.
    assert.commandWorked(
        primary.adminCommand({configureFailPoint: 'hangBeforeIndexBuildCleanUp', mode: 'off'}));
}
 
const exitCode = createIdx({checkExitSuccess: false});
assert.neq(0, exitCode, 'expected shell to exit abnormally due to index build being terminated');
 
// Wait for the index build to stop.
IndexBuildTest.waitForIndexBuildToStop(testDB);
 
// With two phase index builds, a stepdown will not abort the index build, which should complete
// after the node becomes primary again.
rst.awaitReplication();
 
// >>> This line FAILS <<< 
IndexBuildTest.assertIndexes(coll, 2, ['_id_', 'a_1'], [], {includeBuildUUIDs: true});
 
const secondaryColl = rst.getSecondary().getCollection(coll.getFullName());
IndexBuildTest.assertIndexes(secondaryColl, 2, ['_id_', 'a_1'], [], {includeBuildUUIDs: true});
 
rst.stopSet();
})();

Sprint: Execution Team 2020-01-13, Execution Team 2020-01-27
Participants:
Linked BF Score: 17

 Description   

The sequence is as follows:
On the primary node:

See this patch build.



 Comments   
Comment by William Schultz (Inactive) [ 31/Jan/20 ]

louis.williams It looks like this ticket added NamespaceNotFound as an acceptable error to ignore during oplog application of "commitIndexBuild" and "startIndexBuild" entries. It seems we also need to do so for "abortIndexBuild". I ran into a failure in a patch build where we crash when applying "abortIndexBuild" in initial sync due to a NamespaceNotFound error thrown here. Here is a repro of the case (initsync_fuzzer-f628-1580268652754-5.js), which you can run locally with

python buildscripts/resmoke.py --suite=initial_sync_fuzzer initsync_fuzzer-f628-1580268652754-5.js

Comment by Githook User [ 18/Jan/20 ]

Author:

{'username': 'louiswilliams', 'name': 'Louis Williams', 'email': 'louis.williams@mongodb.com'}

Message: SERVER-44953 Secondaries should restart index builds when a commitIndexBuild oplog entry is processed but no index build is active

Additionally, only abort an index build after a user interrupt if we are still primary. During
two-phase index builds, rely on the new primary to finish the index build.
Branch: master
https://github.com/mongodb/mongo/commit/bf86a1d448dce7bd9383f357d5834a1e07a0e0b2

Comment by Louis Williams [ 09/Jan/20 ]

To fix this, when we receive user interruptions, only abort index builds if we are still primary after reacquiring locks. Otherwise, let the new primary finish the index build.

Comment by Louis Williams [ 07/Jan/20 ]

To address my previous concern:
> If a majority of nodes lose knowledge of an index build, the primary may hang for a majority of nodes to commit.

I do not consider this a serious issue because this race condition is inherently hard to hit. I am relatively confident that this is not an area where we want to focus on performance. This also goes for SERVER-45382.

Comment by Louis Williams [ 06/Dec/19 ]

I think the best way to solve this problem is to introduce a safe fallthrough on the secondary: if a "commitIndexBuild" oplog entry is recieved and no index build is in-progress, start the index build and block. This should be a reasonable fallback in any future case where a stepdown causes an index build to be cleaned up incorrectly.

Update: With the introduction of a commitQuorum, we should consider the case where this scenario happens to a majority of secondaries. If a majority of nodes lose knowledge of an index build, the primary may hang for a majority of nodes to commit.

Another option would be to modify the consensus protocol to allow a failing secondary to signal to the new primary to abort the index build.

Generated at Thu Feb 08 05:07:27 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.