[SERVER-44343] Make 'reIndex' a standalone-only command Created: 31/Oct/19  Updated: 29/Oct/23  Resolved: 16/Apr/20

Status: Closed
Project: Core Server
Component/s: Replication, Storage
Affects Version/s: 4.2.1, 4.3.1
Fix Version/s: 4.7.0

Type: Bug Priority: Major - P3
Reporter: Gregory Wlodarek Assignee: Samyukta Lanka
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Documented
is documented by DOCS-13581 Investigate changes in SERVER-44343: ... Closed
Problem/Incident
Related
related to SERVER-44026 Remove global X lock for reIndex Closed
is related to SERVER-47427 Check for standalone-only commands wh... Closed
Backwards Compatibility: Minor Change
Operating System: ALL
Steps To Reproduce:

Run the following script using --suites=replica_sets.

(function() {
    "use strict";
    load("jstests/core/txns/libs/prepare_helpers.js");
    
    const dbName = "test";
    const collName = "re_index_secondary_with_prepared_transaction";
 
    const rst = new ReplSetTest({
        nodes: [
            {},
            {rsConfig: {priority: 0}},
        ],
    });
    rst.startSet();
    rst.initiate();
    
    const primary = rst.getPrimary();
    const secondary = rst.getSecondary();
    const testDB = primary.getDB(dbName);
    
    assert.commandWorked(testDB.runCommand({create: collName, writeConcern: {w: "majority"}}));
 
    const priSession = primary.startSession({causalConsistency: false});
    const priSessionDB = priSession.getDatabase(dbName);
    const priSessionColl = priSessionDB.getCollection(collName);
 
    priSession.startTransaction();
    assert.commandWorked(priSessionColl.insert({_id: 1}));
    const prepareTimestamp = PrepareHelpers.prepareTransaction(priSession);
 
    rst.awaitReplication();
 
    // Add 1 to the increment so that the commitTimestamp is "after" the prepareTimestamp.
    const commitTimestamp = Timestamp(prepareTimestamp.getTime(), prepareTimestamp.getInc() + 1);
 
    // Run 'reIndex' on the secondary.
    assert.commandWorked(secondary.getDB(dbName).runCommand({reIndex: collName}));
 
    assert.commandWorked(PrepareHelpers.commitTransaction(priSession, commitTimestamp));
    rst.awaitReplication();
 
    assert.commandWorked(secondary.getDB(dbName).runCommand({validate: collName}));
    
    rst.stopSet();
    })();

Sprint: Repl 2019-12-02, Repl 2019-12-16, Repl 2019-12-30, Repl 2020-01-13, Repl 2020-03-09, Repl 2020-03-23, Repl 2020-04-06, Repl 2020-04-20
Participants:
Linked BF Score: 14

 Description   

Trying to run the 'reIndex' command on a secondary while having unfinished prepared transactions will hit an invariant. After dropping the index that will be re-indexed, we hit an invariant when doing a collection scan to insert all the documents in the collection for that index.

Invariant failure lock.mode != MODE_S && lock.mode != MODE_X {6917529027641081857: Global, 1} in X src/mongo/db/storage/wiredtiger/wiredtiger_prepare_conflict.h 111

ldeng recommended to try set the 'canIgnorePrepareConflicts' flag to true for the 'reIndex' command but that ended up hitting another invariant while trying to drop all the indexes.

WiredTiger error (95) [1572545292:296845][8488:0x7f381342b700], file:_mdb_catalog.wt, WT_CURSOR.insert: __wt_txn_modify, 463: Transactions with ignore_prepare=true cannot perform updates: Operation not supported Raw: [1572545292:296845][8488:0x7f381342b700], file:_mdb_catalog.wt, WT_CURSOR.insert: __wt_txn_modify, 463: Transactions with ignore_prepare=true cannot perform updates: Operation not supported
 
Invariant failure: ret resulted in status UnknownError: 95: Operation not supported at src/mongo/db/storage/wiredtiger/wiredtiger_record_store.cpp 1442

We should consider disallowing the 'reIndex' command from running on secondaries and removing it in a future release.



 Comments   
Comment by Githook User [ 16/Apr/20 ]

Author:

{'name': 'Samy Lanka', 'email': 'samy.lanka@mongodb.com', 'username': 'lankas'}

Message: SERVER-44343 Make 'reIndex' a standalone-only command
Branch: master
https://github.com/mongodb/mongo/commit/f7e7e994cb14bf349ea1ce9a86ec6012b7330b66

Comment by Githook User [ 13/Apr/20 ]

Author:

{'name': 'Samy Lanka', 'email': 'samy.lanka@mongodb.com', 'username': 'lankas'}

Message: Revert "SERVER-44343 Make 'reIndex' a standalone-only command"

This reverts commit 2ad12521309aeccc6f712f4ccb532f113255d186.
Branch: master
https://github.com/mongodb/mongo/commit/72cf1283ee2c249dcab8c185f64ee6108c00b02c

Comment by Githook User [ 09/Apr/20 ]

Author:

{'name': 'Samy Lanka', 'email': 'samy.lanka@mongodb.com', 'username': 'lankas'}

Message: SERVER-44343 Make 'reIndex' a standalone-only command
Branch: master
https://github.com/mongodb/mongo/commit/2ad12521309aeccc6f712f4ccb532f113255d186

Comment by Eric Milkie [ 18/Jan/20 ]

The reason why I asked is because I am unsure of the current use case for reIndex. Why would someone want to rebuild an index only on their primary node – I would think that the primary node would be the one node state that a user would least likely want to run the reIndex command.

Comment by Judah Schvimer [ 16/Jan/20 ]

This is not suggesting adding functionality to 'reIndex', just removing the ability to run it on secondaries. My understanding is 'reIndex' is currently supported on both primaries and secondaries, so after this it would only be supported on primaries. If a user wants to reindex an index on a secondary, they will need to restart the node as a standalone. Alternatively we could make it a standalone-only command, which evin.roesle would have to weigh in on. That would be even easier than making it primary-only.

Comment by Eric Milkie [ 16/Jan/20 ]

Would this new primary-only command be replicated?

Comment by Judah Schvimer [ 10/Dec/19 ]

It wouldn't be, I didn't understand how kIgnoreConflictsAllowWrites worked and why it was safe to use in other index building scenarios. louis.williams explained to me how it works and why it is unsafe here but safe in other scenarios. This was suggested in a repl team meeting and I wanted to try it, to gain understanding.

Together we thought of another solution, other than banning reIndex on secondaries. reIndex could uassert and fail whenever it hit a prepare conflict. This would mean that the user would lose their indexes since reIndex is not transactional, but that is a known problem with reIndex. This would exacerbate it though:

diff --git a/src/mongo/db/storage/wiredtiger/wiredtiger_prepare_conflict.h b/src/mongo/db/storage/wiredtiger/wiredtiger_prepare_conflict.h
index 5212a0b19a..fbc94dc517 100644
--- a/src/mongo/db/storage/wiredtiger/wiredtiger_prepare_conflict.h
+++ b/src/mongo/db/storage/wiredtiger/wiredtiger_prepare_conflict.h
@@ -113,9 +113,17 @@ int wiredTigerPrepareConflictRetry(OperationContext* opCtx, F&& f) {
         // at commit time, these lock types are safe. Therefore, invariant here that we do not get a
         // prepare conflict while holding a global, database, or collection MODE_S lock (or MODE_X
         // lock for completeness).
-        if (type == RESOURCE_GLOBAL || type == RESOURCE_DATABASE || type == RESOURCE_COLLECTION)
-            invariant(lock.mode != MODE_S && lock.mode != MODE_X,
-                      str::stream() << lock.resourceId.toString() << " in " << modeName(lock.mode));
+        if (type == RESOURCE_GLOBAL || type == RESOURCE_DATABASE || type == RESOURCE_COLLECTION) {
+            if (lock.mode == MODE_S || lock.mode == MODE_X) {
+                if (CurOp::get(opCtx)->getCommand()->getName() == "reIndex") {
+                    uasserted(ErrorCodes::PreparedTransactionInProgress,
+                            "Got prepare conflict while holding lock on reIndex");
+                }
+                invariant(false,
+                          str::stream()
+                              << lock.resourceId.toString() << " in " << modeName(lock.mode));
+            }
+        }
     }
 
     if (MONGO_unlikely(WTSkipPrepareConflictRetries.shouldFail())) {

I think banning reIndex on secondaries and encouraging users to restart secondaries as standalones and call reIndex on the standalone would be preferable. evin.roesle, do you know if banning reIndex on secondaries would be an acceptable solution to this invariant?

Comment by Eric Milkie [ 10/Dec/19 ]

Why would ignoring prepared documents be safe for index builds on secondaries?

Comment by Judah Schvimer [ 10/Dec/19 ]

I added the following diff in the reIndex code, and while the test no longer invariants, it gets a dbhash mismatch saying the _id:1 document is missing on the secondary.

diff --git a/src/mongo/db/commands/drop_indexes.cpp b/src/mongo/db/commands/drop_indexes.cpp
index 033023ec71..515cf16b0a 100644
--- a/src/mongo/db/commands/drop_indexes.cpp
+++ b/src/mongo/db/commands/drop_indexes.cpp
@@ -223,6 +223,12 @@ public:
             quickExit(EXIT_ABRUPT);
         }
 
+        opCtx->recoveryUnit()->abandonSnapshot();
+        opCtx->recoveryUnit()->setPrepareConflictBehavior(
+            PrepareConflictBehavior::kIgnoreConflictsAllowWrites);
+
         // The following function performs its own WriteConflict handling, so don't wrap it in a
         // writeConflictRetry loop.
         uassertStatusOK(indexer->insertAllDocumentsInCollection(opCtx, collection));

Generated at Thu Feb 08 05:05:43 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.