[SERVER-31047] Rollback doesn’t properly cancel out index drop/create operations when there are multiple indexes on the same collection Created: 11/Sep/17  Updated: 30/Oct/23  Resolved: 24/Oct/17

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: 3.5.12
Fix Version/s: 3.6.0-rc2

Type: Bug Priority: Major - P3
Reporter: Robert Guo (Inactive) Assignee: William Schultz (Inactive)
Resolution: Fixed Votes: 0
Labels: rbfz
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
Backwards Compatibility: Fully Compatible
Operating System: ALL
Steps To Reproduce:

load('jstests/replsets/libs/rollback_test.js');
 
// Set up the cluster.
const rollbackTest = new RollbackTest('repro');
let testDb = rollbackTest.getPrimary().getDB('test');
 
// Ensure collections are created.
testDb.coll.insert({x: 1});
 
// Force subsequent operations to be rolled back.
rollbackTest.transitionToRollbackOperations();
testDb = rollbackTest.getPrimary().getDB('test');
 
// These are the operations that trigger the invariant failure.
testDb.runCommand({createIndexes: 'coll', indexes: [{key: {x: 1}, name: 'x_1'}]});
testDb.runCommand({createIndexes: 'coll', indexes: [{key: {y: 1}, name: 'y_1'}]});
testDb.runCommand({dropIndexes: 'coll', index: 'x_1'});
 
rollbackTest.transitionToSyncSourceOperations();
 
// Allow the rollback to happen.
rollbackTest.transitionToSteadyStateOperations({waitForRollback: true});
rollbackTest.stop();

Sprint: Repl 2017-11-13

Description

Rollback’s FixUpInfo::removeRedundantIndexCommands method behaves incorrectly when the operation sequence being rolled back contains multiple index creations on the same collection. Currently, it has the following invariant:

invariant((*indexes).second.count(indexName) == 1)

which asserts that indexesToCreate contains exactly one entry named indexName for the current collection UUID. This may not hold, however, when we roll back a createIndexes operation for an index that was never dropped, while indexesToCreate already has an entry for the same collection because a different index was dropped (see the Steps To Reproduce script). The invariant must be removed, since it is incorrect and can fail in legitimate rollback scenarios.

The presence of this erroneous invariant masks an additional issue. removeRedundantIndexCommands doesn’t check whether the given index name actually exists in indexesToCreate for the given UUID, only whether there is an entry with the same UUID. If the index does not exist, we try to erase it from indexesToCreate, which does nothing, but we still return true, which causes us to skip adding the index to indexesToDrop. We need to change this behavior so that we check for the presence of the specific index, not just the collection's UUID, and only return true if we actually found and removed the right index.
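
The corrected check described above could look roughly like the following. This is a minimal sketch, not the server code: FixUpInfoSketch, the string-based CollectionId and IndexSpec types, and the rollbackCreateIndex caller are all hypothetical stand-ins introduced for illustration. The actual change is in the commit linked in the comments below.

```cpp
#include <map>
#include <set>
#include <string>

// Hypothetical stand-ins: the real code keys on a collection UUID type and
// stores BSON index specs. Plain strings are used here only for illustration.
using CollectionId = std::string;
using IndexSpec = std::string;

struct FixUpInfoSketch {
    // Indexes that rollback must re-create, per collection (from rolled-back drops).
    std::map<CollectionId, std::map<std::string, IndexSpec>> indexesToCreate;
    // Indexes that rollback must drop, per collection (from rolled-back creates).
    std::map<CollectionId, std::set<std::string>> indexesToDrop;

    // Returns true only if this specific index was pending creation and has
    // been removed, i.e. the create and the drop cancel each other out.
    bool removeRedundantIndexCommands(const CollectionId& uuid,
                                      const std::string& indexName) {
        auto coll = indexesToCreate.find(uuid);
        if (coll == indexesToCreate.end()) {
            return false;  // Nothing pending for this collection at all.
        }
        auto& pending = coll->second;
        if (pending.erase(indexName) == 0) {
            return false;  // Collection has pending creates, but not this index.
        }
        if (pending.empty()) {
            indexesToCreate.erase(coll);  // Drop the now-empty collection entry.
        }
        return true;
    }

    // Hypothetical caller: processes the rollback of one index creation.
    void rollbackCreateIndex(const CollectionId& uuid, const std::string& indexName) {
        if (!removeRedundantIndexCommands(uuid, indexName)) {
            indexesToDrop[uuid].insert(indexName);
        }
    }
};
```

Applied to the Steps To Reproduce sequence, and assuming rollback visits the operations newest-first, the drop of x_1 first puts x_1 into indexesToCreate. Rolling back the creation of y_1 then finds the collection entry but not y_1, so the sketch returns false and y_1 lands in indexesToDrop. Rolling back the creation of x_1 cancels the pending x_1 entry, leaving indexesToCreate empty.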



Comments
Comment by William Schultz (Inactive) [ 15/Nov/17 ]

robert.guo Are there things that need to be re-enabled in the rollback fuzzer since the bug described in this ticket was fixed?

Comment by Githook User [ 24/Oct/17 ]

Author: William Schultz (will62794) <william.schultz@mongodb.com>

Message: SERVER-31047 Rollback properly removes redundant index operations
Branch: master
https://github.com/mongodb/mongo/commit/a4a94724dc82af8a314f98c2245d4e61233f56bf

Comment by William Schultz (Inactive) [ 23/Oct/17 ]

Code Review: https://mongodbcr.appspot.com/169250001/

Comment by William Schultz (Inactive) [ 03/Oct/17 ]

Since this issue will trigger the invariant mentioned above, it will simply be a crash. If we removed the invariant, it could lead to data corruption, i.e. indexes out of sync between nodes. I don't think this is an issue with 3.4 or earlier, since there is not really any logic in 3.4 rollback that directly corresponds to the removeRedundantIndexCommands method.

Comment by Spencer Brody (Inactive) [ 02/Oct/17 ]

william.schultz, can this result in data corruption, or just a process crash? Also, is this a regression in 3.6, or has this existed forever?
