[SERVER-31047] Rollback doesn’t properly cancel out index drop/create operations when there are multiple indexes on the same collection Created: 11/Sep/17 Updated: 30/Oct/23 Resolved: 24/Oct/17 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Replication |
| Affects Version/s: | 3.5.12 |
| Fix Version/s: | 3.6.0-rc2 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Robert Guo (Inactive) | Assignee: | William Schultz (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | rbfz |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: |
| Backwards Compatibility: | Fully Compatible |
| Operating System: | ALL |
| Steps To Reproduce: |
| Sprint: | Repl 2017-11-13 |
| Participants: |
| Description |
|
The behavior of rollback’s FixUpInfo::removeRedundantIndexCommands method is incorrect when rolling back an operation sequence that contains multiple index creations on the same collection. The method currently contains an invariant which asserts that the number of indexes in indexesToCreate for the current collection UUID is exactly 1. This may not be true if we are rolling back a createIndex operation for an index that wasn’t dropped on the current collection (see Steps To Reproduce script). The invariant must be removed, since it is incorrect and may not hold in legitimate rollback scenarios.
The presence of this erroneous invariant masks an additional issue: removeRedundantIndexCommands doesn’t check whether the given index name actually exists in indexesToCreate (for the given UUID), only whether there is an entry with the same UUID. If the index does not exist there, we try to erase it from indexesToCreate, which does nothing, but we still return true, which causes us to skip adding the index to indexesToDrop. We need to change this behavior so that we check for the presence of the specific index, not just the collection's UUID, and only return true if we actually found the right index. |
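The intended behavior described above can be sketched roughly as follows. This is a simplified model, not the actual server code: the FixUpInfoSketch struct, the string-based CollectionUUID alias, the rollbackCreateIndex helper, and the exampleScenario function are all hypothetical names introduced for illustration, and the real data structures may differ. Only the names removeRedundantIndexCommands, indexesToCreate, and indexesToDrop come from the ticket.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Hypothetical stand-in for a collection UUID; the real type is not shown in the ticket.
using CollectionUUID = std::string;

// Simplified, assumed model of the rollback bookkeeping described in the ticket.
struct FixUpInfoSketch {
    std::map<CollectionUUID, std::set<std::string>> indexesToCreate;
    std::map<CollectionUUID, std::set<std::string>> indexesToDrop;

    // Returns true only if `indexName` itself was pending creation for `uuid`
    // and has now been cancelled out.
    bool removeRedundantIndexCommands(const CollectionUUID& uuid,
                                      const std::string& indexName) {
        auto it = indexesToCreate.find(uuid);
        if (it == indexesToCreate.end()) {
            return false;  // nothing is pending for this collection
        }
        // No assumption that exactly one index is pending for the collection.
        if (it->second.erase(indexName) == 0) {
            return false;  // other indexes are pending, but not this one
        }
        if (it->second.empty()) {
            indexesToCreate.erase(it);  // drop the now-empty entry
        }
        return true;
    }

    // Hypothetical caller: rolling back a createIndex means the index must be
    // dropped, unless doing so cancels a pending re-creation of the same index.
    void rollbackCreateIndex(const CollectionUUID& uuid, const std::string& indexName) {
        if (removeRedundantIndexCommands(uuid, indexName)) {
            return;
        }
        indexesToDrop[uuid].insert(indexName);
    }
};

// Two indexes are pending re-creation on one collection; one createIndex
// cancels out, the other refers to an index that was never pending.
FixUpInfoSketch exampleScenario() {
    FixUpInfoSketch info;
    info.indexesToCreate["coll-1"] = {"a_1", "b_1"};
    assert(info.indexesToCreate.at("coll-1").size() == 2);
    info.rollbackCreateIndex("coll-1", "a_1");  // cancels the pending create
    info.rollbackCreateIndex("coll-1", "c_1");  // not pending, so it must be dropped
    return info;
}
```

In this sketch, a UUID-only check would have returned true for "c_1" and left it out of indexesToDrop, which is the masked issue the description points out.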
| Comments |
| Comment by William Schultz (Inactive) [ 15/Nov/17 ] |
|
robert.guo Are there things that need to be re-enabled in the rollback fuzzer since the bug described in this ticket was fixed? |
| Comment by Githook User [ 24/Oct/17 ] |
|
Author: {'email': 'william.schultz@mongodb.com', 'name': 'William Schultz', 'username': 'will62794'} Message: |
| Comment by William Schultz (Inactive) [ 23/Oct/17 ] |
|
Code Review: https://mongodbcr.appspot.com/169250001/ |
| Comment by William Schultz (Inactive) [ 03/Oct/17 ] |
|
Since this issue will trigger the invariant mentioned above, it will simply be a crash. If we removed the invariant without also fixing the index-name check, it could lead to data corruption, i.e. indexes out of sync between nodes. I don't think this is an issue with 3.4 or earlier, since there is not really any logic in 3.4 rollback that directly corresponds to the removeRedundantIndexCommands method. |
| Comment by Spencer Brody (Inactive) [ 02/Oct/17 ] |
|
william.schultz, can this result in data corruption, or just a process crash? Also is this a regression in 3.6, or has this existed forever? |