[SERVER-43403] Index creation may be allowed to start on a collection in drop-pending state Created: 20/Sep/19  Updated: 29/Oct/23  Resolved: 21/Nov/19

Status: Closed
Project: Core Server
Component/s: Storage
Affects Version/s: None
Fix Version/s: 4.3.2

Type: Bug Priority: Major - P3
Reporter: William Schultz (Inactive) Assignee: Benety Goh
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
is depended on by SERVER-43692 enable two phase index builds by default Closed
Problem/Incident
Related
is related to SERVER-32567 Replace queryoptimizer3.js with FSM test Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Steps To Reproduce:

'use strict';
 
var $config = (function() {
    var data = {
        prefix: 'create_collection'
    };
 
    var states = (function() {
        function init(db, collName) {
            this.num = 0;
        }
 
        function dropAndCreate(db, collName) {
            db[collName].drop();
            assertAlways.commandWorked(db[collName].insert({x:1}));
        }
 
        function createIndex(db, collName){
            db[collName].createIndex({x:1});
            db[collName].dropIndex({x:1});
        }
 
        return {init: init, dropAndCreate: dropAndCreate, createIndex: createIndex};
    })();
 
    var transitions = {init: {dropAndCreate: 0.5, createIndex: 0.5},
        dropAndCreate: {dropAndCreate: 0.5, createIndex: 0.5},
        createIndex: {dropAndCreate: 0.5, createIndex: 0.5}};
 
    return {
        threadCount: 10,
        iterations: 20,
        data: data,
        states: states,
        transitions: transitions,
    };
})();

Sprint: Execution Team 2019-12-02
Participants:
Linked BF Score: 38

 Description   

When we run a createIndexes command, we will look up the UUID of the specified collection under a lock here. We release this lock, however, and later on we will look up the namespace for the collection here, which is the namespace we use later on in the index creation operation. It is possible that a 'drop' command has been called during the time we released our locks, and it could have renamed the collection to a "drop pending" namespace, if we are still using the old two phase collection drop implementation (used when EMRC=false). When we look up the namespace, we will then be looking up the drop pending namespace. It seems that we shouldn't allow a collection drop to interleave with a createIndexes command like this.



 Comments   
Comment by Githook User [ 20/Nov/19 ]

Author:

{'email': 'benety@mongodb.com', 'name': 'Benety Goh', 'username': 'benety'}

Message: SERVER-43403 disallow drops and renames during index creation
Branch: master
https://github.com/mongodb/mongo/commit/13c3960a0c30e05d8a3b25e5bf9669e268933ec7

Comment by Benety Goh [ 15/Oct/19 ]

While we are discussing queryoptimizer3.js, I would love to revisit SERVER-32567.

Comment by Githook User [ 10/Oct/19 ]

Author:

{'name': 'Benety Goh', 'username': 'benety', 'email': 'benety@mongodb.com'}

Message: Revert "SERVER-43403: Check index creation namespace after reacquiring locks."

This reverts commit 458987d4184fc6e5995b91980b5ea79c83cb86b3.
Branch: master
https://github.com/mongodb/mongo/commit/83af82bd2f18c39154414e0fe77ca332603657bf

Comment by Githook User [ 10/Oct/19 ]

Author:

{'username': 'dgottlieb', 'email': 'daniel.gottlieb@mongodb.com', 'name': 'Daniel Gottlieb'}

Message: SERVER-43403: Check index creation namespace after reacquiring locks.
Branch: master
https://github.com/mongodb/mongo/commit/458987d4184fc6e5995b91980b5ea79c83cb86b3

Comment by Benety Goh [ 09/Oct/19 ]

daniel.gottlieb, adding a check to the IndexBuildsCoordinator here should be sufficient.

Comment by Daniel Gottlieb (Inactive) [ 09/Oct/19 ]

To start: I'm unable to successfully reproduce the bug with the provided script. I'm confident the analysis is correct (and the BFGs are still rolling in), so we'll call my experience an outlier.

Additionally, this sort of bug falls into the class of "after releasing/re-acquiring locks, preconditions must be rechecked". One precondition that's not being rechecked, as the ticket notes is the namespace check (though that's done on the input string before locks are acquired in the first place).

There are other "courtesy" checks in the create indexes file and I'm unsure which ones are rechecked in the build coordinator and which ones should be. These checks include:

  • Checking sharding state
  • Checking that the node is a primary
  • Checking that the index does not yet exist (that one is mostly clearly documented to prevent a spurious Collection X lock acquisition).

benety.goh should I simply add the drop pending namespace check to the index build coordinator, or should this ticket represent larger work (copying/moving all of the input validation over to the coordinator after the locks are reacquired)?

Comment by William Schultz (Inactive) [ 20/Sep/19 ]

Was able to repro this by running the attached test with:

python buildscripts/resmoke.py --majorityReadConcern=off --repeat=1 --suite=concurrency_replication jstests/drop_and_createIndex_concurrent.js

Generated at Thu Feb 08 05:03:07 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.