  1. Core Server
  2. SERVER-43206

All callers of DatabaseImpl::dropCollectionEvenIfSystem must first assert under a lock that no index builds are in progress



    • Type: Bug
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Gone away
    • None
    • None
    • Storage
    • ALL
    • Execution Team 2020-04-06
    • 9


      TLDR: DatabaseImpl::dropCollectionEvenIfSystem invariants that no index builds are in progress by looking at the IndexCatalog state. However, after an index build the IndexBuildsCoordinator still needs to clean up its own state, without holding a lock, and that cleanup expects the collection to still exist.

      As described in the linked test failure, DatabaseImpl::dropCollectionEvenIfSystem, which invariants that it is called under a collection X lock, also invariants that no index builds are in progress according to the IndexCatalog. However, the IndexCatalog is told that an index build is complete before the IndexBuildsCoordinator state (MultiIndexBlock::cleanUpAfterBuild, IndexBuildInterceptor, etc.) has been cleaned up. We no longer hold a lock across index build teardown, so DatabaseImpl::dropCollectionEvenIfSystem is able to go ahead and drop the collection while the index build still needs it to exist. All callers of DatabaseImpl::dropCollectionEvenIfSystem must therefore check with BackgroundOperation and IndexBuildsCoordinator that no index builds are in progress. The test failure shows that at least one caller is not doing this check; I did not look for which one, because there are a lot of callers to go through.
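      To make the proposed caller-side check concrete, here is a minimal toy model, not the server's actual code. ToyIndexBuildRegistry, ToyDatabase, and every method name below are hypothetical stand-ins: the registry plays the role of the IndexBuildsCoordinator state that outlives the IndexCatalog's view of the build, and a single mutex stands in for the collection X lock.

```cpp
#include <cassert>
#include <mutex>
#include <set>
#include <string>

// Hypothetical stand-in for coordinator-side state: a namespace stays
// registered until index build teardown has finished, not merely until the
// catalog considers the build complete.
class ToyIndexBuildRegistry {
public:
    void startBuild(const std::string& ns) {
        std::lock_guard<std::mutex> lk(_mutex);
        _active.insert(ns);
    }
    void finishTeardown(const std::string& ns) {
        std::lock_guard<std::mutex> lk(_mutex);
        auto it = _active.find(ns);
        if (it != _active.end()) _active.erase(it);
    }
    bool inProgress(const std::string& ns) const {
        std::lock_guard<std::mutex> lk(_mutex);
        return _active.count(ns) > 0;
    }

private:
    mutable std::mutex _mutex;
    std::multiset<std::string> _active;  // one entry per running build
};

// Hypothetical stand-in for a database. One mutex models the collection
// X lock (the real lock is per collection; this toy uses one for brevity).
class ToyDatabase {
public:
    explicit ToyDatabase(ToyIndexBuildRegistry& registry) : _registry(registry) {}

    void createCollection(const std::string& ns) {
        std::lock_guard<std::mutex> lk(_collectionLock);
        _collections.insert(ns);
    }
    bool exists(const std::string& ns) const {
        std::lock_guard<std::mutex> lk(_collectionLock);
        return _collections.count(ns) > 0;
    }

    // Caller-side pattern: take the lock, check the registry while still
    // holding it, and only then drop. Returns false if the drop was refused.
    bool dropIfNoIndexBuilds(const std::string& ns) {
        std::lock_guard<std::mutex> lk(_collectionLock);
        if (_registry.inProgress(ns)) return false;
        dropLocked(ns);
        return true;
    }

private:
    // Plays the role of the drop itself: it asserts the precondition but
    // relies on the caller having already checked it under the lock.
    void dropLocked(const std::string& ns) {
        assert(!_registry.inProgress(ns));
        _collections.erase(ns);
    }

    mutable std::mutex _collectionLock;
    std::set<std::string> _collections;
    ToyIndexBuildRegistry& _registry;
};
```

      In this sketch a drop attempted between startBuild and finishTeardown is refused, and the same drop succeeds once teardown has finished. The check only helps if the registry is consulted while the lock is held; checking first and locking afterwards would leave the same window open.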

      The above is one proposal for how to fix the problem. I haven't thought about other options. It isn't great, because there's no way to enforce that new callers of DatabaseImpl::dropCollectionEvenIfSystem perform the check.

      Prior to SERVER-42487, this bug was possible, but much less likely to occur. SERVER-42487 moved the index build database and collection locks into a smaller scope, because it wasn't guaranteed that the locks would still be held in the larger scope after that smaller scope ran. So previously, index build teardown was unprotected only when the index build threw an error; now it is always unprotected, which surfaces this bug. At least I think it's a bug; if it isn't, then we need better documentation stating that successful index builds need a lock for teardown and unsuccessful index builds don't.
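      The difference in lock scope can be illustrated with a small toy model. This is a sketch under assumptions, not the server's code: ToyLock, ScopedToyLock, BuildTrace, and both build functions are hypothetical, and the idea that the old error path lost the lock through stack unwinding is an assumption made for illustration only.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical lock that only records whether it is currently held.
struct ToyLock {
    bool held = false;
};

// RAII guard: holds the lock for exactly the lifetime of the guard.
class ScopedToyLock {
public:
    explicit ScopedToyLock(ToyLock& lock) : _lock(lock) { _lock.held = true; }
    ~ScopedToyLock() { _lock.held = false; }

private:
    ToyLock& _lock;
};

// Records whether the lock was held during each phase of a build.
struct BuildTrace {
    bool lockedDuringBuild = false;
    bool lockedDuringTeardown = false;
};

// Wide scope (modelling the behaviour before the change): on success the
// teardown runs under the lock. On failure the guard is destroyed during
// stack unwinding (assumed mechanism), so teardown runs without the lock.
BuildTrace buildWithWideLock(ToyLock& lock, bool fail) {
    BuildTrace trace;
    try {
        ScopedToyLock guard(lock);
        trace.lockedDuringBuild = lock.held;
        if (fail) throw std::runtime_error("index build failed");
        trace.lockedDuringTeardown = lock.held;  // success path
    } catch (const std::runtime_error&) {
        trace.lockedDuringTeardown = lock.held;  // error path, guard is gone
    }
    return trace;
}

// Narrow scope (modelling the behaviour after the change): the lock covers
// only the build, so teardown is always unprotected.
BuildTrace buildWithNarrowLock(ToyLock& lock) {
    BuildTrace trace;
    {
        ScopedToyLock guard(lock);
        trace.lockedDuringBuild = lock.held;
    }
    trace.lockedDuringTeardown = lock.held;
    return trace;
}
```

      In this model only the failing wide-scope build tears down without the lock, while the narrow-scope build does so on every run, which is why a concurrent drop that was previously rare becomes easy to hit.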


              gregory.wlodarek@mongodb.com Gregory Wlodarek
              dianna.hohensee@mongodb.com Dianna Hohensee