Single-phase index builds can deadlock or, worse, crash if an index build fails concurrently with an abort operation (dropIndexes, dropCollection, etc.).
The deadlock is as follows:
- A single-phase index build fails while scanning a collection. It releases all of its locks, then tries to re-acquire the Collection X lock in order to clean up.
- In the window between the release and the re-acquisition, a dropCollection operation can acquire the Collection X lock and abort the index build. It then waits for the index build thread to exit before releasing its lock.
- The index builder is blocked waiting for the X lock, and the aborting thread is blocked waiting for the index builder to exit.
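The interleaving above can be reproduced in a toy model. The following is only a sketch under stated assumptions: `builderReacquiresLock`, `collectionX`, and the two promises are hypothetical names, not MongoDB code, and the timeout exists only so the demonstration terminates (the real builder would block indefinitely).

```cpp
#include <chrono>
#include <future>
#include <mutex>
#include <thread>

// Hypothetical toy model; none of these names are MongoDB APIs.
// Returns true if the failed builder re-acquired the lock for cleanup,
// false if it was still blocked when `patience` ran out.
bool builderReacquiresLock(std::chrono::milliseconds patience) {
    std::timed_mutex collectionX;         // stands in for the Collection X lock
    std::promise<void> aborterHoldsLock;  // set once the aborter owns the lock
    std::promise<void> builderExited;     // set once the builder thread is done
    std::future<void> lockHeld = aborterHoldsLock.get_future();
    std::future<void> exited = builderExited.get_future();
    bool reacquired = false;

    std::thread builder([&] {
        // The build has already failed and released its locks. Wait until
        // the aborter owns the lock to force the problematic interleaving.
        lockHeld.wait();
        // Cleanup needs the lock back. The timeout is a demo-only escape
        // hatch; without it this thread would never return.
        if (collectionX.try_lock_for(patience)) {
            reacquired = true;
            collectionX.unlock();
        }
        builderExited.set_value();
    });

    std::thread aborter([&] {
        // The abort path takes the lock and keeps it until the builder exits.
        std::lock_guard<std::timed_mutex> lock(collectionX);
        aborterHoldsLock.set_value();
        exited.wait();
    });

    builder.join();
    aborter.join();
    return reacquired;
}
```

Calling `builderReacquiresLock(std::chrono::milliseconds(50))` returns `false`: the builder cannot obtain the lock for as long as the aborter is waiting on it, which is the cycle described in the last bullet.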
We should follow the approach of two-phase builds and check key constraints only upon completion (SERVER-45852). This allows the build to abort without releasing its locks. However, this is only possible after SERVER-46989 is complete.
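As a rough illustration of deferring key-constraint checks, the sketch below records duplicates during the scan and reports them only at completion. `DeferredUniqueBuild` and its members are hypothetical names chosen for this example; the sketch assumes a unique index over string keys and does not reflect how the server actually tracks duplicates.

```cpp
#include <optional>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical toy model of a unique index build; not MongoDB's classes.
class DeferredUniqueBuild {
public:
    // Scan phase: a duplicate key is recorded, never raised as an error,
    // so the scan has no constraint-failure path.
    void addKey(const std::string& key) {
        if (++_seen[key] == 2) {
            _duplicates.push_back(key);  // remember each duplicated key once
        }
    }

    // Completion: the only point where the unique constraint is checked.
    // Returns the first duplicated key found, or nullopt if none.
    std::optional<std::string> firstViolation() const {
        if (_duplicates.empty()) {
            return std::nullopt;
        }
        return _duplicates.front();
    }

private:
    std::unordered_map<std::string, int> _seen;  // key -> occurrence count
    std::vector<std::string> _duplicates;        // keys seen more than once
};
```

In this model the scan never stops on a duplicate, so the decision to commit or abort is made at one point, where the builder could still be holding its locks.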
- depends on
  - SERVER-46989 Index builds should hold RSTL to prevent replication state changes after deciding to commit or abort (Closed)
  - SERVER-47692 SharedBufferFragmentAllocator should discard its buffer on destruction if the build was incomplete (Closed)
- is related to
  - SERVER-45351 Newly-elected primaries can commit index builds with inconsistencies due to ignoring indexing errors as secondary (Closed)
- related to
  - SERVER-69677 Add warning to index build unexpected error code invariant and only enable in debug builds (Closed)
  - SERVER-48160 remove IndexBuildsCoordinator::_tryAbort() fatal assertion 4656001 for missing index build thread (Closed)