[SERVER-42621] 3 way deadlock can happen between hybrid index build, prepared transactions and stepdown thread. Created: 03/Aug/19 Updated: 29/Oct/23 Resolved: 28/Aug/19 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Replication |
| Affects Version/s: | None |
| Fix Version/s: | 4.3.1 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Suganthi Mani | Assignee: | Suganthi Mani |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Attachments: |
|
| Issue Links: |
|
| Backwards Compatibility: | Fully Compatible |
| Operating System: | ALL |
| Backport Requested: | v4.2 |
| Steps To Reproduce: |
|
| Sprint: | Execution Team 2019-08-12, Repl 2019-08-26, Repl 2019-09-09 |
| Participants: |
| Linked BF Score: | 7 |
| Description |
|
Currently, a 3-way deadlock can occur between the hybrid index builder, a prepared transaction, and the stepdown thread for the above repro. The problem arises when the stepdown thread kills the "createIndex" command thread. As part of the index teardown step on the primary, MultiIndexBlock::cleanUpAfterBuild is called with the RSTL held in IX mode. It then tries to acquire an X lock on the user collection inside an uninterruptible lock guard, but gets blocked behind the prepared transaction because of a collection lock conflict. Because createIndex is holding the RSTL in IX mode, it blocks the stepdown thread. The commitTransaction command, waiting to acquire the RSTL in IX mode, is in turn blocked behind the stepdown thread, because the stepdown thread has enqueued an RSTL request in X mode. |
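The cycle in the description can be made explicit with a wait-for graph. The following is a minimal sketch, not MongoDB code: the thread names and the `WAITS_FOR` mapping are hypothetical labels for the three participants, and the edges only restate the blocking relationships given above.

```python
# Illustrative model only; the names are labels, not MongoDB identifiers.
# Each key waits for the thread named in its value.
WAITS_FOR = {
    # cleanUpAfterBuild wants the collection X lock, which conflicts with
    # the lock held by the prepared transaction.
    "createIndex": "preparedTxn",
    # The prepared transaction finishes only once commitTransaction gets the
    # RSTL in IX mode, which is queued behind the stepdown thread's X request.
    "preparedTxn": "stepDown",
    # Stepdown wants the RSTL in X mode, blocked by createIndex holding IX.
    "stepDown": "createIndex",
}


def find_cycle(waits_for, start):
    """Follow wait-for edges from start; return the cycle if one is reached."""
    path = []
    node = start
    while node is not None and node not in path:
        path.append(node)
        node = waits_for.get(node)
    if node is None:
        return None  # the chain ended at a thread that is not waiting
    return path[path.index(node):] + [node]


cycle = find_cycle(WAITS_FOR, "createIndex")
print(" -> ".join(cycle))
```

Removing any one edge from the mapping makes `find_cycle` return `None`, which is why the discussion in the comments focuses on breaking a single dependency rather than all three.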
| Comments |
| Comment by Githook User [ 31/Oct/19 ] |
|
Author: Suganthi Mani (smani87, suganthi.mani@mongodb.com) Message: (cherry picked from commit 3693ad5f9031c59d8a0646337f5c3bb3c818d49b) |
| Comment by Githook User [ 28/Aug/19 ] |
|
Author: Suganthi Mani (smani87, suganthi.mani@mongodb.com) Message: |
| Comment by Suganthi Mani [ 21/Aug/19 ] |
|
Spoke to benety.goh offline, and we agreed that this parent ticket will be used to add the test (repro script) for the 3-way deadlock, which will be pushed to master after all the underlying sub-tickets are addressed. |
| Comment by Benety Goh [ 19/Aug/19 ] |
|
suganthi.mani, thanks for the detailed analysis. I've filed |
| Comment by Suganthi Mani [ 16/Aug/19 ] |
|
benety.goh, I found the cause of the memory corruption in the patch build. When the transaction is aborted, we also try to roll back the changes made during the transaction to IndexBuildInterceptor::_sideWritesCounter. The problem is that we are accessing a member of an object instance (i.e., _indexBuildInterceptor) that was already dropped/destructed as part of index build cleanup. |
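The ordering hazard in this comment can be sketched in a few lines. This is an illustrative model under stated assumptions, not the server's code: `Interceptor`, `Transaction`, and the `destroyed` flag are hypothetical stand-ins, and an exception takes the place of what would be an invalid memory access in C++.

```python
class Interceptor:
    """Hypothetical stand-in for the index build interceptor."""

    def __init__(self):
        self.side_writes_counter = 0
        self.destroyed = False

    def add(self, n):
        # Python cannot dangle, so a flag models use after destruction.
        if self.destroyed:
            raise RuntimeError("access to destroyed interceptor")
        self.side_writes_counter += n


class Transaction:
    """Hypothetical transaction that records undo handlers for rollback."""

    def __init__(self):
        self._on_rollback = []

    def side_write(self, interceptor, n):
        interceptor.add(n)
        # The undo handler captures the interceptor object itself.
        self._on_rollback.append(lambda: interceptor.add(-n))

    def abort(self):
        for undo in reversed(self._on_rollback):
            undo()


interceptor = Interceptor()
txn = Transaction()
txn.side_write(interceptor, 3)
interceptor.destroyed = True  # index build cleanup runs before the abort
try:
    txn.abort()
    outcome = "ok"
except RuntimeError as exc:
    outcome = str(exc)
print(outcome)
```

The point of the sketch is only the ordering: the undo handler outlives the object it refers to, so any cleanup that destroys the interceptor has to account for transactions that are still able to roll back.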
| Comment by Eric Milkie [ 09/Aug/19 ] |
|
I'm having a think on ways to mitigate the tangle between index build cleanup and prepared transactions that have registered Changes that attempt to access the index build's temporary tables. |
| Comment by Suganthi Mani [ 08/Aug/19 ] |
|
Currently, there is a problem with having the index cleanup step drop the RSTL. Although the index cleanup step doesn't need synchronization with replica state transitions, it does depend on prepared transactions. I am attaching the abort.js script to this ticket, which simulates the scenario below. 1) A prepared txn does some writes to the index builder's internal sideWrites table and performs a size adjustment to the sideWrites table while holding the user collection lock in IX mode. |
| Comment by Eric Milkie [ 06/Aug/19 ] |
|
I think the best thing to do here is to not hold the RSTL, since index building doesn't need synchronization with replica state transitions (it is handled specially). Removing the UninterruptibleLockGuard for index build abort isn't possible in the current code (I tried and failed). |
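To see why not holding the RSTL would unblock stepdown, here is a toy lock queue. It is a sketch under simplifying assumptions, not the server's lock manager: `FifoLock` is a hypothetical class, it models only the IX and X modes, and it treats the queue as strictly first-in, first-out.

```python
from collections import deque

# Intent-lock compatibility for the two modes used here:
# IX is compatible with IX; X conflicts with everything.
COMPATIBLE = {("IX", "IX")}


class FifoLock:
    """Toy lock: a request waits if it conflicts with a holder
    or if any earlier request is still queued."""

    def __init__(self):
        self.holders = {}       # owner -> granted mode
        self.waiters = deque()  # (owner, mode) in arrival order

    def _grantable(self, mode):
        return all((held, mode) in COMPATIBLE for held in self.holders.values())

    def acquire(self, owner, mode):
        if not self.waiters and self._grantable(mode):
            self.holders[owner] = mode
            return True
        self.waiters.append((owner, mode))
        return False

    def release(self, owner):
        del self.holders[owner]
        # Grant queued requests in order until one conflicts.
        while self.waiters and self._grantable(self.waiters[0][1]):
            nxt, mode = self.waiters.popleft()
            self.holders[nxt] = mode


rstl = FifoLock()
rstl.acquire("createIndex", "IX")        # granted immediately
rstl.acquire("stepDown", "X")            # queued: conflicts with IX holder
rstl.acquire("commitTransaction", "IX")  # queued behind the X request
blocked = [owner for owner, _ in rstl.waiters]

# If index cleanup gave up the RSTL, the queue would start to drain.
rstl.release("createIndex")
after_release = dict(rstl.holders)
print(blocked, after_release)
```

In this model the X request is granted as soon as the IX holder releases, and the queued IX request follows once the X holder releases in turn. What each thread does after acquiring the RSTL is outside the scope of the sketch.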
| Comment by Judah Schvimer [ 05/Aug/19 ] |
|
milkie, I've moved this to the execution team since I expect the solution to be in the indexing code, but I'm happy to help brainstorm other solutions and take it back onto the replication team if the solution lies outside of the indexing code. |
| Comment by Judah Schvimer [ 05/Aug/19 ] |
|
I feel like the best fix here will be to remove the UninterruptibleLockGuard. milkie, is that possible? The alternative I can imagine (though I haven't fully thought it through) is for all of index building on the primary, after writing the oplog entry, to not hold the RSTL. We could drop it, like we do for prepared transactions, as soon as we no longer need the node to be a primary. We made a similar fix to index building on secondaries. |
| Comment by Suganthi Mani [ 05/Aug/19 ] |
|
This is a problem even on 4.2 as MultiIndexBlock::cleanUpAfterBuild() tries to acquire X lock for user collection with RSTL held in IX mode in an UninterruptibleLockGuard. |