[SERVER-45916] On primary, 2-phase index build cleanup writes an abortIndexBuild oplog entry under a stronger mode user collection lock X which can lead to 3 way deadlock with prepared transactions, step down and index build Created: 31/Jan/20  Updated: 29/Oct/23  Resolved: 17/Apr/20

Status: Closed
Project: Core Server
Component/s: Storage
Affects Version/s: None
Fix Version/s: 4.7.0

Type: Bug Priority: Major - P3
Reporter: Suganthi Mani Assignee: Louis Williams
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
depends on SERVER-46560 Make Abort index build logic determin... Closed
Related
is related to SERVER-45921 Index builder invariants on this chec... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Sprint: Execution Team 2020-04-06, Execution Team 2020-04-20, Execution Team 2020-05-04
Participants:

 Description   

Consider the following sequence:
1) Start an index build on collection A on the primary.
2) Prepare a transaction on collection A.
3) The index build gets aborted, possibly due to a killOp command or a key constraint error.
4) As a result of the index build failure, the index build thread starts the cleanup phase. Assume it is at this point, so the thread has acquired the RSTL in mode IX and the uninterruptible lock guard is enabled.
5) Now assume a stepDown command comes in. It enqueues an RSTL request in mode X, but is blocked behind the index build thread.
6) The index builder thread now tries to acquire the collection lock in X mode to write the abortIndexBuild oplog entry and to tear down the index build. This step gets blocked behind the prepared transaction due to a collection lock conflict.
7) The prepared transaction's commit command blocks behind the stepDown thread.
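
The three-way dependency in steps 4 to 7 can be sketched as a wait-for graph. This is only an illustration of the scenario described above, not MongoDB code: the thread labels and the `find_cycle` helper are hypothetical names introduced here.

```python
# Wait-for graph for the sequence above: an edge A -> B means "A is blocked behind B".
# The labels are illustrative only; they are not MongoDB identifiers.
WAITS_FOR = {
    "index_build": "prepared_txn",  # step 6: wants collection lock X, conflicts with the prepared transaction
    "prepared_txn": "step_down",    # step 7: commit blocks behind the stepDown thread
    "step_down": "index_build",     # step 5: RSTL X is queued behind the RSTL IX held by the index build
}


def find_cycle(waits_for, start):
    """Follow wait-for edges from start; return the cycle as a list, or None."""
    path = []
    seen = {}
    node = start
    while node is not None and node not in seen:
        seen[node] = len(path)
        path.append(node)
        node = waits_for.get(node)
    if node is None:
        return None  # the chain ended, so nobody waits forever
    return path[seen[node]:] + [node]


cycle = find_cycle(WAITS_FOR, "index_build")
print(" -> ".join(cycle))

# If the index build's wait can be interrupted (hypothetically modelled here by
# dropping its outgoing edge), the cycle disappears.
interruptible = {k: v for k, v in WAITS_FOR.items() if k != "index_build"}
print(find_cycle(interruptible, "index_build"))
```

Because the index build thread holds its RSTL lock under an uninterruptible lock guard, none of the three waiters in this model can back off on its own, which is why the cycle is a deadlock rather than a transient stall.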



 Comments   
Comment by Githook User [ 17/Apr/20 ]

Author: Louis Williams <louis.williams@mongodb.com> (username: louiswilliams)

Message: SERVER-45916 Index build abort should be interruptible by stepDown
Branch: master
https://github.com/mongodb/mongo/commit/65c979794dc643c0b432be89de47bda7c8e7bd8c

Comment by Louis Williams [ 16/Apr/20 ]

Code review: https://mongodbcr.appspot.com/575110005/

Comment by Suganthi Mani [ 31/Jan/20 ]

I would expect something like this: if an index build on the old primary gets aborted for some reason other than a killOp command, then the newly elected primary will also abort it for the same reason and send an abortIndexBuild oplog entry.

If that's the case, then when the killOp command successfully interrupts the parent createIndex thread, the parent thread, after aborting the index build coordinator thread (i.e., changing the aborted field to true), should generate the abortIndex oplog entry. To make this work correctly, we should also make sure the parent thread holds the RSTL in mode IX once we have started the index build. (We are already guaranteeing that by holding the RSTL here)

For other index build aborts caused by erroneous data records, we can behave as we do for commit index build: the newly elected primary takes care of the abort and generates the abortIndex oplog entry.

Comment by Suganthi Mani [ 31/Jan/20 ]

milkie's response to this problem in the Google doc:

Ah - I see. The solution is: at stepdown time, we must cancel any active index-abort operations. The killop cmd is ignored. The index build remains active on all nodes, and the new primary continues building the index.

This applies to any index-commit operations as well (they take the same locks as index-abort). The commit operation is cancelled at stepdown time, and the new primary continues the index build, which will probably be to immediately commit.
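
The proposal above can be sketched as a toy model. Everything here is hypothetical (the `IndexBuild` record, the `Pending` states, and `on_step_down` are names invented for illustration); it only shows the intended outcome, namely that stepdown cancels in-flight abort and commit operations while the builds themselves stay active for the new primary to continue.

```python
from dataclasses import dataclass
from enum import Enum


class Pending(Enum):
    NONE = "none"
    ABORT = "abort"
    COMMIT = "commit"


@dataclass
class IndexBuild:
    name: str
    pending: Pending = Pending.NONE  # abort/commit operation currently in flight, if any
    active: bool = True              # whether the build itself is still running


def on_step_down(builds):
    """Cancel in-flight abort/commit operations; the builds themselves stay active."""
    cancelled = []
    for build in builds:
        if build.pending is not Pending.NONE:
            cancelled.append((build.name, build.pending))
            build.pending = Pending.NONE  # the new primary decides what happens next
    return cancelled


builds = [
    IndexBuild("a_1", Pending.ABORT),
    IndexBuild("b_1", Pending.COMMIT),
    IndexBuild("c_1"),
]
cancelled = on_step_down(builds)
print(cancelled)
print([b.active for b in builds])
```

In this model the cancelled abort corresponds to the ignored killOp, and the cancelled commit would most likely be reissued by the new primary right away.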

Generated at Thu Feb 08 05:10:02 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.