[SERVER-39976] Two-phase index builds on primaries should ignore interrupts Created: 06/Mar/19 Updated: 08/Jan/24 Resolved: 13/Feb/20 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Storage |
| Affects Version/s: | None |
| Fix Version/s: | 4.3.4 |
| Type: | Task | Priority: | Major - P3 |
| Reporter: | Dianna Hohensee (Inactive) | Assignee: | Louis Williams |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||||||||||||||
| Backwards Compatibility: | Minor Change | ||||||||||||||||||||
| Sprint: | Execution Team 2019-09-23, Execution Team 2020-02-10, Execution Team 2020-02-24 | ||||||||||||||||||||
| Participants: | |||||||||||||||||||||
| Description |
|
A simultaneous index build should only be killable at shutdown when run on a secondary. We'll need to turn it off if the node steps up; and turn it on if the node steps down. InterruptedAtShutdown errors should still get through, but not killOp errors. |
| Comments |
| Comment by Eric Milkie [ 13/Feb/20 ] |
|
This commit changes the killop behavior for index builds. Now, the only way to kill an index build is to killOp the connection operation (or use the dropIndexes command). Previously, it was also possible to kill an index build by using killOp on the operation from the index builds coordinator; this is no longer possible. |
| Comment by Githook User [ 07/Feb/20 ] |
|
Author: {'name': 'Louis Williams', 'username': 'louiswilliams', 'email': 'louis.williams@mongodb.com'}Message: |
| Comment by Louis Williams [ 24/Oct/19 ] |
|
brian.lane, yes I seems like we've agreed to make index builds unkillable and require using the dropIndexes command to kill active builds instead, |
| Comment by Brian Lane [ 24/Oct/19 ] |
|
Hey benety.goh - I believe it does. If the decision was to take that approach, is that a fair statement louis.williams? I wasn't sure if louis.williams and mira.carey@mongodb.com discussed this offline as I wasn't sure what mira.carey@mongodb.com thought about -Brian |
| Comment by Benety Goh [ 18/Oct/19 ] |
|
brian.lane, louis.williams, does |
| Comment by Brian Lane [ 04/Oct/19 ] |
|
Hey louis.williams - I like the idea of using dropindex instead of adding another command. To mira.carey@mongodb.com's point as well - if he is accepting of -Brian |
| Comment by Louis Williams [ 02/Oct/19 ] |
|
brian.lane, can you read over my last comment and let me know what you think? Dianna brought up that it will lead to unexpected killOp behavior, which is true. The suggestion I made was to provide an alternative option to killing indexes, using dropIndex, or potentially a new command. mira.carey@mongodb.com one of the main goals of this project is to have index builds survive stepdowns, so yes they will continue to be exceptional. Index builds yield locks frequently, but there is one main critical section. This is a long section at the end of the build where writes are blocked and the final entries are applied to the index. We do plan on making the length of that critical section as short as possible, see |
| Comment by Mira Carey [ 01/Oct/19 ] |
|
To chime in here, we currently have a policy of avoiding making opCtx's unkillable unless absolutely necessary. In general, having unkillable ops makes it hard to reason about whether stepdown can succeed, reduces our ability to resolve things like prepare conflicts and generally indicates to me that we have some unfortunately long and dangerous critical section buried somewhere (if the operation fails for some other reason, what state have we left the system in). Index builds have long been the exception to the rule, but every time we double down there I'm going to want to bring this up again and make sure we've exhausted our alternatives. |
| Comment by Dianna Hohensee (Inactive) [ 01/Oct/19 ] |
|
louis.williams, you should run the idea by product: making something on a primary unable to be killed. This will make the killOp cmd not work against everything: I believe today the only ops it cannot kill are replication threads on secondaries. |
| Comment by Louis Williams [ 01/Oct/19 ] |
|
After discussing with geert.bosch, I think we want to take the following approach instead:
This work for this ticket should be to simply make all index build threads uninterruptible except at shutdown. |
| Comment by Dianna Hohensee (Inactive) [ 20/Sep/19 ] |
|
For what it's worth, we have already have a OperationContextGroup to mass modify OperationContexts; and you could add stepUp and stepDown hooks to the IndexBuildsCoordinator to mark OperationContexts as interruptible or impervious – along with getting rid of the code block of un-interruptibility from IndexBuildsCoordinator and replacing it with more flexible OperationContext functions to set and unset interruptibility modes. |
| Comment by Louis Williams [ 20/Sep/19 ] |
|
After more discussion, we have considered reverting the behavior that makes secondaries ignore interrupts. In 4.0 and before, interrupting a secondary index build crashed a node. The behavior added to 4.0 anticipated the case where a system administrator uses killop on a primary and races with stepdown. This is a hard case to justify considering the amount of work required, and it is not a correctness problem if the node crashes. Considering the interruptibility due to step-up is a little more complicated. We may still need error code whitelist to pass to _runWithoutInterruption to ignore "InterruptedDueToReplStateChange" in addition to "InterruptedAtShutdown". If not a whitelist, we may still need to implement my first proposed solution. This is not ideal and makes understanding the interruptibility of operations more complicated based off of transient state, instead of the code as it is written. |
| Comment by Louis Williams [ 10/Sep/19 ] |
|
I think this is going to require some interesting service arch changes. At the moment, interrupt suppression is handled by an IgnoreInterruptsGuard, which effectively sets a flag, _ignoreInterrupts, on the OperationContext to early-return in checkForInterruptNoAssert(). Once the _buildIndex() function is called to actually build the index, the _ignoreInterrupts state is no longer modified by the OperationContext, and stays that way until the function returns. This is problematic because we need a way to dynamically control whether this function is interruptible, depending on the replication state. I can think of a couple of options:
mira.carey@mongodb.com, do you mind giving some feedback on whether or not these ideas are realistic and could manifiest in a service arch ticket? |