[SERVER-61251] Ensure long running storage engine operations are interruptible Created: 04/Nov/21 Updated: 06/Dec/23 |
|
| Status: | Backlog |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Judah Schvimer | Assignee: | Backlog - Storage Engines Team |
| Resolution: | Unresolved | Votes: | 1 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
| Assigned Teams: |
Storage Engines
|
| Participants: |
| Description |
|
Long running storage engine operations don't have interrupt points and thus can block step down. |
| Comments |
| Comment by Judah Schvimer [ 22/May/23 ] |
|
Thanks alexander.gorrod@mongodb.com, it was a general idea from a Workload Management discussion. I don't think it's critical. |
| Comment by Alexander Gorrod [ 22/May/23 ] |
|
Is that a question for WiredTiger, judah.schvimer@mongodb.com? As in, could a transaction be paused and restarted? The answer is not safely: we built a mechanism a bit like that for prepared transactions, and the corner cases are numerous and annoying (we are in fact still chasing them). We are taking small steps in that direction, but I expect it will be a fair while before it's a tractable amount of work in WiredTiger. Let me know if it's important enough to justify doing a bit more design than my gut response and we'll give a more complete answer. |
| Comment by Judah Schvimer [ 19/May/23 ] |
|
A related question is if long running WT operations can be yieldable, especially for Workload Management once we start prioritizing amongst running queries. |
| Comment by Louis Williams [ 19/Apr/23 ] |
|
I wanted to link to a comment that sue.loverso@mongodb.com left on WT-10892. In an attempt to reduce the overhead and risk of frequent interrupt checking in fast code paths, what we need at a minimum is to check for interrupts in the slow WiredTiger code paths that can block indefinitely, specifically in the eviction worker loop. What we want is to be able to pull operations out of the eviction loop, since that is where we find operations blocked inside WiredTiger most of the time. |
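A minimal C++ sketch of the idea in the comment above: the fast path is left free of interrupt checks, and only the slow path that can block indefinitely checks for interruption. Every name here (`waitForCacheSpace`, `WaitResult`, the `cacheHasRoom` and `interrupted` hooks) is hypothetical and is not WiredTiger's actual eviction code or API.

```cpp
#include <atomic>
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical result codes; these are not WiredTiger error codes.
enum class WaitResult { kCacheAvailable, kInterrupted };

// Hypothetical stand-in for an application thread that is held in an
// eviction-style loop until the cache has room again.
// `cacheHasRoom` and `interrupted` are assumed hooks supplied by the caller.
WaitResult waitForCacheSpace(const std::function<bool()>& cacheHasRoom,
                             const std::atomic<bool>& interrupted,
                             std::chrono::milliseconds pollInterval) {
    // Fast path: no interrupt check, so the common case pays nothing extra.
    if (cacheHasRoom()) {
        return WaitResult::kCacheAvailable;
    }
    // Slow path: this loop could otherwise block indefinitely, so it is the
    // one place where the interrupt flag is checked on every iteration.
    while (!cacheHasRoom()) {
        if (interrupted.load(std::memory_order_acquire)) {
            return WaitResult::kInterrupted;
        }
        std::this_thread::sleep_for(pollInterval);
    }
    return WaitResult::kCacheAvailable;
}
```

In this sketch an operation that never enters the wait loop is unaffected by the flag, which mirrors the stated goal of keeping overhead and risk out of fast code paths.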
| Comment by Dianna Hohensee (Inactive) [ 18/Apr/23 ] |
|
Some additional information from working on the WT::compact interrupt checks: we ended up passing a pointer to the opCtx into the storage layer. So if the opCtx in the MDB layer gets interrupted, then WT::compact can see that eventually and quit. This could be expanded generally for all MDB operations (all have an opCtx) accessing the WT layer. I think this solution addresses the concern that operations would be immediately resubmitted to WT and nothing would be gained: operations in the MDB layer are interrupted today with positive effect, not just restarted immediately with no change. |
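The approach described in the comment above could look roughly like the following sketch, assuming a heavily simplified `OperationContext` and a storage-layer operation that is divided into units of work. The class, `StorageOpResult`, and `runStorageOperation` are illustrative stand-ins and are not the server's or WiredTiger's real types.

```cpp
#include <atomic>
#include <cstddef>

// Hypothetical, heavily simplified stand-in for the server's operation context.
class OperationContext {
public:
    void markKilled() { _killed.store(true, std::memory_order_release); }
    bool isKilled() const { return _killed.load(std::memory_order_acquire); }

private:
    std::atomic<bool> _killed{false};
};

// Hypothetical result type: how far the operation got and why it stopped.
struct StorageOpResult {
    std::size_t unitsCompleted;
    bool interrupted;
};

// Hypothetical storage-layer operation. It receives a pointer to the caller's
// opCtx and checks it between units of work, so an interrupt raised in the
// upper layer is seen "eventually" rather than immediately.
template <typename WorkFn>
StorageOpResult runStorageOperation(const OperationContext* opCtx,
                                    std::size_t totalUnits,
                                    WorkFn&& doUnit) {
    std::size_t done = 0;
    for (; done < totalUnits; ++done) {
        if (opCtx != nullptr && opCtx->isKilled()) {
            return {done, true};  // quit early; the caller decides what to do next
        }
        doUnit(done);
    }
    return {done, false};
}
```

Checking only between units keeps each unit atomic from the caller's point of view; how coarse a unit can be before interruption becomes too slow is a design question the ticket leaves open.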
| Comment by Alexander Gorrod [ 03/Apr/23 ] |
|
Notes from a group conversation about this: The primary use case to be addressed here is short running operations that can sometimes be held for a long time by WiredTiger. Ideally, WiredTiger would provide a mechanism by which MongoDB could notify us that an operation should be interrupted. keith.smith@mongodb.com indicated that there is an existing callback mechanism in place for compact that can trigger an interrupt. I indicated that it's important for server and storage engines engineers to collaborate on this work. It wouldn't be generally beneficial to give up on operations if the response is for the server to resubmit the same work to WiredTiger in a new transaction. judah.schvimer@mongodb.com indicated that a Storage Execution engineer would be most suitable from the server team to work on this. steve.kuhn@mongodb.com If you think this work is worthwhile, we should create a WiredTiger ticket to track the Storage Engines work and figure out how to get the time scheduled. |
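As a rough illustration of the callback-style mechanism mentioned in these notes, the sketch below has the engine poll a caller-registered check during a long operation and stop when the check asks it to. `EngineSession`, `setInterruptCheck`, and `runLongOperation` are hypothetical names and do not describe the existing compact callback in WiredTiger.

```cpp
#include <functional>
#include <utility>

// Hypothetical engine-side session. The engine knows nothing about the
// caller's types; it only invokes a registered callback.
class EngineSession {
public:
    using InterruptCheck = std::function<bool()>;

    // The caller registers a check that returns true when work should stop.
    void setInterruptCheck(InterruptCheck check) { _check = std::move(check); }

    // Returns true if all `steps` ran, false if the callback asked to stop.
    bool runLongOperation(int steps) {
        for (int i = 0; i < steps; ++i) {
            if (_check && _check()) {
                return false;
            }
            ++_stepsRun;  // placeholder for one step of real work
        }
        return true;
    }

    int stepsRun() const { return _stepsRun; }

private:
    InterruptCheck _check;
    int _stepsRun = 0;
};
```

A callback keeps the engine decoupled from the server, but as the notes point out, returning early only helps if the server does not simply resubmit the same work in a new transaction.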
| Comment by Alexander Gorrod [ 13/Mar/23 ] |
|
I believe that compact is a different case to the others referenced in this ticket. Compact is a complex, long running command in WiredTiger. It is reasonable to add interrupt points into compact and return early if the caller wants that behavior. The other cases described here, where a WT_SESSION::commit or WT_SESSION::abort can take a long time to complete, are different. There are two possible causes for those APIs to take a long time: (1) they are resolving a prepared transaction, which can be expensive; (2) the WiredTiger cache is over-subscribed (generally due to being overwhelmed by too much concurrent activity), and the transaction resolution is being tasked with helping ensure the cache doesn't become oversubscribed (which can lead to processes being killed by the operating system). steve.kuhn@mongodb.com Could we chat about what to do with this ticket? It bounces back to us periodically, and it would be nice to have a concrete plan. Is that something you could help create? |
| Comment by Fausto Leyva (Inactive) [ 21/Feb/23 ] |
|
One potential idea is to generalize the solution from compact when we made it interruptible. |
| Comment by Lingzhi Deng [ 09/Nov/21 ] |
|
For the help ticket, mongo::WiredTigerRecoveryUnit::_commit and probably mongo::WiredTigerRecoveryUnit::_abort too. But I am not sure what it means to interrupt an abort. More generally though, I think we want a solution to make all storage operations interruptible or have a way at the mongodb layer to avoid being blocked on a storage operation. |
| Comment by Gregory Noma [ 09/Nov/21 ] |
|
lingzhi.deng which operation was this? Any additional information that would be useful here? |