[SERVER-33683] Allow aggregation $mergeCursors stage to run inside a transaction Created: 05/Mar/18 Updated: 29/Oct/23 Resolved: 17/Dec/18 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Aggregation Framework, Sharding |
| Affects Version/s: | None |
| Fix Version/s: | 4.1.7 |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Charlie Swanson | Assignee: | Ian Boros |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||||||||||||||||||||||||||||||||||||||
| Backwards Compatibility: | Fully Compatible | ||||||||||||||||||||||||||||||||||||||||||||
| Sprint: | Query 2018-12-03, Query 2018-12-17, Query 2018-12-31 | ||||||||||||||||||||||||||||||||||||||||||||
| Participants: | |||||||||||||||||||||||||||||||||||||||||||||
| Description |
|
We banned aggregations within transactions on mongos as part of |
| Comments |
| Comment by Githook User [ 17/Dec/18 ] |
|
Author: {'email': 'ian.boros@10gen.com', 'name': 'Ian Boros'}Message: |
| Comment by Kaloian Manassiev [ 29/Nov/18 ] |
|
I agree that checking-in the session should also stash the resources - sorry I missed this while working on the dependent
The original proposal had suggested also keeping a track of a sub-session token, so that only branches from the same invocation can check it out. This means a significant change to the networking protocol though and we decided not to do it just in order to counteract misbehaved client. |
| Comment by Charlie Swanson [ 28/Nov/18 ] |
|
ian.boros while the case you describe is indeed worrisome, I can't think of anything great we could do to prevent it, and I believe it would already be a problem with something like an ordered bulk write operation through mongos. I think we will just have to be careful that we don't write any invariants that depend on well-behaved clients. This also came up during initial discussions with kaloian.manassiev and we couldn't think of a great way around this. |
| Comment by Ian Boros [ 28/Nov/18 ] |
|
Oh, James Wahlin (not tagging) just made a really insightful point: Assuming we take the above approach, there's a possibility that while the $mergeCursors is blocking (with the transaction resources stashed), some other operation could start running under the same transaction before the getMore has a chance to check out the session (this might require a poorly-behaved driver). I imagine this could break a lot of things. |
| Comment by Ian Boros [ 28/Nov/18 ] |
|
After looking into this ticket, having an operation yield its session while it blocks won't be enough to make this work. After an operation running $mergeCursors yields its session and blocks, the operation running getMore (on the same node) will check the session out, but then realize that the transaction is already in progress, though no resources have been stashed (they're still in use by the aggregation running $mergeCursors). This causes it to abort the transaction. To get around this, I believe $mergeCursors will also have to stash its transaction resources while it blocks on the remote requests, and unstash them when it resumes. This way, the getMore will find (and unstash) the resources once it checks out the session. kaloian.manassiev Does this seem reasonable? |
| Comment by Charlie Swanson [ 28/Sep/18 ] |
|
This impacts a wide variety of aggregations that can be run. Anything involving a $lookup or $graphLookup, anything which uses allowDiskUse: true in combination with a $sort or $group, and possibly some other cases. I can't give you an estimate on what percentage of aggregations this would be, but my guess would be at least 10%. These cases aren't that outlandish. Once again, the impact here is that these aggregations would not be able to be run within a transaction - I have no idea how common that would be. It seems like a strange, surprising, and undesirable restriction to have to me, so I certainly think it should be at least 4.1 Required - I can't say which epic it would be in. |
| Comment by Kaloian Manassiev [ 28/Sep/18 ] |
|
Yes, that's what I was looking for, thanks Jack. charlie.swanson, do you mind summarizing what are the customer-visible limitations if we were to leave shard merging disabled in 4.2 when you get a chance? I want to make sure we don't forget to do that as part of the final sharding support Epic or if we think it is not that important we should make it 4.1 Desired. |
| Comment by Jack Mulrow [ 28/Sep/18 ] |
|
kaloian.manassiev, misha.tyulenev added this uassert in ClusterAggregate in |
| Comment by Kaloian Manassiev [ 27/Sep/18 ] |
|
Thanks for the detailed explanation charlie.swanson and sorry for the late reply. Currently, sharded $lookup with transactions is not in the scope of PM-834, but it looks like other aggregation scenarios, which involve $mergeCursors will not work either. I have no idea what is the importance of these scenarios from usability point of view - do you mind explaining that to me? I am trying to decide whether this ticket should remain in the current Epic, move to the final transactions support Epic (PM-564) or we can just make it 4.1 Desired and we get to it, if we get to it. jack.mulrow@mongodb.com, I remember in 4.0 we did something to disable these scenarios, but I couldn't find the associated tickets or commits. Can you remind me what did we do to disable them? |
| Comment by Charlie Swanson [ 23/Aug/18 ] |
|
kaloian.manassiev we have not officially designed or written down an approach for anything. This ticket is not necessarily tied to the sharded $lookup epic - since this is a problem today. The sharded $lookup project will introduce another similar scenario to this one - but I'll get back to that later. The problem for this ticket is more of a self-deadlock within one shard on the state associated with the transaction. It does not necessarily involve a $lookup. If you run the aggregation [\{$group: {_id: "anything"}}], with {allowDiskUse: true}, it will perform a partial group on each shard, then the "group of groups" on a designated merging shard. That merging shard can now have two simultaneous operations going on in the same transaction. First, the operation performing the merge to compute the "group of groups"; This operation will be merging multiple cursors from every shard with chunks for the collection, one of which may be itself. Second, the operation computing the sub-group from this shard. While the first operation is running, it will at some point schedule a getMore on a cursor which the same shard owns. I don't recall all the details, but I think the Session object has to be "checked out" for a transaction, and could only be checked out by one operation. The first operation would check it out, schedule the getMore, then the getMore would be blocked waiting for it to be checked back in - which it never could be until the getMore returned. Today (in a world without sharded $lookup), we could pursue a fix for this ticket which would avoid sending network requests to ourselves to perform that merge. This would essentially avoid going back through the top of command processing which will attempt to check out the transaction state. It will take a refactor to the AsyncResultsMerger to do so, since the AsyncResultsMerger only knows how to merge cursors which have IDs and can be iterated with getMore commands - but it seems possible. However, when we start to think about sharded $lookup - we will be able to get into a scenario where two operations are happening simultaneously on one shard as part of the same transaction - and there will be no real way to avoid it. It's a little complicated to set up, but you could end up with a pipeline where one shard can be doing a $lookup which goes to a remote shard, which then has to do a $lookup again and goes back to the original shard. In order to support this scenario, the fix would likely have to involve having the ability to have multiple operations use the same Session object at once. |
| Comment by Kaloian Manassiev [ 23/Aug/18 ] |
|
Now that the sharded transactions work is coming along, I would like to get back to this ticket and to see how we can meaningfully get $lookup to work with sharded transactions. Have you guys considered this as part of the Sharded $lookup project and what it would take to fix it? Is it just the local shard which is a problem or also there was a case where a request from a different shard could come back? To: charlie.swanson, james.wahlin |
| Comment by Charlie Swanson [ 23/Mar/18 ] |
|
misha.tyulenev we'd like to have a discussion with the query team about how to fix this, but depending on the approach any engineer could work on this. |
| Comment by Misha Tyulenev [ 07/Mar/18 ] |
|
Thanks Charlie, thats correct. |
| Comment by David Storch [ 07/Mar/18 ] |
|
Yes, helpful, thanks for the clarification. I'll retitle the ticket to help clarify this. |
| Comment by Charlie Swanson [ 07/Mar/18 ] |
|
I think these are separate work items, but |
| Comment by David Storch [ 07/Mar/18 ] |
|
misha.tyulenev charlie.swanson, we may have to coordinate on this. It's not clear to me how this is different from |