[SERVER-77134] Search queries hold storage tickets while waiting for response from network
Created: 15/May/23  Updated: 29/Oct/23  Resolved: 05/Sep/23

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 7.2.0-rc0, 7.0.2, 7.1.0-rc2

Type: Bug
Priority: Major - P3
Reporter: George Wangensteen
Assignee: Alyssa Clark
Resolution: Fixed
Votes: 0
Labels: greenerbuild, mongot-cross-team, query-product-scope-1, query-product-urgency-2, query-product-value-1
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
backports SERVER-70662 Concurrency Control Mechanism for Sea... Closed
Assigned Teams:
Query Integration
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v7.1, v7.0
Steps To Reproduce:

In comment

Sprint: Execution Team 2023-06-12, Execution NAMR Team 2023-07-24, Execution NAMR Team 2023-08-07
Participants:

 Description   

Aggregates with $search appear to hold their storage tickets while waiting for a response from the network. This can block other queries, or cause them to hang, until that response arrives. There might be some reason it is not possible to drop and re-acquire those tickets while waiting for the network, but after talking to charlie.swanson@mongodb.com, we couldn't think of one. It feels like a bug to hold a contended resource like this while waiting for the network, so I figured I'd file a ticket to make sure.
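
To illustrate the contention pattern described above, here is a minimal sketch in Python. It is not MongoDB code: the ticket pool is modeled as a plain semaphore, the network wait is a `time.sleep`, and the names `search_query`, `local_query_wait`, and `NETWORK_DELAY` are all hypothetical. It only shows why holding a ticket across a network wait delays other operations, and how dropping and re-acquiring it avoids that.

```python
import threading
import time

NETWORK_DELAY = 0.2  # simulated wait for a remote search response (assumed value)


def search_query(tickets, started, release_during_wait):
    """Hypothetical $search-like operation: takes a ticket, then waits on the network."""
    tickets.acquire()
    started.set()  # signal that this operation has taken the ticket
    if release_during_wait:
        tickets.release()          # yield the ticket before blocking on the network
        time.sleep(NETWORK_DELAY)  # simulated network round trip
        tickets.acquire()          # re-acquire before doing further local work
    else:
        time.sleep(NETWORK_DELAY)  # ticket stays held for the whole round trip
    tickets.release()


def local_query_wait(release_during_wait):
    """Return how long a plain local query waits for the only ticket."""
    tickets = threading.BoundedSemaphore(1)  # pool of one ticket to force contention
    started = threading.Event()
    worker = threading.Thread(
        target=search_query, args=(tickets, started, release_during_wait)
    )
    worker.start()
    started.wait()
    begin = time.monotonic()
    tickets.acquire()  # the local query needs a ticket to proceed
    waited = time.monotonic() - begin
    tickets.release()
    worker.join()
    return waited
```

Calling `local_query_wait(release_during_wait=False)` should return roughly the full simulated network delay, while `local_query_wait(release_during_wait=True)` should return almost immediately, because the ticket is free while the search operation is blocked on the network.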



 Comments   
Comment by Githook User [ 08/Sep/23 ]

Author: Alyssa Wagenmaker <alyssa.wagenmaker@mongodb.com> (awagenmaker)

Message: SERVER-77134 Release locks during search network requests
Branch: v7.0
https://github.com/mongodb/mongo/commit/7c3fa443ee36a68fe101bd54321c0aca1dd1e671

Comment by Githook User [ 07/Sep/23 ]

Author: Alyssa Wagenmaker <alyssa.wagenmaker@mongodb.com> (awagenmaker)

Message: SERVER-77134 Release locks during search network requests
Branch: v7.1
https://github.com/mongodb/mongo/commit/1712b00a0ce27f13f80263636e01b4bd86b82937

Comment by Githook User [ 05/Sep/23 ]

Author: Alyssa Wagenmaker <alyssa.wagenmaker@mongodb.com> (awagenmaker)

Message: SERVER-77134 Release locks during search network requests
Branch: master
https://github.com/mongodb/mongo/commit/1f02105e8f8fda9eecd59811a3b2c06d7d3bacab

Comment by Xiaochen Wu [ 10/Aug/23 ]

Spoke with joe.sack@mongodb.com: this is something to worry about but not actively on fire, so we should schedule it as a normal task. Definitely not a "drop everything and do it" ticket. charlie.swanson@mongodb.com arun.banala@mongodb.com brenda.rodriguez@mongodb.com

Comment by Matt Kneiser [ 16/May/23 ]

Ideally, network operations should yield resources.

Comment by George Wangensteen [ 16/May/23 ]

louis.williams@mongodb.com should we mark this as 7.0 required? 

Comment by Louis Williams [ 16/May/23 ]

george.wangensteen@mongodb.com the Execution Control system was not actually designed to avoid this problem, so I would say we shouldn't rely on it to do so. Because Execution Control is lowering the ticket limits below what we had before, I think we need to fix this for the release; otherwise, we risk unavailability in the system where we may not have had it before. I'm going to reassign to investigate on our end.

Comment by George Wangensteen [ 15/May/23 ]

Thanks louis.williams@mongodb.com! Do you think I should close this as a dupe or won't fix then? FWIW, even with Execution Control enabled, I sometimes saw my queries hang because there were no tickets available, although not consistently (that's how I first noticed this problem). But maybe I just wasn't waiting long enough for the dynamic mechanism to kick in, or there was some other weird edge case. If we always expect Execution Control to get around this issue, should I look into that further?

Comment by Louis Williams [ 15/May/23 ]

This was discussed to some extent in SERVER-70662. We probably need to consider a different concurrency control mechanism for $search that isn't the same as the one for WiredTiger. With Execution Control, the number of tickets is no longer fixed, and it should respond to ticket exhaustion by increasing the number of tickets, at least in this case.
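
As a rough illustration of the idea of a ticket count that is not fixed, here is a small Python sketch of a pool that raises its capacity when every ticket is in use. This is not the Execution Control algorithm, which this ticket does not describe; the class `ResizableTicketPool`, the method `grow_if_exhausted`, and the `step` and `ceiling` parameters are all hypothetical.

```python
import threading


class ResizableTicketPool:
    """Hypothetical ticket pool whose capacity can be raised at runtime."""

    def __init__(self, capacity):
        self._lock = threading.Lock()
        self._capacity = capacity
        self._in_use = 0

    def try_acquire(self):
        """Take a ticket without blocking; return False if none are free."""
        with self._lock:
            if self._in_use >= self._capacity:
                return False
            self._in_use += 1
            return True

    def release(self):
        """Return a previously acquired ticket to the pool."""
        with self._lock:
            self._in_use -= 1

    def grow_if_exhausted(self, step, ceiling):
        """If all tickets are in use, raise capacity by `step`, capped at `ceiling`.

        Returns the capacity after the call.
        """
        with self._lock:
            if self._in_use >= self._capacity:
                self._capacity = min(self._capacity + step, ceiling)
            return self._capacity
```

Note that growing the pool only helps if the extra tickets are not also consumed by operations blocked on the network, which is the concern raised elsewhere in this thread.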
