[SERVER-33527] Plan with complex filter can hold lock for an extended time without yielding Created: 27/Feb/18 Updated: 06/Dec/22 Resolved: 23/Mar/18 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Querying |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Bruce Lucas (Inactive) | Assignee: | Backlog - Query Team (Inactive) |
| Resolution: | Won't Fix | Votes: | 1 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Attachments: |
|
||||||||||||
| Issue Links: |
|
||||||||||||
| Assigned Teams: |
Query
|
||||||||||||
| Operating System: | ALL | ||||||||||||
| Participants: | |||||||||||||
| Case: | (copied to CRM) | ||||||||||||
| Description |
|
The query execution subsystem only encounters yield points in between calls to PlanStage::work(). However, match expression evaluation typically happens within a single unit of work. For instance, the FETCH stage will do the following when work() is called:
Match expression evaluation for a single document is usually quite fast. However, if the filter is extremely large or complex, match expression evaluation could take significant time. Since match expression evaluation does not yield, this could cause intent locks to be held for an extended period. Here is an example repro case using a very large $or:
Query run times appear to be quadratic:
A MODE_IX lock is held the entire time, which can block all queries if a command requiring an exclusive lock like dbStats or collStats is is executed. |
| Comments |
| Comment by Charlie Swanson [ 23/Mar/18 ] | ||||||||||||||||||||||||||||||
|
See Dave's comments above. We believe the real problem is separate, and we don't plan to add support for yielding during evaluation of a MatchExpression on a single document. | ||||||||||||||||||||||||||||||
| Comment by David Storch [ 20/Mar/18 ] | ||||||||||||||||||||||||||||||
|
I diagnosed the issue related to non-linear runtime growth for indexed $or queries. See related ticket | ||||||||||||||||||||||||||||||
| Comment by David Storch [ 20/Mar/18 ] | ||||||||||||||||||||||||||||||
|
There are two distinct problems here. The first is that the query execution system only encounters yield points in between calls to PlanStage::work(), and MatchExpression evaluation always happens within a single work() call. Therefore, any filter which is expensive can result in the lock being held for the duration of the filter's evaluation. This is rarely a problem in practice, since most filters are quite cheap. However, giant $or's or other MatchExpressions for which the query text itself is quite large can take some time to execute. The second problem is that we have identified super-linear growth in execution time for a large indexed $or. As asya wisely points out, a COLLSCAN plan with an $or filter displays linear growth in execution time as a function of the number of $or clauses. I think we can be quite confident that $or execution is O(n) in the number of clauses. On the other hand, an indexed $or plan appears to have non-linear growth. Like Asya, I reproduced this using a single document, so that the number of $or clauses grows while the number of documents being evaluated remains constant. I used this script:
See attached file IndexedOrNonLinearGrowth.pdf for a plot of the output. I will clarify this ticket to be specifically about the fact that MatchExpression evaluation does not yield. In addition, I'm going to investigate the cause of the non-linear growth pattern for an indexed $or, and will file a separate ticket describing that issue. | ||||||||||||||||||||||||||||||
| Comment by Bruce Lucas (Inactive) [ 09/Mar/18 ] | ||||||||||||||||||||||||||||||
|
It holds a write intent (w) lock, which won't block reads and writes, until some operation comes along that needs a strong lock (R or W). That will be blocked by the non-yielding intent lock, and it will in turn block other reads and writes, for the duration of the non-yielding $or query. | ||||||||||||||||||||||||||||||
| Comment by Asya Kamsky [ 09/Mar/18 ] | ||||||||||||||||||||||||||||||
|
As it turns out the issue seems to be the "consumption" of the $or query in the matcher somewhere, not the actual deletes (which is why there are no yields as the actual delete hasn't started yet). I reran the test using a collection with zero documents and collection with one document (which guarantees that the number of documents to actually examine is a non-issue), I also varied the test making it IXSCAN on _id or COLLSCAN on unindexed field:
These numbers are representative of all the runs for $or clause. They seem pretty linear to me for unindexed field, and it makes sense that having number of clauses to evaluate go up will cause the time to execute the query to go up (linearly). However for indexed field it does seem to grow quadratically - but isn't that expected because we have that many subquery plans to evaluate? Note that while "top" will show that write operation was running for however many milliseconds/seconds I see no evidence that it's holding a write lock of any sort as I was able to read and write from the collection while the long query was executing. | ||||||||||||||||||||||||||||||
| Comment by Bruce Lucas (Inactive) [ 09/Mar/18 ] | ||||||||||||||||||||||||||||||
|
Took a few stack traces and the time all seems to be being spent in PlanStage::saveState and PlanStage::restoreState. | ||||||||||||||||||||||||||||||
| Comment by Bruce Lucas (Inactive) [ 09/Mar/18 ] | ||||||||||||||||||||||||||||||
|
By "number of documents considered" do you mean the number of documents in the collection? In this repro at least that doesn't seem to matter - in the repro above if setup() inserts 100k documents and remove removes 10k the timing is the same as if setup() had only inserted 10k documents. | ||||||||||||||||||||||||||||||
| Comment by Andy Schwerin [ 09/Mar/18 ] | ||||||||||||||||||||||||||||||
|
The reason the query runtime in the sample looks to be quadratic is because we're increasing both the number of documents considered and the number of terms in the $or in proportion, right bruce.lucas? I'm pretty sure that the or evaluation is linear in the number of terms, and query evaluation is definitely linear in number of documents, so when both are n we get n^2 runtime. The bigger question I think is, are we yielding at the right frequency or not? I think for updates we stop yielding after the first matched document gets updated. Do we do that for multi-deletes, too? | ||||||||||||||||||||||||||||||
| Comment by Bruce Lucas (Inactive) [ 28/Feb/18 ] | ||||||||||||||||||||||||||||||
|
The particular query used in the original repro can be rewritten using $in, and that turns out to be fast. However that's only possible in special cases; for example this query is slow and holds a lock for an extended time without yielding but can't be rewritten using $in:
|