[SERVER-14585] Cursor explain strange output : QueryOptimizerCursor Created: 16/Jul/14 Updated: 10/Dec/14 Resolved: 17/Jul/14 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Sylvain Coulombel | Assignee: | David Storch |
| Resolution: | Done | Votes: | 0 |
| Labels: | cursor, explain, limit, queryoptimisercursor | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Environment: |
Windows 7 |
||
| Operating System: | Windows |
| Participants: |
| Description |
|
I created an index on a field
Then I run an explain on a specific query and force it to use my previously created index :
in e for the "cursor" e['cursor'] field I get QueryOptimizerCursor. I did not manage to find any documentation on that QueryOptimizerCursor. cursor here http://docs.mongodb.org/manual/reference/method/cursor.explain/#cursor.explain. If I remove the limit field I get :
That's what I expect and not the query optimizer cursor. Also I repeat the same experience with the limit field but with mongoDB 2.4
That what I expect too I am wondering if it is a bug or if something is missing in the doc. Regards, Sylvain |
| Comments |
| Comment by Sylvain Coulombel [ 18/Jul/14 ] |
|
Hi, Thanks a lot for your quick feedback and detailed explanations ! Best, |
| Comment by David Storch [ 17/Jul/14 ] |
|
Hi scoulombel, Explain reports "QueryOptimizerCursor" for the cursor field for any query plan which uses an OR stage. You are seeing "QueryOptimizerCursor" for your query because the solution generated by the 2.6 planner includes an OR. So the question becomes, why is the query plan doing an OR if there is no $or specified in the original query? The answer is that this is a consequence of a historical limitation of the MongoDB wire protocol---namely, that there are not separate fields for limit and batchSize. Instead of passing to the server both a limit and a batchSize, the drivers just pass a single value called "ntoreturn" (you may have seen this value reported in mongod's slow query log lines). For queries without a .sort(), the server doesn't much care whether ntoreturn is really a limit or a batchSize. When there is both a .sort() and an ntoreturn, however, things get a lot trickier. If the ntoreturn is a limit, then the server should do a topK sort. For instance, suppose that find(<blah blah blah>).sort({a: 1}) would cause a large amount of data to be sorted in memory (potentially hitting the server's memory limit for sorts). If you only want the first 10 of the sorted results, you should be able to type find(<blah blah blah>).sort({a: 1}).limit(10) and the server should be able to answer this query without hitting the memory limit. It does so with a topK sort, only buffering up to 10 documents in memory at a time. If the ntoreturn is a batchSize, however, mongod will need to return all the results in sorted order. This requires a full in-memory sort (rather than a topK sort). Here's the punchline. The server gets around the ambiguity of whether or not to do a topK sort by generating an OR plan which first does a topK, and then switches over to a full in-memory sort if more results are requested by the client. This behavior is new in 2.6 (added to fix several bugs surrounding sort, limit, and batchSize), which is why you never would have seen "QueryOptimizerCursor" for a query like this in 2.4. Given that this is expected behavior, I'm going to close this ticket as "Works as Designed", but feel free to reach out with any further questions or concerns. Best, |
| Comment by Sylvain Coulombel [ 16/Jul/14 ] |
|
Hi, |
| Comment by Bernie Hackett [ 16/Jul/14 ] |
|
Hi, I've moved this ticket to the core server project since the question is unrelated to PyMongo. This may be related to the query engine refactor in MongoDB 2.6. |