[SERVER-14585] Cursor explain strange output : QueryOptimizerCursor Created: 16/Jul/14  Updated: 10/Dec/14  Resolved: 17/Jul/14

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Sylvain Coulombel Assignee: David Storch
Resolution: Done Votes: 0
Labels: cursor, explain, limit, queryoptimisercursor
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Windows 7


Operating System: Windows
Participants:

 Description   

I created an index on a field

db.bookingCollection.create_index([("IDs.code", pymongo.ASCENDING),("dates.begin", pymongo.ASCENDING)])

Then I run an explain on a specific query and force it to use my previously created index :

cursor = db.bookingCollection.find({"IDs.code": "XXYY00", "dates.begin": date}).hint([("IDs.code", pymongo.ASCENDING),("dates.begin", pymongo.ASCENDING)]).sort([("customers.last_name", pymongo.ASCENDING)]).limit(100)
e = cursor.explain()

in e for the "cursor" e['cursor'] field I get QueryOptimizerCursor.

I did not manage to find any documentation on that QueryOptimizerCursor. cursor here http://docs.mongodb.org/manual/reference/method/cursor.explain/#cursor.explain.

If I remove the limit field I get :

BtreeCursor proptyIDs.propty_code_1_room_stay.dates.begin_1

That's what I expect and not the query optimizer cursor.

Also I repeat the same experience with the limit field but with mongoDB 2.4
I got

BtreeCursor proptyIDs.propty_code_1_room_stay.dates.begin_1

That what I expect too

I am wondering if it is a bug or if something is missing in the doc.

Regards,

Sylvain



 Comments   
Comment by Sylvain Coulombel [ 18/Jul/14 ]

Hi,

Thanks a lot for your quick feedback and detailed explanations !

Best,
Sylvain

Comment by David Storch [ 17/Jul/14 ]

Hi scoulombel,

Explain reports "QueryOptimizerCursor" for the cursor field for any query plan which uses an OR stage. You are seeing "QueryOptimizerCursor" for your query because the solution generated by the 2.6 planner includes an OR.

So the question becomes, why is the query plan doing an OR if there is no $or specified in the original query? The answer is that this is a consequence of a historical limitation of the MongoDB wire protocol---namely, that there are not separate fields for limit and batchSize. Instead of passing to the server both a limit and a batchSize, the drivers just pass a single value called "ntoreturn" (you may have seen this value reported in mongod's slow query log lines). For queries without a .sort(), the server doesn't much care whether ntoreturn is really a limit or a batchSize.

When there is both a .sort() and an ntoreturn, however, things get a lot trickier. If the ntoreturn is a limit, then the server should do a topK sort. For instance, suppose that find(<blah blah blah>).sort({a: 1}) would cause a large amount of data to be sorted in memory (potentially hitting the server's memory limit for sorts). If you only want the first 10 of the sorted results, you should be able to type find(<blah blah blah>).sort({a: 1}).limit(10) and the server should be able to answer this query without hitting the memory limit. It does so with a topK sort, only buffering up to 10 documents in memory at a time. If the ntoreturn is a batchSize, however, mongod will need to return all the results in sorted order. This requires a full in-memory sort (rather than a topK sort).

Here's the punchline. The server gets around the ambiguity of whether or not to do a topK sort by generating an OR plan which first does a topK, and then switches over to a full in-memory sort if more results are requested by the client. This behavior is new in 2.6 (added to fix several bugs surrounding sort, limit, and batchSize), which is why you never would have seen "QueryOptimizerCursor" for a query like this in 2.4.

Given that this is expected behavior, I'm going to close this ticket as "Works as Designed", but feel free to reach out with any further questions or concerns.

Best,
Dave

Comment by Sylvain Coulombel [ 16/Jul/14 ]

Hi,
Yes you are right, thanks. I put the question in PyMongo because it is the driver I am using but it might probably not be related to this specific driver. I should try from the console to check if the results are the same.
Sylvain

Comment by Bernie Hackett [ 16/Jul/14 ]

Hi,

I've moved this ticket to the core server project since the question is unrelated to PyMongo. This may be related to the query engine refactor in MongoDB 2.6.

Generated at Thu Feb 08 03:35:20 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.