[SERVER-12438] batch size with an unindexed sort in the new query system is inconsistent with the old behavior
Created: 22/Jan/14  Updated: 11/Jul/16  Resolved: 29/Jan/14

Status: Closed
Project: Core Server
Component/s: Querying
Affects Version/s: 2.5.4
Fix Version/s: 2.5.5

Type: Bug
Priority: Major - P3
Reporter: David Storch
Assignee: David Storch
Resolution: Done
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Duplicate
is duplicated by SERVER-14228 Setting batchSize and sort on a curso... Closed
Related
related to SERVER-13316 sorts with multiple batches with smal... Closed
is related to SERVER-17011 Cursor can return objects out of orde... Closed
Operating System: ALL
Participants:
Linked BF Score: 0

Description

Old behavior (2.4.x)

We have the following collection of 4 documents and no indices:

> db.t.find()
{ "_id" : 1, "a" : 1 }
{ "_id" : 2, "a" : 2 }
{ "_id" : 3, "a" : 3 }
{ "_id" : 4, "a" : 4 }

If we set a batch size and request an unindexed sort, the server returns just a single batch. This lets the server perform a top-k sort, where k is the batch size, instead of sorting and buffering the full result set.

> db.t.find().sort({a: 1}).batchSize(2)
{ "_id" : 1, "a" : 1 }
{ "_id" : 2, "a" : 2 }
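
To illustrate why a single batch permits a top-k sort, here is a minimal sketch in plain JavaScript. It is not server code: `topK` and `sampleDocs` are hypothetical names introduced for illustration, and the sketch assumes k equals the batch size and that the sort key is numeric and ascending.

```javascript
// Hypothetical helper, not MongoDB server code.
// Keeps at most k documents, ordered ascending by `key`, while scanning the input once.
function topK(docs, key, k) {
  const kept = [];
  for (const doc of docs) {
    // Walk back from the end to find where doc belongs in the ordered buffer.
    let i = kept.length;
    while (i > 0 && kept[i - 1][key] > doc[key]) i--;
    if (i < k) {
      kept.splice(i, 0, doc);
      // Drop the largest element if the buffer now exceeds k.
      if (kept.length > k) kept.pop();
    }
  }
  return kept;
}

// The four documents from the collection above, in arbitrary scan order.
const sampleDocs = [
  { _id: 3, a: 3 },
  { _id: 1, a: 1 },
  { _id: 4, a: 4 },
  { _id: 2, a: 2 },
];

// With k = 2 (the batch size), only the two smallest documents survive.
const firstBatchSketch = topK(sampleDocs, "a", 2);
console.log(JSON.stringify(firstBatchSketch));
```

The buffer never holds more than k documents, which is the memory saving a top-k sort offers over a full sort. The trade-off is that documents beyond the first k are discarded, so no further batches can be produced.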

If we set a batch size and there is an index that provides the sort, then the server returns as many batches as are required to fully answer the query:

> db.t.ensureIndex({a: 1})
> db.t.find().sort({a: 1}).batchSize(2)
{ "_id" : 1, "a" : 1 }
{ "_id" : 2, "a" : 2 }
{ "_id" : 3, "a" : 3 }
{ "_id" : 4, "a" : 4 }

If an index is available which can provide the sort, then the server will always select the plan with the indexed sort. This is key to the old behavior: even if there is a plan with a blocking sort stage that is more efficient, the plan with the indexed sort is preferred.
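
The selection rule described above can be sketched as follows. This is only an illustration of the rule as stated in this ticket, not the 2.4.x planner: the function name `selectPlanOldBehavior` and the plan fields `indexProvidesSort` and `cost` are assumptions made up for the example.

```javascript
// Hypothetical model of the old (2.4.x) preference, not actual planner code.
// Each plan is a plain object: { name, indexProvidesSort, cost }.
function selectPlanOldBehavior(plans) {
  // Plans whose index already yields documents in the requested order.
  const indexedSortPlans = plans.filter((p) => p.indexProvidesSort);
  // If any such plan exists, only those are considered, regardless of cost.
  const pool = indexedSortPlans.length > 0 ? indexedSortPlans : plans;
  // Within the remaining pool, take the cheapest plan.
  return pool.reduce((best, p) => (p.cost < best.cost ? p : best));
}

const candidatePlans = [
  { name: "collection scan + blocking sort", indexProvidesSort: false, cost: 1 },
  { name: "index scan on {a: 1}", indexProvidesSort: true, cost: 5 },
];

// The indexed-sort plan wins even though the blocking-sort plan is cheaper.
const chosenPlan = selectPlanOldBehavior(candidatePlans);
console.log(chosenPlan.name);
```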

New behavior (e.g. 2.5.4)

In the case that

  1. the batch size is set,
  2. a sort is requested, and
  3. there is an index that provides the sort,

the server may or may not select a query plan with a blocking sort. As a consequence, there are some cases in which we expect to get all results from a query back, but instead a plan with a blocking sort is selected and we end up with only one batch.
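
The observable difference between the two outcomes can be modelled with a small sketch. This is a toy model of the behavior described in this ticket, not the server's logic: `batchesReturned` and its `blockingSort` flag are hypothetical, with the flag standing in for whichever plan the server happened to select.

```javascript
// Toy model, not MongoDB server code.
// Returns the list of batches a client would receive for a sorted query.
function batchesReturned(docs, sortKey, batchSize, blockingSort) {
  const sorted = [...docs].sort((x, y) => x[sortKey] - y[sortKey]);
  if (blockingSort) {
    // Blocking (top-k) sort: only the first batchSize documents are produced.
    return [sorted.slice(0, batchSize)];
  }
  // Indexed sort: batches keep coming until the result set is exhausted.
  const batches = [];
  for (let i = 0; i < sorted.length; i += batchSize) {
    batches.push(sorted.slice(i, i + batchSize));
  }
  return batches;
}

const collT = [
  { _id: 1, a: 1 },
  { _id: 2, a: 2 },
  { _id: 3, a: 3 },
  { _id: 4, a: 4 },
];

const viaIndexedSort = batchesReturned(collT, "a", 2, false);
const viaBlockingSort = batchesReturned(collT, "a", 2, true);

// Same query and same index, but the result count depends on the chosen plan.
console.log(viaIndexedSort.flat().length, viaBlockingSort.flat().length); // prints "4 2"
```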



Comments
Comment by Githook User [ 20/Mar/14 ]

Author: David Storch (dstorch) <david.storch@10gen.com>

Message: SERVER-12438 better handling of batchSize and limit with sort

(cherry picked from commit b9167f0fe82160967e591aefba5824a8a372353d)
Branch: v2.6
https://github.com/mongodb/mongo/commit/d3b45e4327368e0a44634b9c5288d729e706975e

Comment by Githook User [ 20/Mar/14 ]

Author: David Storch (dstorch) <david.storch@10gen.com>

Message: SERVER-12438 better handling of batchSize and limit with sort
Branch: master
https://github.com/mongodb/mongo/commit/b9167f0fe82160967e591aefba5824a8a372353d

Comment by Githook User [ 29/Jan/14 ]

Author: David Storch (dstorch) <david.storch@10gen.com>

Message: SERVER-12438 avoid unindexed sort if batch size is set
Branch: master
https://github.com/mongodb/mongo/commit/f2ece186b189b2b9d09636c708f768fb3822b511

Generated at Thu Feb 08 03:28:32 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.