[SERVER-8067] incorrect use of sparse index to sort query results Created: 03/Jan/13  Updated: 28/Oct/15  Resolved: 27/Nov/13

Status: Closed
Project: Core Server
Component/s: Querying
Affects Version/s: 2.2.2
Fix Version/s: 2.5.5

Type: Bug Priority: Major - P3
Reporter: Matt Hurne Assignee: Benety Goh
Resolution: Done Votes: 0
Labels: query_triage
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: File server8067.js    
Issue Links:
Duplicate
is duplicated by SERVER-13549 MongoDB 2.6 doesn't automatically use... Closed
Related
related to SERVER-11568 query optimizer fails to use index wh... Closed
Backwards Compatibility: Major Change
Operating System: ALL
Participants:

 Description   

In some cases, sorting a query results in the use of a sparse index, which in turn results in a lesser number of documents being returned than would normally be returned by the query. This strikes me as incorrect.

Example:

Suppose I have a collection "foo" that contains 1000 documents. 750 of those documents include a value for the field "bar", while the other 250 do not. Suppose I have a sparse index "bar_1" on the field "bar". The index will contain the 750 documents that have a value for the field.

The following query results in 1000 documents:
db.foo.find()

The following query results in only 750 documents, despite the identical query parameters:
db.foo.find().sort(

{ bar: 1 }

)

I understand why this behaves the way it does, but in my opinion the behavior is incorrect.



 Comments   
Comment by Githook User [ 27/Nov/13 ]

Author:

{u'username': u'benety', u'name': u'Benety Goh', u'email': u'benety@mongodb.com'}

Message: SERVER-8067 do not use sparse index to provide sort order
Branch: master
https://github.com/mongodb/mongo/commit/730c2513f4f0af9068ce7ba0852cb63db56d2c99

Comment by Benety Goh [ 26/Nov/13 ]

Documentation at this URL should be updated when this ticket is resolved:

http://docs.mongodb.org/manual/core/index-sparse/#sparse-index-on-a-collection-can-results-in-incomplete-results

Comment by Matt Hurne [ 03/Jan/13 ]

Related to this is the fact that count() provides an incorrect value when used on queries where the number of documents returned is affected by the sort's use of a sparse index. Extending the example above, we see:

db.foo.find().count()
1000

db.foo.find().sort(

{ bar: 1 }

).count()
1000

The latter shouldn't be allowed at all, but if it is, the count() should be 750.

Comment by Matt Hurne [ 03/Jan/13 ]

One solution would be to refuse to execute a query where the sort's use of a spare index would result in a different number of results than the unsorted query. Certainly there are cases where the query itself might use the sparse index in a way that would allow the sort to occur without affecting the number of results; such queries should be allowed.

Generated at Thu Feb 08 03:16:27 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.