[SERVER-4665] Querying against sparse index creates bad state, affecting later queries Created: 12/Jan/12  Updated: 11/Jul/16  Resolved: 17/Jan/12

Status: Closed
Project: Core Server
Component/s: Index Maintenance, Querying
Affects Version/s: 2.0.1
Fix Version/s: 2.0.3, 2.1.0

Type: Bug Priority: Major - P3
Reporter: Paul Canavese Assignee: Aaron Staple
Resolution: Done Votes: 2
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Shell and DB running under OSX.


Attachments: Text File mongo_bug.txt    
Operating System: OS X
Participants:

 Description   

I have run into a case where querying against a sparse index on a nested field puts the DB in a bad state that affects later queries on the same collection. I originally discovered this problem using the ruby driver, but was able to reproduce the problem completely from the mongo console (see below). I inserted comments, prefixed with #. The mongo_bug db did not exist prior to this example.

> use mongo_bug
switched to db mongo_bug
> db.bug_collection.ensureIndex(

{"nested.key2": 1}

,

{sparse: true}

)
> db.bug_collection.insert(

{key: 'a'}

)
> db.bug_collection.insert(

{key: 'b'}

)
> db.bug_collection.insert(

{key: 'c'}

)
> db.bug_collection.find({})

{ "_id" : ObjectId("4f0e7daf9be3426d0ba2e7b8"), "key" : "a" } { "_id" : ObjectId("4f0e7db59be3426d0ba2e7b9"), "key" : "b" } { "_id" : ObjectId("4f0e7dba9be3426d0ba2e7ba"), "key" : "c" }

> db.bug_collection.find({key: {$in: ['a', 'b', 'c']}, "nested.key2": {$ne: 'd'}})

  1. No results returned (expected, because of the sparse index?)
    > db.bug_collection.find({key: {$in: ['a', 'b', 'c']}})
  2. No results returned (NOT expected)
    > db.bug_collection.dropIndex("nested.key2_1") { "nIndexesWas" : 2, "ok" : 1 }

    > db.bug_collection.find({key: {$in: ['a', 'b', 'c']}})

    { "_id" : ObjectId("4f0e7daf9be3426d0ba2e7b8"), "key" : "a" } { "_id" : ObjectId("4f0e7db59be3426d0ba2e7b9"), "key" : "b" } { "_id" : ObjectId("4f0e7dba9be3426d0ba2e7ba"), "key" : "c" }
  3. Query now works, after the index has been dropped.

The order of the queries also matters:

> db.bug_collection.dropIndex("nested.key2_1")

{ "nIndexesWas" : 2, "ok" : 1 }

> db.bug_collection.ensureIndex(

{"nested.key2": 1}

,

{sparse: true}

)
> db.bug_collection.find({key: {$in: ['a', 'b', 'c']}})

{ "_id" : ObjectId("4f0e811f89fc95d0a1a0356b"), "key" : "a" } { "_id" : ObjectId("4f0e812389fc95d0a1a0356c"), "key" : "b" } { "_id" : ObjectId("4f0e812589fc95d0a1a0356d"), "key" : "c" }
  1. Results returned (expected)
    > db.bug_collection.find({key: {$in: ['a', 'b', 'c']}, "nested.key2": {$ne: 'd'}}) { "_id" : ObjectId("4f0e811f89fc95d0a1a0356b"), "key" : "a" } { "_id" : ObjectId("4f0e812389fc95d0a1a0356c"), "key" : "b" } { "_id" : ObjectId("4f0e812589fc95d0a1a0356d"), "key" : "c" }
  2. Results returned (this same query returns nothing in first example)
    > db.bug_collection.dropIndex("nested.key2_1") { "nIndexesWas" : 2, "ok" : 1 }

    > db.bug_collection.ensureIndex(

    {"nested.key2": 1}

    ,

    {sparse: true}

    )
    > db.bug_collection.find({key: {$in: ['a', 'b', 'c']}, "nested.key2": {$ne: 'd'}})

  3. No results returned (as in first example)
    > db.bug_collection.find({key: {$in: ['a', 'b', 'c']}})
  4. No results returned (as in first example)

It's as if the queries are considered equivalent and being cached.



 Comments   
Comment by auto [ 17/Jan/12 ]

Author:

{u'login': u'astaple', u'name': u'Aaron', u'email': u'aaron@10gen.com'}

Message: SERVER-4665 backport 'ConstraintPresent' constraint type and exclusion of unhelpful recorded plans
Branch: v2.0
https://github.com/mongodb/mongo/commit/0b38e1009b7e43b84fd791cfc58bf83fb3c6d0eb

Comment by auto [ 16/Jan/12 ]

Author:

{u'login': u'astaple', u'name': u'Aaron', u'email': u'aaron@10gen.com'}

Message: SERVER-4665 avoid using an unhelpful cached plan
Branch: master
https://github.com/mongodb/mongo/commit/4992d8a270d44301c74fb2facbf1468406a77fa4

Comment by auto [ 16/Jan/12 ]

Author:

{u'login': u'astaple', u'name': u'Aaron', u'email': u'aaron@10gen.com'}

Message: SERVER-4665 simplify QueryPattern model by separating Empty from UpperAndLowerBound constraint types
Branch: master
https://github.com/mongodb/mongo/commit/870fd44d145621ed5d6c05bd0311c2b277a9ee37

Comment by auto [ 16/Jan/12 ]

Author:

{u'login': u'astaple', u'name': u'Aaron', u'email': u'aaron@10gen.com'}

Message: SERVER-4665 account for $ne bounds in QueryPattern
Branch: master
https://github.com/mongodb/mongo/commit/f08b9980df7667642af651a02a3bffd748b82072

Comment by Aaron Staple [ 12/Jan/12 ]

Hi Paul,

I'll try to give a brief overview of what's happening here. Mongo uses a sampling based mechanism to determine which indexes seem effective for resolving particular queries. But we don't use sampling all the time - we also keep a record of indexes that seem to perform well for particular queries and do query pattern matching to attempt to retry those performant indexes for subsequent queries that are similar, but not identical, to earlier queries.

In this case the $ne part of your query is being ignored for pattern matching, so the index that worked for your first query is being reused for your second query. And since that index is sparse, using it can alter the result of the query.

We should probably include your $ne field in the pattern matching model, but will need to figure out exactly how to do that. When we decide how to do that, I'll link to the corresponding ticket.

Comment by Paul Canavese [ 12/Jan/12 ]

I also should note that everything works fine if the index is not sparse.

Comment by Paul Canavese [ 12/Jan/12 ]

The formatting didn't come out well. Uploading the commands/commentary as an attachment.

Generated at Thu Feb 08 03:06:38 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.