[SERVER-15214] $nin operator avoid query execution over a specific index Created: 11/Sep/14  Updated: 29/Jan/15  Resolved: 25/Sep/14

Status: Closed
Project: Core Server
Component/s: Index Maintenance, Querying
Affects Version/s: 2.6.3, 2.6.4
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: flavio alberti Assignee: Unassigned
Resolution: Duplicate Votes: 6
Labels: nin, performance, query
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
duplicates SERVER-12281 When choosing multikey index bounds, ... Backlog
Operating System: ALL
Steps To Reproduce:

create some documents

for (i = 0; i < 100000; i++) {
db.test26.insert(
    {"_id" : new ObjectId, 
        "user" : "mario", "removed" : false, "objId" : i, "created" : new Date(),
        "p" : { "id" : ["content"+i,"content"+(i+1),"content"+(i+2)]   }, 
        "a" : "attributea"+i,"b" : "attributeb"+i,"c" : "attributec"+i}   );
    }

create some indexes

db.test26.ensureIndex({ "user" : 1,"removed":1,"a" : 1 });
db.test26.ensureIndex({ "user" : 1,"removed":1,"b" : 1 });
db.test26.ensureIndex({ "user" : 1,"removed":1,"c" : 1 });
db.test26.ensureIndex({ "user" : 1,"removed":1,"p.id" : 1 });

Participants:

 Description   

summary

A query with $in and $nin operator does not work over the special index. This query works correctly in mongodb version 2.4.3

db.test26.find({"user":"mario","removed":false,"p.id":{"$nin":["content5"],"$in":["content1","content2","content3"]}
 }  
).explain()

the query should run over the index user_1_removed_1_p.id_1, but it works only if is not present the $nin operator.
This is the output of explain function

{
    "cursor" : "BtreeCursor user_1_removed_1_a_1",
    "isMultiKey" : false,
    "n" : 0,
    "nscannedObjects" : 100000,
    "nscanned" : 100000,
    "nscannedObjectsAllPlans" : 194167,
    "nscannedAllPlans" : 219855,
    "scanAndOrder" : false,
    "indexOnly" : false,
    "nYields" : 2234,
    "nChunkSkips" : 0,
    "millis" : 789,
    "indexBounds" : {
        "user" : [ 
            [ 
                "mario", 
                "mario"
            ]
        ],
        "removed" : [ 
            [ 
                false, 
                false
            ]
        ],
        "a" : [ 
            [ 
                {
                    "$minElement" : 1
                }, 
                {
                    "$maxElement" : 1
                }
            ]
        ]
    },
    "server" : "PZNVWINDEV-011:27017",
    "filterSet" : false,
    "stats" : {
        "type" : "KEEP_MUTATIONS",
        "works" : 101084,
        "yields" : 2234,
        "unyields" : 2234,
        "invalidates" : 0,
        "advanced" : 0,
        "needTime" : 100000,
        "needFetch" : 1083,
        "isEOF" : 1,
        "children" : [ 
            {
                "type" : "FETCH",
                "works" : 101084,
                "yields" : 2234,
                "unyields" : 2234,
                "invalidates" : 0,
                "advanced" : 0,
                "needTime" : 100000,
                "needFetch" : 1083,
                "isEOF" : 1,
                "alreadyHasObj" : 0,
                "forcedFetches" : 0,
                "matchTested" : 0,
                "children" : [ 
                    {
                        "type" : "IXSCAN",
                        "works" : 100000,
                        "yields" : 2234,
                        "unyields" : 2234,
                        "invalidates" : 0,
                        "advanced" : 100000,
                        "needTime" : 0,
                        "needFetch" : 0,
                        "isEOF" : 1,
                        "keyPattern" : "{ user: 1.0, removed: 1.0, a: 1.0 }",
                        "boundsVerbose" : "field #0['user']: [\"mario\", \"mario\"], field #1['removed']: [false, false], field #2['a']: [MinKey, MaxKey]",
                        "isMultiKey" : 0,
                        "yieldMovedCursor" : 0,
                        "dupsTested" : 0,
                        "dupsDropped" : 0,
                        "seenInvalidated" : 0,
                        "matchTested" : 0,
                        "keysExamined" : 100000,
                        "children" : []
                    }
                ]
            }
        ]
    }
}



 Comments   
Comment by Ramon Fernandez Marina [ 25/Sep/14 ]

flavio@alicubi.net, what you're seeing is expected behavior: the last index is multikey, which means that index bounds can't be intersected for the "p.id" field. This means that the bounds for "p.id" must either be ["content1", "content1"], ["content2", "content2"], ["content3", "content3"] or [MinKey, "content5"), ("content5", MaxKey]. As of right now, 2.6 is choosing the latter, less efficient bounds, so the query planner chooses a different, more efficient plan for the query.

That being said, there may be room for improvement as described in SERVER-12281, so feel free to follow it for updates on this area.

Regards,
Ramón.

Comment by flavio alberti [ 11/Sep/14 ]

To be clear, the $nin operator avoid query optimization if applied over an array field

{
    "_id" : ObjectId("541161567e097992f5461f23"),
    "user" : "mario",
    "removed" : false,
    "p" : {
        "id" : [ 
            "content0", 
            "content1", 
            "content2"
        ]
    },
...
}

Generated at Thu Feb 08 03:37:20 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.