[SERVER-14740] Query rewrite of special $or leaf case to rooted $or not working for nested expressions Created: 31/Jul/14  Updated: 10/Dec/14  Resolved: 07/Aug/14

Status: Closed
Project: Core Server
Component/s: Index Maintenance, Querying
Affects Version/s: 2.4.2
Fix Version/s: None

Type: Bug Priority: Trivial - P5
Reporter: flavio alberti Assignee: Unassigned
Resolution: Duplicate Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
duplicates SERVER-14130 Query planner could consider addition... Closed
Related
is related to SERVER-13732 Predicates in top-level implicit AND ... Closed
is related to SERVER-12594 subtrees of and/or can reduce to ixscans Backlog
Operating System: Linux
Steps To Reproduce:

1) create a testindex collection
2) insert some documents

db.testindex.insert([
{   
    "user" : "mario",
    "removed" : false,
    "objId" : "299939",
    "p" : {
        "id" : "content1"
    },
    "a" : "attributea",
    "b" : "attributeb",
    "c" : "attributec"
},
{
    "user" : "mario",
    "removed" : false,
    "objId" : "299938",
    "p" : {
        "id" : "content2"
    },
    "a" : "attributea",
    "b" : "attributeb",
    "c" : "attributec"
},
{
    "user" : "mario",
    "removed" : false,
    "objId" : "1129229",
    "p" : {
        "id" : "content3"
    },
    "a" : "attributea",
    "b" : "attributeb",
    "c" : "attributec"
},
{
    "user" : "mario",
    "removed" : false,
    "objId" : "6600954",
    "p" : {
        "id" : "content4"
    },
    "a" : "attributea",
    "b" : "attributeb",
    "c" : "attributec"
},
{
    "user" : "mario",
    "removed" : false,
    "objId" : "8889998",
    "p" : {
        "id" : "content5"
    },
    "a" : "attributea",
    "b" : "attributeb",
    "c" : "attributec"
}]
)

3) create some indexes

//generic ingex on collection
db.testindex.ensureIndex({ "user" : 1,"removed":1,"_id" : 1 }); 
// index over the objId
db.testindex.ensureIndex({ "user" : 1,"removed":1,"objId" : 1 });
//index over p.id
db.testindex.ensureIndex({ "user" : 1,"removed":1,"p.id" : 1 });

4) optimized query

db.testindex.find({"user":"mario","removed":false, "$or" : [ { "objId" : "299939"} , { "p.id" : "543534"}]}) // the query use the two indexes user_1_removed_1_objId_1 and user_1_removed_1_p.id_1
 
// explain result
{
    "clauses" : [ 
        {
            "cursor" : "BtreeCursor user_1_removed_1_objId_1",
            "isMultiKey" : false,
            "n" : 1,
            "nscannedObjects" : 1,
            "nscanned" : 1,
            "nscannedObjectsAllPlans" : 1,
            "nscannedAllPlans" : 1,
            "scanAndOrder" : false,
            "indexOnly" : false,
            "nYields" : 0,
            "nChunkSkips" : 0,
            "millis" : 0,
            "indexBounds" : {
                "user" : [ 
                    [ 
                        "mario", 
                        "mario"
                    ]
                ],
                "removed" : [ 
                    [ 
                        false, 
                        false
                    ]
                ],
                "objId" : [ 
                    [ 
                        "299939", 
                        "299939"
                    ]
                ]
            }
        }, 
        {
            "cursor" : "BtreeCursor user_1_removed_1_p.id_1",
            "isMultiKey" : false,
            "n" : 0,
            "nscannedObjects" : 0,
            "nscanned" : 0,
            "nscannedObjectsAllPlans" : 0,
            "nscannedAllPlans" : 0,
            "scanAndOrder" : false,
            "indexOnly" : false,
            "nYields" : 0,
            "nChunkSkips" : 0,
            "millis" : 0,
            "indexBounds" : {
                "user" : [ 
                    [ 
                        "mario", 
                        "mario"
                    ]
                ],
                "removed" : [ 
                    [ 
                        false, 
                        false
                    ]
                ],
                "p.id" : [ 
                    [ 
                        "543534", 
                        "543534"
                    ]
                ]
            }
        }
    ],
    "n" : 1,
    "nscannedObjects" : 1,
    "nscanned" : 1,
    "nscannedObjectsAllPlans" : 1,
    "nscannedAllPlans" : 1,
    "millis" : 0,
   
}	

5) Incapsulate the or inside an other $or or $and

db.testindex.find({
    "user":"mario","removed":false,     
    "$and" : [ {
    "$or" : [ { "objId" : "299939"} , { "p.id" : "543534"}]}
    ]}
    ).explain()
	
	
{
    "cursor" : "BtreeCursor user_1_removed_1__id_1",
    "isMultiKey" : false,
    "n" : 1,
    "nscannedObjects" : 5,
    "nscanned" : 5,
    "nscannedObjectsAllPlans" : 5,
    "nscannedAllPlans" : 5,
    "scanAndOrder" : false,
    "indexOnly" : false,
    "nYields" : 0,
    "nChunkSkips" : 0,
    "millis" : 0,
    "indexBounds" : {
        "user" : [ 
            [ 
                "mario", 
                "mario"
            ]
        ],
        "removed" : [ 
            [ 
                false, 
                false
            ]
        ],
        "_id" : [ 
            [ 
                {
                    "$minElement" : 1
                }, 
                {
                    "$maxElement" : 1
                }
            ]
        ]
    },
   
}	

Mongodb is not able to rewrite the nested $or into the canonical rooted $or form, thus can't make use of the index union.

Participants:

 Description   

1) db.testindex.find({"user":"mario","removed":false, "$or" : [ { "objId" : "299939"} , { "p.id" : "543534"}]})
 
2) db.testindex.find({"user":"mario","removed":false,
"$and" : [ {"$or" : [ { "objId" : "299939"} , { "p.id" : "543534"}]} ]})
 
3) db.testindex.find({"user":"mario","removed":false,
"$or" : [ {"$or" : [ { "objId" : "299939"} , { "p.id" : "543534"}]} ]})

These three queries are semantically the same, but only the 1st can perform over the indexes user_1_removed_1_objId_1 and user_1_removed_1_p.id_1

The nested $and / $or operators block the mongodb query rewrite optimization.



 Comments   
Comment by Thomas Rueckstiess [ 31/Jul/14 ]

Hi Flavio,

Your first (optimized) query falls into a specific category that had special-case code in 2.4 to deal with. A query of the shape

{ a:1, $or: [ {b:1}, {c:1} ] }

was internally rewritten to

{ $or: [ {a:1, b:1}, {a:1, c:1} ] } 

making the query a root-level $or for which the union of the two individual query results can be calculated. Adding more $and / $or clauses to the query, for example

{ a:1, {$or: [ {$or: [ {b:1}, {c:1} ] } ] }

would semantically not change it, but the query optimizer would no longer consider this form for the rewrite.

As we have rewritten the entire query engine for 2.6, it's unlikely that we can backport any further optimizations around this and it is unfortunately up to you to simplify the query to the simpler form.

Please also be aware that due to the significant changes in 2.6, the special logic to rewrite { a:1, { $or: [b:1, c:1] } } to a root-level $or has not been re-implemented in an effort to keep the code clean and general and not add to many special cases. In order to get the same behavior in 2.6, you would have to transform the query yourself to move the $or to the root-level, as in { $or: [ {a:1, b:1}, {a:1, c:1} ] }. So far, to my knowledge we did not have any reports where this has been a problem, but we are tracking the issue under SERVER-13732. If you rely on this feature, please feel free to vote up that ticket and leave a comment explaining your use case and need for it.

I hope this explains why you are seeing this behavior and our motivation for the change in 2.6.

Regards,
Thomas

Generated at Thu Feb 08 03:35:49 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.