[SERVER-37530] Provide a way to cause a well-defined order of evaluation for predicates Created: 09/Oct/18  Updated: 06/Dec/22

Status: Backlog
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Yuta Arai Assignee: Backlog - Query Optimization
Resolution: Unresolved Votes: 1
Labels: afz, mql-semantics, query-44-grooming
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
is depended on by SERVER-37432 Re-enable aggregation_wildcard_fuzzer... Closed
Duplicate
is duplicated by SERVER-45231 Provide a way to cause a well-defined... Closed
Related
related to SERVER-52619 Performance Regression on 4.4 compare... Closed
related to SERVER-45364 Query Planner should estimate cost of... Backlog
related to SERVER-45231 Provide a way to cause a well-defined... Closed
is related to SERVER-45123 Queries should not fail with type err... Closed
is related to SERVER-59146 Enable push down of $match with $expr Backlog
is related to DOCS-13309 MongoDB does not guarantee short circ... Closed
is related to SERVER-45308 Alphabetical order of field names use... Closed
Assigned Teams:
Query Optimization
Operating System: ALL
Steps To Reproduce:

(function(){
	"use strict";
 
	const docs = [
		{num: 0},
		{num: 1},
	]
 
	const agg = [
	    {$match: {num: {$ne: 0}}},
	    {$match: {$or: [
	    	{$expr: {$divide: [5, '$num']}},
			{b: {$exists: true}},
	    ]}},
	   	{$project: {_id: 0}},
	];
 
	const s1 = MongoRunner.runMongod({binVersion: 'latest'});
	const s2 = MongoRunner.runMongod({binVersion: 'latest'});
 
	let colls = [];
	colls[0] = s1.getDB('test').c;
	colls[1] = s2.getDB('test').c;
 
	for (let i = 0; i < colls.length; ++i) {
		colls[i].drop();
	}
 
	function doActionForAllElementsOnAllCollection(colls, elements, action){
		for (let i = 0; i < colls.length; ++i) {
			for (let j = 0; j < elements.length; ++j) {
				action(colls[i], elements[j]);
			}
		}
	}
 
	assert.commandWorked(s1.getDB('test').adminCommand({setParameter: 1, internalQueryAllowAllPathsIndexes: true}));
 
	// Insert documents.
	doActionForAllElementsOnAllCollection(colls, docs, (coll, element) => 
		assert.commandWorked(coll.insert(element)));
 
	// Create indexes.
	assert.commandWorked(colls[0].createIndex({"num":1}));
 
	try {
		let err1, err2;
		let result1, result2;
 
		try {
			result1 = colls[0].aggregate(agg).toArray();
		} catch (err) {
			err1 = err.message;
		}
 
		try {
			result2 = colls[1].aggregate(agg).toArray();
		} catch (err) {
			err2 = err.message;
		}
 
		print("Fuzzer Explain");
		printjson(colls[0].explain().aggregate(agg));
		printjson(colls[1].explain().aggregate(agg));
		print("Fuzzer Test Errors");
		print(err1);
		print(err2);
		print("Fuzzer Test Results");
		printjson(result1);
		printjson(result2);
		assert(err1 === err2);
		assert((result1 == undefined && result2 == undefined) || bsonBinaryEqual(result1, result2));
 
 
	} finally {
		MongoRunner.stopMongod(s1);
		MongoRunner.stopMongod(s2);
	}
})();
 

Participants:
Case:
Linked BF Score: 17

 Description   

Currently we freely reorder predicates for rewrite and optimization for the sake of performance. We could provide a way to define the order certain predicates are run in so any side effects from them (such as errors being produced) are predictable

Here is a description of a situation that shows our current behavior:

There is an inconsistency in how we are handling an error for a specific query involving $expr for when we use an index vs when we do a collection scan. The inconsistency stems from two behaviors of our query system:

1. A match stage can be pushed down to hide an error
Suppose there are two $match stages, (a) one with an expression that can throw an error for a specific input and (b) the other $match stage filters out those specific error-causing inputs. When (a) is before (b) but there is an index that matches the query for (b), the query system will push that stage to the IXSCAN and run (b) before (a) which will suppress that error.

Example:

const docs = [
	{num: 0},
	{num: 1},
];
const agg = [
    {$match: {$expr: {$divide: [5, '$num']}}}, // (a) throws when $num is 0.
    {$match: {num: {$ne: 0}}}, // (b) filters out 0.
];

The aggregation above will return

{num: 0}

instead of throwing the divide by zero error, because the $ne stage is pushed down to index scan:

"winningPlan" : {
        "stage" : "FETCH",
        "filter" : {
                "$expr" : {
                        "$divide" : [
                                {
                                        "$const" : 5
                                },
                                "$num"
                        ]
                }
        },
        "inputStage" : {
                "stage" : "IXSCAN",
                "keyPattern" : {
                        "num" : 1
                },
                "indexName" : "num_1",
                "isMultiKey" : false,
                "multiKeyPaths" : {
                        "num" : [ ]
                },
                "isUnique" : false,
                "isSparse" : false,
                "isPartial" : false,
                "indexVersion" : 2,
                "direction" : "forward",
                "indexBounds" : {
                        "num" : [
                                "[MinKey, 0.0)",
                                "(0.0, MaxKey]"
                        ]
                }
        }
}

2. Collection scans can form canonicalize queries to throw error
For collection scan, the opposite of the case above can happen where when (b) is placed before (a), but (a) is an $or containing a $expr that throws. In this scenario, the query system will absorb the two match stages into one and put the two predicates in an $and with a specific order. When this happens, (a) will be reordered to be placed before (b) and thus the query doing a collection scan will throw an error.

Given these two behaviors, the results for the following queries are inconsistent between an index scan and collection scan. Index scan will succeed while a collection scan will throw a divide by zero error.

const docs = [
	{num: 0},
	{num: 1},
];
 
const agg1 = [
    {$match: {num: {$ne: 0}}}, // collection scan moves this match down.
    {$match: {$or: [
    	{$expr: {$divide: [5, '$num']}}, // (a) throws when $num is 0.
	{a: {$exists: true}},
    ]}},
];
 
const agg2 = [
    {$match: {$or: [
    	{$expr: {$divide: [5, '$num']}}, // (a) throws when $num is 0.
	{a: {$exists: true}},
    ]}},
    {$match: {num: {$ne: 0}}},  // index scan moves this match up.
];

The explain outputs for both of two queries will be the following:

// Collection scan
"winningPlan" : {
        "stage" : "COLLSCAN",
        "filter" : {
                "$and" : [
                        {
                                "$or" : [
                                        {
                                                "b" : {
                                                        "$exists" : true
                                                }
                                        },
                                        {
                                                "$expr" : {
                                                        "$divide" : [
                                                                {
                                                                        "$const" : 5
                                                                },
                                                                "$num"
                                                        ]
                                                }
                                        }
                                ]
                        },
                        {
                                "$nor" : [
                                        {
                                                "num" : {
                                                        "$eq" : 0
                                                }
                                        }
                                ]
                        }
                ]
        },
        "direction" : "forward"
}
 
// Index scan
"winningPlan" : {
        "stage" : "FETCH",
        "filter" : {
                "$or" : [
                        {
                                "b" : {
                                        "$exists" : true
                                }
                        },
                        {
                                "$expr" : {
                                        "$divide" : [
                                                {
                                                        "$const" : 5
                                                },
                                                "$num"
                                        ]
                                }
                        }
                ]
        },
        "inputStage" : {
                "stage" : "IXSCAN",
                "keyPattern" : {
                        "num" : 1
                },
                "indexName" : "num_1",
                "isMultiKey" : false,
                "multiKeyPaths" : {
                        "num" : [ ]
                },
                "isUnique" : false,
                "isSparse" : false,
                "isPartial" : false,
                "indexVersion" : 2,
                "direction" : "forward",
                "indexBounds" : {
                        "num" : [
                                "[MinKey, 0.0)",
                                "(0.0, MaxKey]"
                        ]
                }
        }
}

This inconsistency also occurs for an $and expression in the same conditions.



 Comments   
Comment by James Wahlin [ 24/Oct/18 ]

Done

Comment by Charlie Swanson [ 23/Oct/18 ]

james.wahlin given the above can I suggest renaming this ticket to something like "Planning optimizations can hide error scenarios"?

Comment by James Wahlin [ 23/Oct/18 ]

After further discussion we are agreed that this problem is broader than $expr optimization. Query and aggregation optimization allow for a reordering of operations. This reordering makes it possible that an optimized statement can match a document in a manner that short-circuits executing an additional statement that would produce an error. This allows for different behavior in optimized vs non-optimized execution and for indexes to impact behavior via related optimizations.

An example of where we could hypothetically trigger this in aggregation is as follows. In the following, if we were to strictly evaluate order of operation of the pipeline stages, the $addFields would trigger a divide by 0 error. Instead, we push the $match stage into the query system, filtering out the document {x: 0} prior to applying the $addFields.

> db.test.insert({x: 0})
WriteResult({ "nInserted" : 1 })
> db.test.insert({x: 1})
WriteResult({ "nInserted" : 1 })
> db.test.aggregate([{$addFields: {y: {$divide: [10, '$x']}}}, {$match: {'x': 1}}, {$sort: {x: -1}}])
{ "_id" : ObjectId("5bcf93fac3537a9c823b01d0"), "x" : 1, "y" : 10 }
> 

Given the above we are going to avoid attempting a quick-fix and will instead look for / consider broader solutions.

Comment by Asya Kamsky [ 19/Oct/18 ]

To sum up - it's the clauses being in different order that makes it fail or succeed with and without index, here is reproducer for find:

// with index on num
db.bar.find(a1)
{ "_id" : ObjectId("5bc9e9732b05d4253636ba7d"), "num" : 1 }
test@ak:PRIMARY:41100(4.1.1) > db.bar.find(a2)
{ "_id" : ObjectId("5bc9e9732b05d4253636ba7d"), "num" : 1 }
test@ak:PRIMARY:41100(4.1.1) > db.bar.dropIndexes()
{
	"nIndexesWas" : 2,
	"msg" : "non-_id indexes dropped for collection",
	"ok" : 1,
	"operationTime" : Timestamp(1539979360, 1),
	"$clusterTime" : {
		"clusterTime" : Timestamp(1539979360, 1),
		"signature" : {
			"hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
			"keyId" : NumberLong(0)
		}
	}
}
test@ak:PRIMARY:41100(4.1.1) > db.bar.find(a1)
Error: error: {
	"operationTime" : Timestamp(1539979360, 1),
	"ok" : 0,
	"errmsg" : "can't $divide by zero",
	"code" : 16608,
	"codeName" : "Location16608",
	"$clusterTime" : {
		"clusterTime" : Timestamp(1539979360, 1),
		"signature" : {
			"hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
			"keyId" : NumberLong(0)
		}
	}
}
test@ak:PRIMARY:41100(4.1.1) > db.bar.find(a2)
Error: error: {
	"operationTime" : Timestamp(1539979360, 1),
	"ok" : 0,
	"errmsg" : "can't $divide by zero",
	"code" : 16608,
	"codeName" : "Location16608",
	"$clusterTime" : {
		"clusterTime" : Timestamp(1539979360, 1),
		"signature" : {
			"hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
			"keyId" : NumberLong(0)
		}
	}
}

a1 and a2 are from explain above so they differ in order of clauses in $and only:

 a1
{
	"$and" : [
		{
			"num" : {
				"$ne" : 0
			}
		},
		{
			"$or" : [
				{
					"$expr" : {
						"$divide" : [
							5,
							"$num"
						]
					}
				},
				{
					"a" : {
						"$exists" : true
					}
				}
			]
		}
	]
}
> a2
{
	"$and" : [
		{
			"$or" : [
				{
					"$expr" : {
						"$divide" : [
							5,
							"$num"
						]
					}
				},
				{
					"a" : {
						"$exists" : true
					}
				}
			]
		},
		{
			"num" : {
				"$ne" : 0
			}
		}
	]
}

Comment by Asya Kamsky [ 19/Oct/18 ]

Here is the example I had in mind:

db.bar.find({$and:[ {$expr:{$gt:["$num",0]}}, {$expr:{$gt:[{$divide:[5, "$num"]},3]}}]})
{ "_id" : ObjectId("5bc9e9732b05d4253636ba7d"), "num" : 1 }
db.bar.find({$and:[ {$expr:{$gt:[{$divide:[5, "$num"]},3]}}, {$expr:{$gt:["$num",0]}}]})
Error: error: {
	"operationTime" : Timestamp(1539978833, 1),
	"ok" : 0,
	"errmsg" : "can't $divide by zero",
	"code" : 16608,
	"codeName" : "Location16608",
	"$clusterTime" : {
		"clusterTime" : Timestamp(1539978833, 1),
		"signature" : {
			"hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
			"keyId" : NumberLong(0)
		}
	}
}

The difference here is the order of clauses in the $and array - so I actually think this is a bigger problem that this is just an example of (the example being that optimization makes evaluation of a particular stage happen first).

Comment by David Storch [ 19/Oct/18 ]

Assigning to james.wahlin to investigate what a fix would look like.

Comment by Asya Kamsky [ 19/Oct/18 ]

Just to be clear - I don't think this is agg related, this happens for complex expression in find as well if we evaluate clauses in different order.

Generated at Thu Feb 08 04:46:15 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.