-
Type: Bug
-
Resolution: Done
-
Priority: Major - P3
-
Affects Version/s: 2.6.0-rc0
-
Component/s: Querying
-
ALL
-
In the attached 'Test 1' case, v2.6.0-rc0 selects a different query plan than v2.4.9, and v2.6.0-rc0 takes significantly longer to execute. The indexed key is a string with many duplicates.
In 'Test 2', the indexed key is an integer with many duplicates, and v2.6.0-rc0 is faster than v2.4.9, and selects the same query plan.
Note: the two repro cases attached are identical, except for the value type of 'a'
Explain for test 1:
> db.version(); 2.6.0-rc1-pre- > db.t.find({a:"1"}).sort({d:1}).explain(); { "cursor" : "BtreeCursor a_1_b_1", "isMultiKey" : false, "n" : 10000, "nscannedObjects" : 10000, "nscanned" : 10000, "nscannedObjectsAllPlans" : 10098, "nscannedAllPlans" : 10098, "scanAndOrder" : true, "indexOnly" : false, "nYields" : 158, "nChunkSkips" : 0, "millis" : 50, "indexBounds" : { "a" : [ [ "1", "1" ] ], "b" : [ [ { "$minElement" : 1 }, { "$maxElement" : 1 } ] ] }, "server" : "zzyzx:27018", "filterSet" : false } > db.version(); 2.4.9 > db.t.find({a:"1"}).sort({d:1}).explain(); { "cursor" : "BtreeCursor d_1_e_1", "isMultiKey" : false, "n" : 10000, "nscannedObjects" : 11000, "nscanned" : 11000, "nscannedObjectsAllPlans" : 13203, "nscannedAllPlans" : 13203, "scanAndOrder" : false, "indexOnly" : false, "nYields" : 0, "nChunkSkips" : 0, "millis" : 35, "indexBounds" : { "d" : [ [ { "$minElement" : 1 }, { "$maxElement" : 1 } ] ], "e" : [ [ { "$minElement" : 1 }, { "$maxElement" : 1 } ] ] }, "server" : "zzyzx:27018" }
Explain for test 2:
> db.version(); 2.6.0-rc1-pre- > db.t.find({a:1}).sort({d:1}).explain(); { "cursor" : "BtreeCursor d_1_e_1", "isMultiKey" : false, "n" : 10001, "nscannedObjects" : 11000, "nscanned" : 11000, "nscannedObjectsAllPlans" : 11197, "nscannedAllPlans" : 11198, "scanAndOrder" : false, "indexOnly" : false, "nYields" : 87, "nChunkSkips" : 0, "millis" : 14, "indexBounds" : { "d" : [ [ { "$minElement" : 1 }, { "$maxElement" : 1 } ] ], "e" : [ [ { "$minElement" : 1 }, { "$maxElement" : 1 } ] ] }, "server" : "zzyzx:27018", "filterSet" : false } > db.version(); 2.4.9 > db.t.find({a:1}).sort({d:1}).explain(); { "cursor" : "BtreeCursor d_1_e_1", "isMultiKey" : false, "n" : 10001, "nscannedObjects" : 11000, "nscanned" : 11000, "nscannedObjectsAllPlans" : 13201, "nscannedAllPlans" : 13201, "scanAndOrder" : false, "indexOnly" : false, "nYields" : 0, "nChunkSkips" : 0, "millis" : 24, "indexBounds" : { "d" : [ [ { "$minElement" : 1 }, { "$maxElement" : 1 } ] ], "e" : [ [ { "$minElement" : 1 }, { "$maxElement" : 1 } ] ] }, "server" : "zzyzx:27018" }
FWIW, the delta in the NLTK c++ test is an order of magnitude, though the dataset varies widely (see QA/QA-401/QA401.cpp for details):
> db.regressions.findOne({$nor:[{qry:/where/}, {qry:/exists/}]}) { "_id" : ObjectId("530f6bad7e6c53525919dfad"), "deltaRatio" : "9.55", "deltaUs" : 917.67, "v249Avg" : "107.33", "v249Results" : [ 106, 108, 109, 106, 107, 108 ], "rc0Avg" : "1025.00", "rc0Results" : [ 1005, 1081, 1011, 1026, 1013, 1014 ], "testId" : "GeneratedTests::GeneratedQuery607[Compound1]", "qry" : "{ query: { a: \"\" }, orderby: { d: 1 } }", "proj" : "{ NONE: 1 }", "idx" : "Compound1", "rc0Count" : 0, "v249Count" : 0, "countsMatch" : true }