[SERVER-65373] Wrong results in explode_for_sort_fetch.js due to incorrect runtime environment registration of repeated input param ids Created: 08/Apr/22  Updated: 29/Oct/23  Resolved: 12/Apr/22

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: Backlog
Fix Version/s: 6.0.0-rc0

Type: Bug Priority: Blocker - P1
Reporter: Anton Korshunov Assignee: David Storch
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Backwards Compatibility: Fully Compatible
Operating System: ALL
Sprint: QO 2022-04-18
Participants:

 Comments   
Comment by Githook User [ 12/Apr/22 ]

Author:

{'name': 'David Storch', 'email': 'david.storch@mongodb.com', 'username': 'dstorch'}

Message: SERVER-65373 Fix runtime environment slot registration when input param id appears multiple times
Branch: master
https://github.com/mongodb/mongo/commit/d4ee9dd9cf38c4261c11187fc452e16709892f22

Comment by David Storch [ 11/Apr/22 ]

The root cause of the problem is as follows. The original query has two {f: {$eq: 5}} predicates. One gets auto-parameterized to inputParamId 0 and the other to inputParamId 3. However, each of these predicates then gets copied three times as part of the "explode for sort process". As a result, each input parameter id appears 4 times in the composite query solution. So far so good.

The problem is that the SBE stage builder does not work correctly when the input parameter id appears multiple times in the QuerySolution. Each time it sees the input parameter id, it allocates a new slot for this parameter and tries to insert a new (inputParamId -> slotId) mapping into the inputParamToSlotMap. However, a mapping for the input param id might already exist. In this case, the newly allocated slot is returned to the caller despite not actually registering a mapping for this slot in inputParamToSlotMap. After adding some logging to StageBuilderState::registerInputParamSlot(), I see the following:

registering input param -> slot mapping: (0 -> 16)
registering input param -> slot mapping: (0 -> 31)
registering input param -> slot mapping: (0 -> 46)
registering input param -> slot mapping: (0 -> 61)
registering input param -> slot mapping: (3 -> 76)
registering input param -> slot mapping: (3 -> 91)
registering input param -> slot mapping: (3 -> 106)
registering input param -> slot mapping: (3 -> 121)

This means that at the end of the stage building process, only the mappings (0 -> 16) and (3 -> 76) are left in the inputParamToSlotMap, while the others are thrown away. As a result, when we later bind values into the runtime environment, we only bind values for slots s16 and s76. This is consistent with what appears in the debug output from my previous comment.

Comment by David Storch [ 08/Apr/22 ]

I think the values in the runtime environment are wrong. The plan has 8 FETCH stages which apply the predicate {f: {$eq: 5}}. This results in 8 runtime environment slots: s61, s31, s76, s106, s90, s121, s46, and s16. However, only two of these have the value of 5 whereas the other two are left as a value of Nothing. I think this explains the symptom of missing documents as well, since the value of Nothing ends up filtering out matching documents in 6 of the 8 branches of the plan.

The next step is to figure out why the SBE plan construction is going wrong.

Comment by David Storch [ 08/Apr/22 ]

The composite QuerySolution produced by SBE sub planner appears to be correct, best I can tell:

{
				"stage" : "SORT_MERGE",
				"planNodeId" : 17,
				"sortPattern" : {
					"e" : 1
				},
				"inputStages" : [
					{
						"stage" : "FETCH",
						"planNodeId" : 2,
						"filter" : {
							"f" : {
								"$eq" : 5
							}
						},
						"inputStage" : {
							"stage" : "IXSCAN",
							"planNodeId" : 1,
							"keyPattern" : {
								"a" : 1,
								"b" : 1,
								"e" : 1
							},
							"indexName" : "a_1_b_1_e_1",
							"isMultiKey" : false,
							"multiKeyPaths" : {
								"a" : [ ],
								"b" : [ ],
								"e" : [ ]
							},
							"isUnique" : false,
							"isSparse" : false,
							"isPartial" : false,
							"indexVersion" : 2,
							"direction" : "forward",
							"indexBounds" : {
								"a" : [
									"[1.0, 1.0]"
								],
								"b" : [
									"[3.0, 3.0]"
								],
								"e" : [
									"[MinKey, MaxKey]"
								]
							}
						}
					},
					{
						"stage" : "FETCH",
						"planNodeId" : 4,
						"filter" : {
							"f" : {
								"$eq" : 5
							}
						},
						"inputStage" : {
							"stage" : "IXSCAN",
							"planNodeId" : 3,
							"keyPattern" : {
								"a" : 1,
								"b" : 1,
								"e" : 1
							},
							"indexName" : "a_1_b_1_e_1",
							"isMultiKey" : false,
							"multiKeyPaths" : {
								"a" : [ ],
								"b" : [ ],
								"e" : [ ]
							},
							"isUnique" : false,
							"isSparse" : false,
							"isPartial" : false,
							"indexVersion" : 2,
							"direction" : "forward",
							"indexBounds" : {
								"a" : [
									"[2.0, 2.0]"
								],
								"b" : [
									"[3.0, 3.0]"
								],
								"e" : [
									"[MinKey, MaxKey]"
								]
							}
						}
					},
					{
						"stage" : "FETCH",
						"planNodeId" : 6,
						"filter" : {
							"f" : {
								"$eq" : 5
							}
						},
						"inputStage" : {
							"stage" : "IXSCAN",
							"planNodeId" : 5,
							"keyPattern" : {
								"a" : 1,
								"b" : 1,
								"e" : 1
							},
							"indexName" : "a_1_b_1_e_1",
							"isMultiKey" : false,
							"multiKeyPaths" : {
								"a" : [ ],
								"b" : [ ],
								"e" : [ ]
							},
							"isUnique" : false,
							"isSparse" : false,
							"isPartial" : false,
							"indexVersion" : 2,
							"direction" : "forward",
							"indexBounds" : {
								"a" : [
									"[1.0, 1.0]"
								],
								"b" : [
									"[4.0, 4.0]"
								],
								"e" : [
									"[MinKey, MaxKey]"
								]
							}
						}
					},
					{
						"stage" : "FETCH",
						"planNodeId" : 8,
						"filter" : {
							"f" : {
								"$eq" : 5
							}
						},
						"inputStage" : {
							"stage" : "IXSCAN",
							"planNodeId" : 7,
							"keyPattern" : {
								"a" : 1,
								"b" : 1,
								"e" : 1
							},
							"indexName" : "a_1_b_1_e_1",
							"isMultiKey" : false,
							"multiKeyPaths" : {
								"a" : [ ],
								"b" : [ ],
								"e" : [ ]
							},
							"isUnique" : false,
							"isSparse" : false,
							"isPartial" : false,
							"indexVersion" : 2,
							"direction" : "forward",
							"indexBounds" : {
								"a" : [
									"[2.0, 2.0]"
								],
								"b" : [
									"[4.0, 4.0]"
								],
								"e" : [
									"[MinKey, MaxKey]"
								]
							}
						}
					},
					{
						"stage" : "FETCH",
						"planNodeId" : 10,
						"filter" : {
							"f" : {
								"$eq" : 5
							}
						},
						"inputStage" : {
							"stage" : "IXSCAN",
							"planNodeId" : 9,
							"keyPattern" : {
								"c" : 1,
								"d" : 1,
								"e" : 1
							},
							"indexName" : "c_1_d_1_e_1",
							"isMultiKey" : false,
							"multiKeyPaths" : {
								"c" : [ ],
								"d" : [ ],
								"e" : [ ]
							},
							"isUnique" : false,
							"isSparse" : false,
							"isPartial" : false,
							"indexVersion" : 2,
							"direction" : "forward",
							"indexBounds" : {
								"c" : [
									"[1.0, 1.0]"
								],
								"d" : [
									"[3.0, 3.0]"
								],
								"e" : [
									"[MinKey, MaxKey]"
								]
							}
						}
					},
					{
						"stage" : "FETCH",
						"planNodeId" : 12,
						"filter" : {
							"f" : {
								"$eq" : 5
							}
						},
						"inputStage" : {
							"stage" : "IXSCAN",
							"planNodeId" : 11,
							"keyPattern" : {
								"c" : 1,
								"d" : 1,
								"e" : 1
							},
							"indexName" : "c_1_d_1_e_1",
							"isMultiKey" : false,
							"multiKeyPaths" : {
								"c" : [ ],
								"d" : [ ],
								"e" : [ ]
							},
							"isUnique" : false,
							"isSparse" : false,
							"isPartial" : false,
							"indexVersion" : 2,
							"direction" : "forward",
							"indexBounds" : {
								"c" : [
									"[2.0, 2.0]"
								],
								"d" : [
									"[3.0, 3.0]"
								],
								"e" : [
									"[MinKey, MaxKey]"
								]
							}
						}
					},
					{
						"stage" : "FETCH",
						"planNodeId" : 14,
						"filter" : {
							"f" : {
								"$eq" : 5
							}
						},
						"inputStage" : {
							"stage" : "IXSCAN",
							"planNodeId" : 13,
							"keyPattern" : {
								"c" : 1,
								"d" : 1,
								"e" : 1
							},
							"indexName" : "c_1_d_1_e_1",
							"isMultiKey" : false,
							"multiKeyPaths" : {
								"c" : [ ],
								"d" : [ ],
								"e" : [ ]
							},
							"isUnique" : false,
							"isSparse" : false,
							"isPartial" : false,
							"indexVersion" : 2,
							"direction" : "forward",
							"indexBounds" : {
								"c" : [
									"[1.0, 1.0]"
								],
								"d" : [
									"[4.0, 4.0]"
								],
								"e" : [
									"[MinKey, MaxKey]"
								]
							}
						}
					},
					{
						"stage" : "FETCH",
						"planNodeId" : 16,
						"filter" : {
							"f" : {
								"$eq" : 5
							}
						},
						"inputStage" : {
							"stage" : "IXSCAN",
							"planNodeId" : 15,
							"keyPattern" : {
								"c" : 1,
								"d" : 1,
								"e" : 1
							},
							"indexName" : "c_1_d_1_e_1",
							"isMultiKey" : false,
							"multiKeyPaths" : {
								"c" : [ ],
								"d" : [ ],
								"e" : [ ]
							},
							"isUnique" : false,
							"isSparse" : false,
							"isPartial" : false,
							"indexVersion" : 2,
							"direction" : "forward",
							"indexBounds" : {
								"c" : [
									"[2.0, 2.0]"
								],
								"d" : [
									"[4.0, 4.0]"
								],
								"e" : [
									"[MinKey, MaxKey]"
								]
							}
						}
					}
				]
			},

I suspect that something is wrong with how we are constructing the SBE plan from this query solution. I created a formatted view of the SBE plan:

slots:
  $$RESULT=s124
  $$RID=s125
  env: {
      s61 = Nothing,
      s31 = Nothing,
      s3 = 1649454764614 (NOW),
      s1 = TimeZoneDatabase(Pacific/Marquesas...America/Indiana/Vincennes) (timeZoneDB),
      s2 = Nothing (SEARCH_META),
      s76 = 5,
      s106 = Nothing,
      s91 = Nothing,
      s121 = Nothing,
      s46 = Nothing,
      s16 = 5
  },
 
stages:
[17] unique [s125]
[17] smerge [s124, s125] [asc] [
    [s4] [s13, s14] [2] filter {fillEmpty (s18, false)}
    [2] traverse s18 s17 s15 [s13, s14, s4] {s18 || s17} {s18}
    from
        [2] project [s15 = getField (s13, \"f\")]
        [2] nlj [s4] [s9, s5, s6, s7, s8]
            left
                [1] nlj [s6, s8] [s10, s11]
                    left
                        [1] project [s6 = \"a_1_b_1_e_1\", s8 = {\"a\" : 1, \"b\" : 1, \"e\" : 1}, s10 = KS(2B022B060A0104), s11 = KS(2B022B06F0FE04)]
                        [1] limit 1
                        [1] coscan
                    right
                        [1] project [s5 = s12]
                        [1] ixseek s10 s11 s7 s9 s12 [s4 = 2] @\"beb61540-4e95-4518-8b66-da08d4bb7a43\" @\"a_1_b_1_e_1\" true
 
 
            right
                [2] limit 1
                [2] seek s9 s13 s14 s5 s6 s7 s8 [] @\"beb61540-4e95-4518-8b66-da08d4bb7a43\" true false
 
 
    in
        [2] project [s17 = fillEmpty (s15 == s16, false)]
        [2] limit 1
        [2] coscan
    ,
    [s19] [s28, s29] [4] filter {fillEmpty (s33, false)}
    [4] traverse s33 s32 s30 [s29, s28, s19] {s33 || s32} {s33}
    from
        [4] project [s30 = getField (s28, \"f\")]
        [4] nlj [s19] [s24, s20, s21, s22, s23]
            left
                [3] nlj [s21, s23] [s25, s26]
                    left
                        [3] project [s21 = \"a_1_b_1_e_1\", s23 = {\"a\" : 1, \"b\" : 1, \"e\" : 1}, s25 = KS(2B042B060A0104), s26 = KS(2B042B06F0FE04)]
                        [3] limit 1
                        [3] coscan
                    right
                        [3] project [s20 = s27]
                        [3] ixseek s25 s26 s22 s24 s27 [s19 = 2] @\"beb61540-4e95-4518-8b66-da08d4bb7a43\" @\"a_1_b_1_e_1\" true
 
 
            right
                [4] limit 1
                [4] seek s24 s28 s29 s20 s21 s22 s23 [] @\"beb61540-4e95-4518-8b66-da08d4bb7a43\" true false
 
 
    in
        [4] project [s32 = fillEmpty (s30 == s31, false)]
        [4] limit 1
        [4] coscan
    ,
    [s34] [s43, s44] [6] filter {fillEmpty (s48, false)}
    [6] traverse s48 s47 s45 [s43, s44, s34] {s48 || s47} {s48}
    from
        [6] project [s45 = getField (s43, \"f\")]
        [6] nlj [s34] [s39, s35, s36, s37, s38]
            left
                [5] nlj [s36, s38] [s40, s41]
                    left
                        [5] project [s36 = \"a_1_b_1_e_1\", s38 = {\"a\" : 1, \"b\" : 1, \"e\" : 1}, s40 = KS(2B022B080A0104), s41 = KS(2B022B08F0FE04)]
                        [5] limit 1
                        [5] coscan
                    right
                        [5] project [s35 = s42]
                        [5] ixseek s40 s41 s37 s39 s42 [s34 = 2] @\"beb61540-4e95-4518-8b66-da08d4bb7a43\" @\"a_1_b_1_e_1\" true
 
 
            right
                [6] limit 1
                [6] seek s39 s43 s44 s35 s36 s37 s38 [] @\"beb61540-4e95-4518-8b66-da08d4bb7a43\" true false
 
 
    in
        [6] project [s47 = fillEmpty (s45 == s46, false)]
        [6] limit 1
        [6] coscan
    ,
    [s49] [s58, s59] [8] filter {fillEmpty (s63, false)}
    [8] traverse s63 s62 s60 [s59, s58, s49] {s63 || s62} {s63}
    from
        [8] project [s60 = getField (s58, \"f\")]
        [8] nlj [s49] [s54, s50, s51, s52, s53]
            left
                [7] nlj [s51, s53] [s55, s56]
                    left
                        [7] project [s51 = \"a_1_b_1_e_1\", s53 = {\"a\" : 1, \"b\" : 1, \"e\" : 1}, s55 = KS(2B042B080A0104), s56 = KS(2B042B08F0FE04)]
                        [7] limit 1
                        [7] coscan
                    right
                        [7] project [s50 = s57]
                        [7] ixseek s55 s56 s52 s54 s57 [s49 = 2] @\"beb61540-4e95-4518-8b66-da08d4bb7a43\" @\"a_1_b_1_e_1\" true
 
 
            right
                [8] limit 1
                [8] seek s54 s58 s59 s50 s51 s52 s53 [] @\"beb61540-4e95-4518-8b66-da08d4bb7a43\" true false
 
 
    in
        [8] project [s62 = fillEmpty (s60 == s61, false)]
        [8] limit 1
        [8] coscan
    ,
    [s64] [s73, s74] [10] filter {fillEmpty (s78, false)}
    [10] traverse s78 s77 s75 [s74, s73, s64] {s78 || s77} {s78}
    from
        [10] project [s75 = getField (s73, \"f\")]
        [10] nlj [s64] [s69, s65, s66, s67, s68]
            left
                [9] nlj [s66, s68] [s70, s71]
                    left
                        [9] project [s66 = \"c_1_d_1_e_1\", s68 = {\"c\" : 1, \"d\" : 1, \"e\" : 1}, s70 = KS(2B022B060A0104), s71 = KS(2B022B06F0FE04)]
                        [9] limit 1
                        [9] coscan
                    right
                        [9] project [s65 = s72]
                        [9] ixseek s70 s71 s67 s69 s72 [s64 = 2] @\"beb61540-4e95-4518-8b66-da08d4bb7a43\" @\"c_1_d_1_e_1\" true
 
 
            right
                [10] limit 1
                [10] seek s69 s73 s74 s65 s66 s67 s68 [] @\"beb61540-4e95-4518-8b66-da08d4bb7a43\" true false
 
 
    in
        [10] project [s77 = fillEmpty (s75 == s76, false)]
        [10] limit 1
        [10] coscan
    ,
    [s79] [s88, s89] [12] filter {fillEmpty (s93, false)}
    [12] traverse s93 s92 s90 [s89, s88, s79] {s93 || s92} {s93}
    from
        [12] project [s90 = getField (s88, \"f\")]
        [12] nlj [s79] [s84, s80, s81, s82, s83]
            left
                [11] nlj [s81, s83] [s85, s86]
                    left
                        [11] project [s81 = \"c_1_d_1_e_1\", s83 = {\"c\" : 1, \"d\" : 1, \"e\" : 1}, s85 = KS(2B042B060A0104), s86 = KS(2B042B06F0FE04)]
                        [11] limit 1
                        [11] coscan
                    right
                        [11] project [s80 = s87]
                        [11] ixseek s85 s86 s82 s84 s87 [s79 = 2] @\"beb61540-4e95-4518-8b66-da08d4bb7a43\" @\"c_1_d_1_e_1\" true
 
 
            right
                [12] limit 1
                [12] seek s84 s88 s89 s80 s81 s82 s83 [] @\"beb61540-4e95-4518-8b66-da08d4bb7a43\" true false
 
 
    in
        [12] project [s92 = fillEmpty (s90 == s91, false)]
        [12] limit 1
        [12] coscan
    ,
    [s94] [s103, s104] [14] filter {fillEmpty (s108, false)}
    [14] traverse s108 s107 s105 [s104, s103, s94] {s108 || s107} {s108}
    from
        [14] project [s105 = getField (s103, \"f\")]
        [14] nlj [s94] [s99, s95, s96, s97, s98]
            left
                [13] nlj [s96, s98] [s100, s101]
                    left
                        [13] project [s96 = \"c_1_d_1_e_1\", s98 = {\"c\" : 1, \"d\" : 1, \"e\" : 1}, s100 = KS(2B022B080A0104), s101 = KS(2B022B08F0FE04)]
                        [13] limit 1
                        [13] coscan
                    right
                        [13] project [s95 = s102]
                        [13] ixseek s100 s101 s97 s99 s102 [s94 = 2] @\"beb61540-4e95-4518-8b66-da08d4bb7a43\" @\"c_1_d_1_e_1\" true
 
 
            right
                [14] limit 1
                [14] seek s99 s103 s104 s95 s96 s97 s98 [] @\"beb61540-4e95-4518-8b66-da08d4bb7a43\" true false
 
 
    in
        [14] project [s107 = fillEmpty (s105 == s106, false)]
        [14] limit 1
        [14] coscan
    ,
    [s109] [s118, s119] [16] filter {fillEmpty (s123, false)}
    [16] traverse s123 s122 s120 [s119, s118, s109] {s123 || s122} {s123}
    from
        [16] project [s120 = getField (s118, \"f\")]
        [16] nlj [s109] [s114, s110, s111, s112, s113]
            left
                [15] nlj [s111, s113] [s115, s116]
                    left
                        [15] project [s111 = \"c_1_d_1_e_1\", s113 = {\"c\" : 1, \"d\" : 1, \"e\" : 1}, s115 = KS(2B042B080A0104), s116 = KS(2B042B08F0FE04)]
                        [15] limit 1
                        [15] coscan
                    right
                        [15] project [s110 = s117]
                        [15] ixseek s115 s116 s112 s114 s117 [s109 = 2] @\"beb61540-4e95-4518-8b66-da08d4bb7a43\" @\"c_1_d_1_e_1\" true
 
 
            right
                [16] limit 1
                [16] seek s114 s118 s119 s110 s111 s112 s113 [] @\"beb61540-4e95-4518-8b66-da08d4bb7a43\" true false
 
 
    in
        [16] project [s122 = fillEmpty (s120 == s121, false)]
        [16] limit 1
        [16] coscan
 
] "

Comment by David Storch [ 08/Apr/22 ]

I distilled this to the following repro:

(function() {
"use strict";
 
const coll = db.explode_for_sort_fetch;
coll.drop();
 
assert.commandWorked(coll.createIndex({a: 1, b: 1, e: 1}));
assert.commandWorked(coll.createIndex({c: -1, d: -1, e: -1}));
 
assert.commandWorked(coll.insert([
    {_id: 3, a: 1, b: 3, f: 5, e: 3},
    {_id: 4, a: 2, b: 4, f: 5, e: 4},
    {_id: 8, a: 1, b: 4, f: 5, e: 8},
    {_id: 1, a: 2, b: 3, f: 5, e: 1},
    {_id: 0, a: 2, b: 3, f: 3, e: 0},
    {_id: 7, c: 1, d: 3, f: 5, e: 7},
    {_id: 6, c: 2, d: 4, f: 5, e: 6},
    {_id: 2, c: 1, d: 4, f: 5, e: 2},
    {_id: 5, c: 2, d: 3, f: 5, e: 5},
    {_id: 9, c: 2, d: 3, f: 3, e: 9},
]));
 
const expectedResults = [
    {_id: 1, a: 2, b: 3, f: 5, e: 1},
    {_id: 2, c: 1, d: 4, f: 5, e: 2},
    {_id: 3, a: 1, b: 3, f: 5, e: 3},
    {_id: 4, a: 2, b: 4, f: 5, e: 4},
    {_id: 5, c: 2, d: 3, f: 5, e: 5},
    {_id: 6, c: 2, d: 4, f: 5, e: 6},
    {_id: 7, c: 1, d: 3, f: 5, e: 7},
    {_id: 8, a: 1, b: 4, f: 5, e: 8},
];
 
assert.eq(coll.find({
                  $or: [
                      {a: {$in: [1, 2]}, b: {$in: [3, 4]}, f: 5},
                      {c: {$in: [1, 2]}, d: {$in: [3, 4]}, f: 5}
                  ]
              })
              .sort({e: 1})
              .toArray(),
          expectedResults);
}());

Generated at Thu Feb 08 06:02:34 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.