[SERVER-14423] Plans which fetch different numbers of documents can tie Created: 02/Jul/14  Updated: 11/Sep/23  Resolved: 11/Sep/23

Status: Closed
Project: Core Server
Component/s: Querying
Affects Version/s: 2.6.3
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: David Storch Assignee: Backlog - Query Optimization
Resolution: Duplicate Votes: 7
Labels: asya, bonsai, pull-request, query-44-grooming, storch
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Duplicate
is duplicated by SERVER-57988 Unfortunate index selection leads to ... Closed
is duplicated by SERVER-79400 Implement number of documents tie bre... Closed
Related
is related to SERVER-13675 Plans with differing performance can ... Closed
is related to SERVER-14311 skipping of index keys is not account... Closed
is related to SERVER-37194 Number of docsExamined is not conside... Closed
Assigned Teams:
Query Optimization
Operating System: ALL
Sprint: Query 2019-04-08, Query 2019-07-29, Query 2019-08-12, Query 2019-08-26, Query 2019-09-09
Participants:
Case:
Linked BF Score: 0

 Description   

The nscanned value is the same for both plans, but we choose the one with a much higher nscannedObjects:

t.drop();
t.ensureIndex({type: 1});
t.ensureIndex({type: 1, folder: 1});
for (var i = 0; i < 110; i++) { t.insert({type: "folder", folder: "b"}); }
t.insert({type: "folder", folder: "m"});
t.find({type: "folder", folder: /m/}).explain(true)

I get the following explain output from a 2.6.3 server:

{
	"cursor" : "BtreeCursor type_1",
	"isMultiKey" : false,
	"n" : 1,
	"nscannedObjects" : 111,
	"nscanned" : 111,
	"nscannedObjectsAllPlans" : 112,
	"nscannedAllPlans" : 222,
	"scanAndOrder" : false,
	"indexOnly" : false,
	"nYields" : 1,
	"nChunkSkips" : 0,
	"millis" : 1,
	"indexBounds" : {
		"type" : [
			[
				"folder",
				"folder"
			]
		]
	},
	"allPlans" : [
		{
			"cursor" : "BtreeCursor type_1",
			"isMultiKey" : false,
			"n" : 1,
			"nscannedObjects" : 111,
			"nscanned" : 111,
			"scanAndOrder" : false,
			"indexOnly" : false,
			"nChunkSkips" : 0,
			"indexBounds" : {
				"type" : [
					[
						"folder",
						"folder"
					]
				]
			}
		},
		{
			"cursor" : "BtreeCursor type_1_folder_1",
			"isMultiKey" : false,
			"n" : 1,
			"nscannedObjects" : 1,
			"nscanned" : 111,
			"scanAndOrder" : false,
			"indexOnly" : false,
			"nChunkSkips" : 0,
			"indexBounds" : {
				"type" : [
					[
						"folder",
						"folder"
					]
				],
				"folder" : [
					[
						"",
						{
 
						}
					],
					[
						/m/,
						/m/
					]
				]
			}
		}
	],
	"server" : "Macintosh.local:27017",
	"filterSet" : false,
	"stats" : {
		"type" : "KEEP_MUTATIONS",
		"works" : 113,
		"yields" : 1,
		"unyields" : 1,
		"invalidates" : 0,
		"advanced" : 1,
		"needTime" : 110,
		"needFetch" : 0,
		"isEOF" : 1,
		"children" : [
			{
				"type" : "FETCH",
				"works" : 112,
				"yields" : 1,
				"unyields" : 1,
				"invalidates" : 0,
				"advanced" : 1,
				"needTime" : 110,
				"needFetch" : 0,
				"isEOF" : 1,
				"alreadyHasObj" : 0,
				"forcedFetches" : 0,
				"matchTested" : 1,
				"children" : [
					{
						"type" : "IXSCAN",
						"works" : 111,
						"yields" : 1,
						"unyields" : 1,
						"invalidates" : 0,
						"advanced" : 111,
						"needTime" : 0,
						"needFetch" : 0,
						"isEOF" : 1,
						"keyPattern" : "{ type: 1.0 }",
						"boundsVerbose" : "field #0['type']: [\"folder\", \"folder\"]",
						"isMultiKey" : 0,
						"yieldMovedCursor" : 0,
						"dupsTested" : 0,
						"dupsDropped" : 0,
						"seenInvalidated" : 0,
						"matchTested" : 0,
						"keysExamined" : 111,
						"children" : [ ]
					}
				]
			}
		]
	}
}

This happens because the plans tie: both return the same number of documents in the same amount of work cycles, even though one plan has to fetch more documents.



 Comments   
Comment by Alexander Ignatyev [ 11/Sep/23 ]

The fix was implemented in PM-3316 via SERVER-79400.

Comment by David Storch [ 11/Sep/19 ]

Hi krk,

I have reviewed this pull request and determined that unfortunately we cannot accept it. While the code itself looks good, there are a few problems around the behavior it implements:

  • Changing our cost formula for ranking plans is liable to destabilize existing workloads, and should be done with extreme caution. In the absence of tests, it's hard for us to validate whether or not this has the intended performance improvement without regressing other workloads. We need to make some investments in our internal benchmarking in order to build our own confidence with such changes.
  • A more conservative implementation for this ticket would use docsExamined only as a tie-breaking measure. This is less likely to destabilize existing workloads. Even this, however, is risky. We have an ongoing internal discussion about this issue which questions what approach we should take for this ticket. I can provide more technical details about this if you wish, but the gist is this: The tie tends to happen between 1) a plan which is not covered but scans an index sequentially and 2) a covered plan which must perform repeated index seeks. We're unsure how to model the benefits of covering against those of sequential index access.
  • There are other stages besides FETCH which can examine documents. Perhaps we could reuse the existing code for gathering the number of examined documents from the PlanStageStats tree in order to avoid duplicating this logic?

My apologies for this conclusion, and thanks for your interest in contributing to MongoDB! If you are interested in making other contributions, myself or one of my colleagues can suggest improvements or features that are less risky and thus easier for us to accept.

Best,
Dave

Comment by Kerem Kat [ 13/Jun/19 ]

I have created a PR to test the idea of including docsExamined into the plan ranker: https://github.com/mongodb/mongo/pull/1314

It is a score in [-1, 0], higher examined docs de-ranking the plan. With the commit, type_1_folder_1 always wins:

D2 QUERY [conn2] Scoring query plan: IXSCAN { type: 1, folder: 1 } planHitEOF=1
D2 QUERY [conn2] score(1.0003) = baseScore(1) + productivity((1 advanced)/(112 works) = 0.0089285714285714281) + docsExamined(-0.0089285714285714281) + tieBreakers(0.0001 noFetchBonus + 0.0001 noSortBonus + 0.0001 noIxisectBonus = 0.00030000000000000003)
D5 QUERY [conn2] score = 1.0003
D5 QUERY [conn2] Adding +1 EOF bonus to score.D2 QUERY [conn2] Scoring query plan: IXSCAN { type: 1 } planHitEOF=1

 

D2 QUERY [conn2] Scoring query plan: IXSCAN { type: 1 } planHitEOF=1
D2 QUERY [conn2] score(0.018157142857142795) = baseScore(1) + productivity((1 advanced)/(112 works) = 0.0089285714285714281) + docsExamined(-0.9910714285714286) + tieBreakers(0.0001 noFetchBonus + 0.0001 noSortBonus + 0.0001 noIxisectBonus = 0.00030000000000000003)
D5 QUERY [conn2] score = 0.0181571
D5 QUERY [conn2] Adding +1 EOF bonus to score.

 

If this is in the right direction, what other things need to change? If not, how would you implement this?

 

Comment by James Wahlin [ 30/Nov/18 ]

Another use case has surfaced which would benefit from taking document fetch into account during plan selection. As of SERVER-3645 we perform shard filtering on sharded count operations. In the case where we perform a count on a prefix of the shard key, and where more than one index exists on the shard with that prefix, it is possible that we will choose an index other than the shard key index. This can result in document fetch in order to satisfy the shard key filter, where use of the shard key index would allow for a covered operation.

Using both keysExamined and docsExamined when executing tie-break between plans would allow us to always choose the covered plan.

 

 

Comment by David Storch [ 18/Oct/16 ]

james.wahlin, based on the allPlansExecution output from that explain it appears that the plans still tie. When plans tie, we choose a winner arbitrarily; we just happen to be getting the right one. If I invert the order in which the indexes are built, as in the following repro script, then the wrong plan is still selected:

t.drop();
t.ensureIndex({type: 1, folder: 1});
t.ensureIndex({type: 1});
for (var i = 0; i < 110; i++) { t.insert({type: "folder", folder: "b"}); }
t.insert({type: "folder", folder: "m"});
t.find({type: "folder", folder: /m/}).explain(true)

Comment by James Wahlin [ 10/Oct/16 ]

Running this on 3.4.0-rc0 I find that the correct index is chosen for test case provided:

{
	"queryPlanner" : {
		"plannerVersion" : 1,
		"namespace" : "test.test",
		"indexFilterSet" : false,
		"parsedQuery" : {
			"$and" : [
				{
					"type" : {
						"$eq" : "folder"
					}
				},
				{
					"folder" : /m/
				}
			]
		},
		"winningPlan" : {
			"stage" : "FETCH",
			"inputStage" : {
				"stage" : "IXSCAN",
				"filter" : {
					"folder" : /m/
				},
				"keyPattern" : {
					"type" : 1,
					"folder" : 1
				},
				"indexName" : "type_1_folder_1",
				"isMultiKey" : false,
				"multiKeyPaths" : {
					"type" : [ ],
					"folder" : [ ]
				},
				"isUnique" : false,
				"isSparse" : false,
				"isPartial" : false,
				"indexVersion" : 2,
				"direction" : "forward",
				"indexBounds" : {
					"type" : [
						"[\"folder\", \"folder\"]"
					],
					"folder" : [
						"[\"\", {})",
						"[/m/, /m/]"
					]
				}
			}
		},
		"rejectedPlans" : [
			{
				"stage" : "FETCH",
				"filter" : {
					"folder" : /m/
				},
				"inputStage" : {
					"stage" : "IXSCAN",
					"keyPattern" : {
						"type" : 1
					},
					"indexName" : "type_1",
					"isMultiKey" : false,
					"multiKeyPaths" : {
						"type" : [ ]
					},
					"isUnique" : false,
					"isSparse" : false,
					"isPartial" : false,
					"indexVersion" : 2,
					"direction" : "forward",
					"indexBounds" : {
						"type" : [
							"[\"folder\", \"folder\"]"
						]
					}
				}
			}
		]
	},
	"executionStats" : {
		"executionSuccess" : true,
		"nReturned" : 1,
		"executionTimeMillis" : 0,
		"totalKeysExamined" : 111,
		"totalDocsExamined" : 1,
		"executionStages" : {
			"stage" : "FETCH",
			"nReturned" : 1,
			"executionTimeMillisEstimate" : 0,
			"works" : 113,
			"advanced" : 1,
			"needTime" : 110,
			"needYield" : 0,
			"saveState" : 1,
			"restoreState" : 1,
			"isEOF" : 1,
			"invalidates" : 0,
			"docsExamined" : 1,
			"alreadyHasObj" : 0,
			"inputStage" : {
				"stage" : "IXSCAN",
				"filter" : {
					"folder" : /m/
				},
				"nReturned" : 1,
				"executionTimeMillisEstimate" : 0,
				"works" : 112,
				"advanced" : 1,
				"needTime" : 110,
				"needYield" : 0,
				"saveState" : 1,
				"restoreState" : 1,
				"isEOF" : 1,
				"invalidates" : 0,
				"keyPattern" : {
					"type" : 1,
					"folder" : 1
				},
				"indexName" : "type_1_folder_1",
				"isMultiKey" : false,
				"multiKeyPaths" : {
					"type" : [ ],
					"folder" : [ ]
				},
				"isUnique" : false,
				"isSparse" : false,
				"isPartial" : false,
				"indexVersion" : 2,
				"direction" : "forward",
				"indexBounds" : {
					"type" : [
						"[\"folder\", \"folder\"]"
					],
					"folder" : [
						"[\"\", {})",
						"[/m/, /m/]"
					]
				},
				"keysExamined" : 111,
				"seeks" : 1,
				"dupsTested" : 0,
				"dupsDropped" : 0,
				"seenInvalidated" : 0
			}
		},
		"allPlansExecution" : [
			{
				"nReturned" : 1,
				"executionTimeMillisEstimate" : 0,
				"totalKeysExamined" : 111,
				"totalDocsExamined" : 111,
				"executionStages" : {
					"stage" : "FETCH",
					"filter" : {
						"folder" : /m/
					},
					"nReturned" : 1,
					"executionTimeMillisEstimate" : 0,
					"works" : 112,
					"advanced" : 1,
					"needTime" : 110,
					"needYield" : 0,
					"saveState" : 1,
					"restoreState" : 1,
					"isEOF" : 1,
					"invalidates" : 0,
					"docsExamined" : 111,
					"alreadyHasObj" : 0,
					"inputStage" : {
						"stage" : "IXSCAN",
						"nReturned" : 111,
						"executionTimeMillisEstimate" : 0,
						"works" : 112,
						"advanced" : 111,
						"needTime" : 0,
						"needYield" : 0,
						"saveState" : 1,
						"restoreState" : 1,
						"isEOF" : 1,
						"invalidates" : 0,
						"keyPattern" : {
							"type" : 1
						},
						"indexName" : "type_1",
						"isMultiKey" : false,
						"multiKeyPaths" : {
							"type" : [ ]
						},
						"isUnique" : false,
						"isSparse" : false,
						"isPartial" : false,
						"indexVersion" : 2,
						"direction" : "forward",
						"indexBounds" : {
							"type" : [
								"[\"folder\", \"folder\"]"
							]
						},
						"keysExamined" : 111,
						"seeks" : 1,
						"dupsTested" : 0,
						"dupsDropped" : 0,
						"seenInvalidated" : 0
					}
				}
			},
			{
				"nReturned" : 1,
				"executionTimeMillisEstimate" : 0,
				"totalKeysExamined" : 111,
				"totalDocsExamined" : 1,
				"executionStages" : {
					"stage" : "FETCH",
					"nReturned" : 1,
					"executionTimeMillisEstimate" : 0,
					"works" : 112,
					"advanced" : 1,
					"needTime" : 110,
					"needYield" : 0,
					"saveState" : 1,
					"restoreState" : 1,
					"isEOF" : 1,
					"invalidates" : 0,
					"docsExamined" : 1,
					"alreadyHasObj" : 0,
					"inputStage" : {
						"stage" : "IXSCAN",
						"filter" : {
							"folder" : /m/
						},
						"nReturned" : 1,
						"executionTimeMillisEstimate" : 0,
						"works" : 112,
						"advanced" : 1,
						"needTime" : 110,
						"needYield" : 0,
						"saveState" : 1,
						"restoreState" : 1,
						"isEOF" : 1,
						"invalidates" : 0,
						"keyPattern" : {
							"type" : 1,
							"folder" : 1
						},
						"indexName" : "type_1_folder_1",
						"isMultiKey" : false,
						"multiKeyPaths" : {
							"type" : [ ],
							"folder" : [ ]
						},
						"isUnique" : false,
						"isSparse" : false,
						"isPartial" : false,
						"indexVersion" : 2,
						"direction" : "forward",
						"indexBounds" : {
							"type" : [
								"[\"folder\", \"folder\"]"
							],
							"folder" : [
								"[\"\", {})",
								"[/m/, /m/]"
							]
						},
						"keysExamined" : 111,
						"seeks" : 1,
						"dupsTested" : 0,
						"dupsDropped" : 0,
						"seenInvalidated" : 0
					}
				}
			}
		]
	},
}

Comment by Asya Kamsky [ 12/Sep/14 ]

This does not happen if the second where predicate is folder:/^m/ or folder:"m" so it seems to be limited to cases where we don't expect to use the index? (even though we clearly are?)

Generated at Thu Feb 08 03:34:49 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.