[SERVER-61939] Explore bounding clustered collection scans more tightly Created: 07/Dec/21  Updated: 29/Oct/23  Resolved: 08/Feb/22

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 5.3.0

Type: Task Priority: Major - P3
Reporter: Josef Ahmad Assignee: Daniel Gomez Ferro
Resolution: Fixed Votes: 0
Labels: PM-2311-M2
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Backwards Compatibility: Fully Compatible
Sprint: Execution Team 2022-01-10, Execution Team 2022-01-24, Execution Team 2022-02-07, Execution Team 2022-02-21
Participants:

 Description   

A range query using an index like find({a:{$lte:"yadda"}}) generates tight bounds:

				"indexBounds" : {
					"a" : [
						"[\"\", \"yadda\"]"
					]

By contrast, a range query by the cluster key like find({_id:{$lte:"yadda"}}) generates only a single bound (maxRecord, with no minRecord):

		"winningPlan" : {
			"stage" : "COLLSCAN",
			"filter" : {
				"_id" : {
					"$lte" : "yadda"
				}
			},
			"direction" : "forward",
			"maxRecord" : "yadda"
		},

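This range query can be quite inefficient: with no lower bound, the scan starts at the beginning of the collection and also examines every document whose _id is of a lower-sorting BSON type, such as numeric types, before the filter discards it.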
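
The effect of the missing lower bound can be illustrated with a small simulation. This is only a sketch, not server code: TYPE_RANK is a simplified subset of the BSON cross-type sort order, and sort_key and collscan are hypothetical names used here to mimic a clustered collection scan that stops at maxRecord and optionally starts at minRecord.

```python
from datetime import datetime

# Simplified subset of the BSON cross-type sort order (lower sorts first).
# The rank numbers are illustrative, not the server's internal values.
TYPE_RANK = {int: 2, float: 2, str: 3, datetime: 9}


def sort_key(value):
    """Order first by type bracket, then by value within the bracket."""
    return (TYPE_RANK[type(value)], value)


def collscan(sorted_ids, max_record, min_record=None):
    """Return the _id values a bounded forward scan would examine."""
    examined = []
    for _id in sorted_ids:
        if min_record is not None and sort_key(_id) < sort_key(min_record):
            continue  # a real scan would seek straight to min_record
        if sort_key(_id) > sort_key(max_record):
            break  # past the upper bound: stop scanning
        examined.append(_id)
    return examined


ids = sorted([3, 42.5, "apple", "yadda", "zebra", datetime(2021, 12, 7)],
             key=sort_key)

# Only maxRecord: the numeric _ids are examined and then filtered out.
only_max = collscan(ids, max_record="yadda")

# minRecord set to the smallest string: the scan skips the numeric bracket.
both_bounds = collscan(ids, max_record="yadda", min_record="")
```

With only maxRecord the scan examines 3 and 42.5 in addition to the two matching strings. With minRecord set to the empty string it examines only "apple" and "yadda", which mirrors the ["", "yadda"] bounds the index scan reports.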



 Comments   
Comment by Githook User [ 07/Feb/22 ]

Author:

{'name': 'Daniel Gómez Ferro', 'email': 'daniel.gomezferro@mongodb.com', 'username': 'dgomezferro'}

Message: SERVER-61939 Tighter bounds for clustered collection scans
Branch: master
https://github.com/mongodb/mongo/commit/7a2d86c376488cc756c0325e6edaf3406a86ec5d

Comment by Haley Connelly [ 07/Dec/21 ]

Background: Different BSON types are encoded differently, and each type implicitly sorts as greater than or less than the other types (e.g. the Date type is encoded as larger than the numeric types).

Currently, for TTL deletions, we construct a minRecord from the min DateType for the bounded collection scan. A RecordId helper that generates min/max bounds for each type would be helpful.
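
A helper of the kind suggested above might look roughly like the following. This is a hypothetical sketch in Python rather than the server's C++: TYPE_MIN, bracket_of and bounds_for_lte are invented names, only three type brackets are covered, and datetime.min is a Python stand-in for the smallest Date value, not the actual BSON minimum.

```python
from datetime import datetime

# Hypothetical table: the smallest value in each supported type bracket.
TYPE_MIN = {
    "number": float("-inf"),
    "string": "",
    "date": datetime.min,  # stand-in for the minimum Date value
}


def bracket_of(value):
    """Name the type bracket a value falls into (subset of BSON types)."""
    if isinstance(value, bool):
        raise TypeError("booleans are not covered by this sketch")
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "string"
    if isinstance(value, datetime):
        return "date"
    raise TypeError(f"unsupported type: {type(value).__name__}")


def bounds_for_lte(value):
    """Return (min_record, max_record) for a filter {_id: {$lte: value}}."""
    return TYPE_MIN[bracket_of(value)], value


string_bounds = bounds_for_lte("yadda")
date_bounds = bounds_for_lte(datetime(2021, 12, 7))
```

Under these assumptions, the TTL case becomes one instance of the general helper: a Date upper bound is paired with the minimum Date as minRecord, and a string upper bound is paired with the empty string.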
