[SERVER-62988] Investigate avoiding an extra document fetch on bounded collection scans Created: 26/Jan/22  Updated: 23/Jan/24

Status: Backlog
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Improvement
Priority: Major - P3
Reporter: Josef Ahmad
Assignee: Backlog - Storage Execution Team
Resolution: Unresolved
Votes: 0
Labels: clustered_collections, time-series
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
is duplicated by SERVER-67786 CollectionScan stage triggers redunda... Closed
Related
related to SERVER-85465 Improve RecordStore seekNear API In Progress
Assigned Teams:
Storage Execution
Participants:

 Description   

Cluster key lookups fetch an extra document compared to index scans, which walk the key and fetch only the documents that match the predicate. In the example below, a bounded collscan on a clustered collection fetches 2 documents to return 1, whereas an index scan on a non-clustered collection walks 1 key, fetches 1 document, and returns 1 document.

// Clustered collection
{"command":{"find":"c","filter":{"_id":"a"}},"planSummary":"COLLSCAN",
    "keysExamined":0,"docsExamined":2,"nreturned":1}
// Non-clustered collection
{"command":{"find":"c","filter":{"_id":"a"}},"planSummary":"IXSCAN { _id: 1 }",
    "keysExamined":1,"docsExamined":1,"nreturned":1}

While clustered collection queries generally outperform non-clustered collection queries as they avoid the index lookup, we should explore avoiding the extra document fetch as an additional performance improvement.
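
To make the counters in the log lines above concrete, here is a toy model in Python. It is only a sketch under an assumption the ticket does not confirm: that the bounded scan learns it has passed the upper bound only after fetching the next record. The names `bounded_collscan` and `id_index_lookup` are hypothetical and do not correspond to server code.

```python
from bisect import bisect_left


def bounded_collscan(records, lo, hi):
    """Toy bounded scan over records sorted by cluster key (assumed behavior).

    records: list of (cluster_key, doc) tuples sorted by cluster_key.
    """
    keys = [key for key, _ in records]
    pos = bisect_left(keys, lo)  # seek to the first record with key >= lo
    docs_examined = 0
    returned = []
    while pos < len(records):
        key, doc = records[pos]  # fetching a record counts as examining it
        docs_examined += 1
        if key > hi:
            break  # end of range is detected only after this extra fetch
        returned.append(doc)
        pos += 1
    return {"keysExamined": 0, "docsExamined": docs_examined,
            "nreturned": len(returned)}


def id_index_lookup(id_index, docs, key):
    """Toy point lookup through a unique _id index (assumed behavior).

    id_index: dict mapping _id value to a position in docs.
    """
    if key not in id_index:
        return {"keysExamined": 0, "docsExamined": 0, "nreturned": 0}
    _doc = docs[id_index[key]]  # one key walked, one document fetched
    return {"keysExamined": 1, "docsExamined": 1, "nreturned": 1}


docs = [{"_id": "a"}, {"_id": "b"}]
clustered = [(d["_id"], d) for d in docs]
id_index = {d["_id"]: i for i, d in enumerate(docs)}

collscan_stats = bounded_collscan(clustered, "a", "a")
ixscan_stats = id_index_lookup(id_index, docs, "a")
print(collscan_stats)
print(ixscan_stats)
```

In this model the extra fetch disappears only when the matching record is the last one in the collection, which suggests that any fix would need a way to stop at the bound without materializing the following record.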



 Comments   
Comment by Steve Tarzia [ 15/Nov/22 ]

joe.sack@mongodb.com any guidance on the priority of this?

Comment by Ana Meza [ 10/Nov/22 ]

As part of the TS perf work (PM-3050), the team also implemented the missing piece for PM-2556: Support clustered collections in multiplanner for all collections. We are therefore closing PM-2556 and sending this SERVER ticket back to the Triage queue.

Comment by Connie Chen [ 18/Apr/22 ]

Passing this along to Query as there is planned work for 7.0 around clustered collections for the Query Roadmap

 

Comment by Connie Chen [ 07/Feb/22 ]

We should make sure to include Query on the code review for this

Generated at Thu Feb 08 05:56:36 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.