[SERVER-81850] Use a more aggressive IDHACK for find/update/remove by _id Created: 04/Oct/23  Updated: 02/Feb/24

Status: In Progress
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Task Priority: Critical - P2
Reporter: Mathias Stearn Assignee: Colin Stolley
Resolution: Unresolved Votes: 0
Labels: perf-8.0, perf-tiger, perf-tiger-handoff, perf-tiger-poc, perf-tiger-q4, query-director-triage, query-perf-q4, risk
Σ Remaining Estimate: Not Specified Remaining Estimate: Not Specified
Σ Time Spent: Not Specified Time Spent: Not Specified
Σ Original Estimate: Not Specified Original Estimate: Not Specified

Issue Links:
Related
related to SERVER-82865 Lightweight collection acquisition fo... Backlog
Sub-Tasks:
Key
Summary
Type
Status
Assignee
SERVER-83758 Aggressive IDHACK for find Sub-task In Code Review Colin Stolley  
SERVER-83759 Aggressive IDHACK for update Sub-task In Progress Colin Stolley  
SERVER-83760 Aggressive IDHACK for delete Sub-task Open Colin Stolley  
Assigned Teams:
Query Execution
Sprint: QE 2023-11-27, QE 2023-12-11, QE 2023-12-25, QE 2024-01-08, QE 2024-01-22, QE 2024-02-05, QE 2024-02-19
Participants:

 Description   

Right now we have IDHACK as a stage, but it still involves going through the planner and working with a PlanExecutor, WorkingSet, and similar. There is at least a 10% win from making the optimization more aggressive and cutting over to dedicated C++ code that uses the SortedDataInterface and RecordStore APIs directly, and calling that rather than the stage machinery. You can see my POC patch for find in PERF-4696, but I don't think that is exactly the right path to take. I think it is worth making functions for find/update/remove by _id and using them throughout the codebase wherever those operations are needed.
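For illustration, a minimal sketch of what a dedicated find-by-_id function could look like. This is not the actual server code: std::map and std::vector stand in for the SortedDataInterface (_id index) and RecordStore, and the names here (findById, seekExact, findRecord) are simplified/hypothetical.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-ins for the storage-layer types.
using RecordId = size_t;
using Document = std::string;

struct IdIndex {                       // models SortedDataInterface (_id index)
    std::map<int, RecordId> entries;   // _id value -> RecordId
    std::optional<RecordId> seekExact(int id) const {
        auto it = entries.find(id);
        if (it == entries.end()) return std::nullopt;
        return it->second;
    }
};

struct RecordStore {                   // models RecordStore
    std::vector<Document> records;
    const Document* findRecord(RecordId rid) const {
        return rid < records.size() ? &records[rid] : nullptr;
    }
};

// The proposed fast path: one index seek plus one record fetch, with no
// planner, PlanExecutor, or WorkingSet in between.
std::optional<Document> findById(const IdIndex& idx, const RecordStore& rs,
                                 int id) {
    auto rid = idx.seekExact(id);
    if (!rid) return std::nullopt;
    const Document* doc = rs.findRecord(*rid);
    if (!doc) return std::nullopt;
    return *doc;
}
```

The point of the sketch is the shape of the call: two direct storage-layer operations instead of constructing and driving an execution plan.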

I also prototyped a similar change for update in the update command logic. It didn't show a measurable impact on sys-perf because that runs with w:majority, j:1, which adds a ton of extra overhead and noise (some of which I'm filing other tickets about). However, local testing with w:1, j:0 shows a 10% end-to-end improvement from using dedicated code for update by _id as well. The improvement should be even larger for internal update-by-_id codepaths such as oplog application and StorageInterface::upsertById.
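The update fast path follows the same shape: one _id-index seek to resolve the RecordId, then a single record write. The sketch below is hypothetical (updateById is not an existing API; std::map and std::vector again stand in for the index and record store).

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

using RecordId = size_t;
using Document = std::string;

// Hypothetical collection with a modeled _id index and record store.
struct Collection {
    std::map<int, RecordId> idIndex;  // models the _id index
    std::vector<Document> records;    // models the RecordStore

    // Returns true if a document with this _id was found and replaced.
    // A real implementation would fall back to the general update path
    // when the fast path doesn't apply.
    bool updateById(int id, const Document& newDoc) {
        auto it = idIndex.find(id);
        if (it == idIndex.end()) return false;  // no match: no-op here
        records[it->second] = newDoc;           // single record write
        return true;
    }
};
```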

I did not prototype remove, but I assume it would benefit from a similar treatment.

A productionized version would have a few enhancements:

  • Improving the detection of "simple _id query" to include things like {_id: {$eq: 7}}
  • Splitting {_id: 1, a: 2, b: 3} into a match on _id (which will be handled by the index) and everything else, applying the remaining filter (if any) to the resulting doc before deciding to return it.
    • I believe this is necessary in order to use the fast path for all oplog updates. We may be able to do most updates without this enhancement, but it will need to fall back to the slow path if there is a residual query.
  • This should also work for clustered collections where it can skip the _id index and just go directly to the record store.
  • In an ideal world, the RecordStore::Cursor type (or a subclass) would be modified to support update operations rather than having them as methods on RecordStore itself since that better matches the WiredTiger APIs. We could then use a single WT/RecordStore cursor both to fetch the document and to apply the update, which will save some lookup cost.
    • This should probably also happen in the UpdateStage logic, but it may be more complicated because we would need to remove calls to WorkingSet::fetch that naturally won't exist in the dedicated IDHACK code since it will just work directly with RecordIds.
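To make the query-splitting enhancement concrete, a hedged sketch of splitting a query into the _id lookup plus a residual filter applied to the fetched document. Field names and value types are deliberately simplified (int-valued equality predicates only); splitIdQuery and matchesResidual are hypothetical names.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>
#include <utility>

// Simplified: a query and a document are both flat maps of
// field name -> int, and every predicate is an equality match.
using Query = std::map<std::string, int>;
using Document = std::map<std::string, int>;

// Split {_id: 1, a: 2, b: 3} into the _id value (for the index seek)
// and the residual predicates {a: 2, b: 3}. Returns nullopt if the
// query has no equality on _id, i.e. the fast path doesn't apply.
std::optional<std::pair<int, Query>> splitIdQuery(const Query& q) {
    auto it = q.find("_id");
    if (it == q.end()) return std::nullopt;
    Query residual = q;
    residual.erase("_id");
    return std::make_pair(it->second, residual);
}

// Apply the residual filter to the document fetched via the _id index.
bool matchesResidual(const Document& doc, const Query& residual) {
    for (const auto& [field, value] : residual) {
        auto it = doc.find(field);
        if (it == doc.end() || it->second != value) return false;
    }
    return true;
}
```

Under this scheme the index answers the _id part and the residual is a cheap per-document check; when matchesResidual fails, the operation simply returns no match rather than consulting the planner.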


 Comments   
Comment by Xiaochen Wu [ 09/Oct/23 ]

FYI kyle.suarez@mongodb.com bernard.gorman@mongodb.com 

Comment by Mathias Stearn [ 04/Oct/23 ]

Let me know if you want me to split this into separate tickets for find/update/remove.

Generated at Thu Feb 08 06:47:35 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.