Type: Task
Resolution: Fixed
Priority: Major - P3
Affects Version/s: None
Component/s: None
Product Performance
Fully Compatible
Problem
Profiling of YCSB (100% read, in-cache) on perf-3-node-replSet.arm.aws.2024-05 shows that cursor wrapper lifecycle operations together account for ~1.39% of total CPU per request. Every YCSB read calls WiredTigerIndex::findLoc (7.13% cumulative), which allocates a full WiredTigerIdIndexCursor wrapper via newCursor() (13.50% of findLoc) and destroys it immediately afterwards (6.15%). Constructing the wrapper performs an operator new, deep-copies _uri and _indexName as std::string, and looks up the cached WT cursor, all for a single point lookup. In addition, seekExact internally uses a lower-bound + cur->next() B-tree traversal pattern, which is inefficient for a simple _id point lookup.
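The per-lookup cost pattern described above can be illustrated with a minimal, self-contained sketch. It does not use the real server or WiredTiger types: FakeTable, CursorWrapper, and FakeIndex are hypothetical stand-ins, and a std::map plays the role of the B-tree. It only shows the shape of the problem: each findLoc call heap-allocates a wrapper, copies two strings, performs a lower-bound seek, and then throws the wrapper away.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <memory>
#include <optional>
#include <string>

// Hypothetical stand-in for the index B-tree; not the WiredTiger API.
using FakeTable = std::map<std::string, std::int64_t>;

// Hypothetical wrapper mirroring the costs named in the ticket:
// it is heap-allocated per lookup and owns copies of two strings.
class CursorWrapper {
public:
    CursorWrapper(const FakeTable& table, const std::string& uri, const std::string& indexName)
        : _table(table), _uri(uri), _indexName(indexName) {}  // two std::string deep copies

    // Emulates a bound-based seek: position at the lower bound,
    // then check whether the positioned key equals the requested key.
    std::optional<std::int64_t> seekExact(const std::string& key) const {
        auto it = _table.lower_bound(key);
        if (it == _table.end() || it->first != key) {
            return std::nullopt;
        }
        return it->second;
    }

private:
    const FakeTable& _table;
    std::string _uri;
    std::string _indexName;
};

// Hypothetical index type; field names are illustrative only.
struct FakeIndex {
    FakeTable table;
    std::string uri;
    std::string indexName;

    std::unique_ptr<CursorWrapper> newCursor() const {
        return std::make_unique<CursorWrapper>(table, uri, indexName);  // operator new
    }

    // One allocation, two string copies, and one deallocation per point lookup.
    std::optional<std::int64_t> findLoc(const std::string& key) const {
        auto cursor = newCursor();
        assert(cursor != nullptr);
        return cursor->seekExact(key);  // wrapper is destroyed when cursor goes out of scope
    }
};
```

Calling findLoc on a FakeIndex populated with a few keys returns the stored value for a present key and std::nullopt for an absent one; the point of the sketch is the allocation and copy work surrounding that result, not the lookup itself.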
Solution
Override WiredTigerIdIndex::findLoc with a purpose-built implementation that bypasses the WiredTigerIndexCursorBase wrapper entirely. By obtaining a WiredTigerCursor directly from the session cache (no operator new/delete, no std::string copies), and replacing the bound-based seekExact with a single c->search() call, the read path executes the point lookup as a direct exact B-tree traversal. This aligns with the established idiom already used in _unindex, _insert, and dupKeyCheck, which all bypass WiredTigerIndexCursorBase and operate on raw WT cursors.
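A minimal sketch of the direct path, under the same kind of assumptions: RawCursor and findLocDirect are hypothetical names, a std::map stands in for the B-tree, and RawCursor::search only imitates the exact-match semantics of the c->search() call mentioned above. It is not the actual patch, just an illustration of a stack-allocated cursor doing a single exact search with no wrapper object and no string copies.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <optional>
#include <string>

// Hypothetical stand-in for the index B-tree; not the WiredTiger API.
using FakeTable = std::map<std::string, std::int64_t>;

// Hypothetical raw cursor, imagined as handed out by a session cache.
// It holds only a pointer and an iterator, so it can live on the stack.
class RawCursor {
public:
    explicit RawCursor(const FakeTable& table) : _table(&table), _pos(table.end()) {}

    // Exact-match search: succeeds only if the key itself is present.
    // Stand-in for the single c->search() call described in the ticket.
    bool search(const std::string& key) {
        _pos = _table->find(key);
        return _pos != _table->end();
    }

    // Value at the current position; only valid after a successful search().
    std::int64_t value() const {
        assert(_pos != _table->end());
        return _pos->second;
    }

private:
    const FakeTable* _table;
    FakeTable::const_iterator _pos;
};

// Point lookup without a heap-allocated wrapper or std::string copies.
std::optional<std::int64_t> findLocDirect(const FakeTable& table, const std::string& key) {
    RawCursor c(table);
    if (!c.search(key)) {
        return std::nullopt;
    }
    return c.value();
}
```

For any key, findLocDirect returns the same answer as a lower-bound seek followed by an equality check, but it asks the tree for an exact match directly and does no per-call allocation.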