- Type: Task
- Resolution: Won't Do
- Priority: Major - P3
- Affects Version/s: None
- Component/s: None
- Product Performance
Problem
On every OperationContext destruction, WiredTigerConnection::_releaseSession calls session->closeAllCursors("") before returning the session to the pool. This pushes every cursor in MongoDB's per-session _cursors list into WiredTiger's internal cursor cache. The next request that reuses the same session calls session->open_cursor(), which retrieves the cursor from WiredTiger's cache via __wt_cursor_cache_get and __curfile_reopen.
This close→cache→reopen cycle is pure overhead: the cursor was already reset (releaseCursor() calls cursor->reset() before placing it in MongoDB's list), is fully idle, and is ready for the next operation with no WiredTiger round trip.
CPU profiling of the `ycsb.in_cache.100read.2024-05` workload at 128 threads shows ~1.6% flat CPU attributable to this cycle across __curfile_close (0.21%), __curfile_cache (0.22%), __wt_cursor_cache_get (0.28%), __curfile_reopen (0.28%), and __curfile_reset (0.12%). In addition, 88.82% of cursor construction time goes through WiredTiger's __session_open_cursor rather than MongoDB's getCachedCursor(), which confirms that MongoDB's cache is emptied on every OpCtx destruction.
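The cost difference between the two release policies can be illustrated with a toy model. This is only a sketch under stated assumptions: CountingEngine, CountingSession, and simulate are hypothetical names invented for illustration, not MongoDB or WiredTiger code, and the model counts engine calls rather than measuring CPU.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for the storage engine: it only counts calls.
struct CountingEngine {
    std::size_t openCalls = 0;   // models session->open_cursor()
    std::size_t closeCalls = 0;  // models closing a cursor into the engine's cache
};

// Hypothetical stand-in for a pooled session with its own idle-cursor list.
class CountingSession {
public:
    explicit CountingSession(CountingEngine& engine) : _engine(engine) {}

    // Take a cursor for `uri`: reuse an idle one if present, otherwise ask the engine.
    void acquireCursor(const std::string& uri) {
        auto it = _idle.find(uri);
        if (it != _idle.end() && it->second > 0) {
            --it->second;  // served from the session's own list, no engine call
            return;
        }
        ++_engine.openCalls;
    }

    // Park an already-reset cursor in the session's idle list.
    void releaseCursor(const std::string& uri) { ++_idle[uri]; }

    // Models closeAllCursors(""): hand every idle cursor back to the engine.
    void closeAllCursors() {
        for (auto& [uri, count] : _idle) {
            _engine.closeCalls += count;
            count = 0;
        }
    }

private:
    CountingEngine& _engine;
    std::unordered_map<std::string, std::size_t> _idle;
};

// Run `requests` request cycles on one reused session and return the call counts.
CountingEngine simulate(std::size_t requests, bool closeOnRelease) {
    CountingEngine engine;
    CountingSession session(engine);
    for (std::size_t i = 0; i < requests; ++i) {
        session.acquireCursor("table:usertable");
        session.releaseCursor("table:usertable");
        if (closeOnRelease) {
            session.closeAllCursors();  // the per-request sweep described above
        }
    }
    return engine;
}
```

In this model, closing on every release costs one engine open and one engine close per request, while keeping the idle list costs a single open for the whole run.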
Solution
Remove the session->closeAllCursors("") and invariant(session->cachedCursors() == 0) calls from WiredTigerConnection::_releaseSession. MongoDB's per-session _cursors list then survives session pool round trips: the next request that reuses the same session hits getCachedCursor() directly, with no WiredTiger API call.
To preserve correct DDL semantics, add a new method, WiredTigerConnection::closePooledCursorsForUri(uri), which holds _cacheLock and calls session->closeAllCursors(uri) on every idle pooled session. It is called before every exclusive DDL operation (dropIdent, dropIdentForImport, verifyTable, _salvageIfNeeded, recoverOrphanedIdent) so that parked cursors release their session_ref counts on the target data handle before WiredTiger's exclusive-lock protocol begins.
The DDL sweep is O(pool_size × cursors_per_session), which is negligible relative to the cost of the DDL operations themselves. compact and alter do not require exclusive dhandle access and need no sweep.
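The shape of the proposed sweep might look roughly like the following. It is a minimal sketch, not the actual patch: ToyCursor, ToySession, and ToyConnection are hypothetical stand-ins, and only the method names closeAllCursors, cachedCursors, closePooledCursorsForUri and the _cacheLock member are taken from the description above. The treatment of an empty URI as "close everything" is assumed from the closeAllCursors("") call in the Problem section.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <memory>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical idle cursor; only its URI matters for the sweep.
struct ToyCursor {
    std::string uri;
};

// Hypothetical session holding a per-session list of parked cursors.
class ToySession {
public:
    void releaseCursor(std::string uri) { _cursors.push_back(ToyCursor{std::move(uri)}); }

    // Close parked cursors on `uri`; an empty uri closes all of them (assumed semantics).
    std::size_t closeAllCursors(const std::string& uri) {
        std::size_t closed = 0;
        for (auto it = _cursors.begin(); it != _cursors.end();) {
            if (uri.empty() || it->uri == uri) {
                it = _cursors.erase(it);
                ++closed;
            } else {
                ++it;
            }
        }
        return closed;
    }

    std::size_t cachedCursors() const { return _cursors.size(); }

private:
    std::list<ToyCursor> _cursors;
};

// Hypothetical connection with a mutex-guarded pool of idle sessions.
class ToyConnection {
public:
    std::unique_ptr<ToySession> getSession() {
        std::lock_guard<std::mutex> lk(_cacheLock);
        if (_idleSessions.empty()) {
            return std::make_unique<ToySession>();
        }
        auto session = std::move(_idleSessions.back());
        _idleSessions.pop_back();
        return session;
    }

    // Return a session to the pool without closing its parked cursors.
    void releaseSession(std::unique_ptr<ToySession> session) {
        assert(session);
        std::lock_guard<std::mutex> lk(_cacheLock);
        _idleSessions.push_back(std::move(session));
    }

    // The DDL sweep: close every parked cursor on `uri` across all idle sessions.
    std::size_t closePooledCursorsForUri(const std::string& uri) {
        std::lock_guard<std::mutex> lk(_cacheLock);
        std::size_t closed = 0;
        for (auto& session : _idleSessions) {
            closed += session->closeAllCursors(uri);
        }
        return closed;
    }

private:
    std::mutex _cacheLock;
    std::vector<std::unique_ptr<ToySession>> _idleSessions;
};
```

A caller in this sketch would invoke closePooledCursorsForUri(uri) immediately before the exclusive operation on that URI. Cursors parked on other URIs stay in their sessions and remain reusable afterwards.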