[SERVER-39295] Use readOnce: true for $sample cursors Created: 31/Jan/19 Updated: 06/Dec/22 |
|
| Status: | Blocked |
| Project: | Core Server |
| Component/s: | Aggregation Framework |
| Affects Version/s: | 4.1.7 |
| Fix Version/s: | None |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Luke Prochazka | Assignee: | Backlog - Query Execution |
| Resolution: | Unresolved | Votes: | 1 |
| Labels: | qexec-team |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: | |
| Assigned Teams: | Query Execution |
| Sprint: | Query 2020-03-23 |
| Participants: | |
| Description |
|
This is a feature request to enhance the behaviour of the $sample aggregation stage by having the plan optimizer apply the WiredTiger “readOnce: true” option to the MongoDB cursors it opens ( The intent is that $sample does not (or is less likely to) cache the result set. A sample is, by definition, unlikely to be reused by subsequent samples, so caching it has no benefit and only adds unwanted cache pressure and workload contention. |
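For readers unfamiliar with the stage under discussion, the following is a minimal sketch of how a $sample pipeline is expressed from a client. The helper name `sample_pipeline` and the collection name `coll` are hypothetical and used only for illustration. The request in this ticket concerns how the server opens the underlying storage cursor, which the pipeline itself does not express.

```python
def sample_pipeline(size):
    """Build an aggregation pipeline that asks for `size` random documents.

    `sample_pipeline` is a hypothetical helper for illustration only.
    """
    if size <= 0:
        raise ValueError("size must be a positive integer")
    return [{"$sample": {"size": size}}]


pipeline = sample_pipeline(100)

# With a PyMongo collection (hypothetical name `coll`), this would run as:
#   docs = list(coll.aggregate(pipeline))
```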
| Comments |
| Comment by Ruoxin Xu [ 25/Mar/20 ] |
|
Back to open. Requires performance investigation for readOnce cursors (see |
| Comment by Eric Milkie [ 24/Mar/20 ] |
|
Just for clarification, the readOnce flag does not keep the pages it reads out of the in-memory cache. Instead, it adjusts the score on such pages so that they are evicted from cache sooner than the typical LRU algorithm would evict them. Because $sample and random cursors potentially read multiple pages per cursor advance, it's not clear that the benefit of evicting such pages sooner would be a good enough tradeoff for the performance hit such cursor reads would experience. |
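The distinction drawn above (pages are still cached, but become earlier eviction candidates) can be illustrated with a toy model. This is a sketch under strong assumptions: `ToyPageCache` and `hot_pages_left` are hypothetical names, the cache is a plain LRU queue, and a read-once page is simply placed at the front of the eviction queue. It is not WiredTiger's eviction algorithm and it does not model the cost to the scanning cursor.

```python
from collections import OrderedDict


class ToyPageCache:
    """Toy LRU page cache. NOT WiredTiger's actual eviction logic."""

    def __init__(self, capacity):
        self.capacity = capacity
        # The first item in the OrderedDict is the next eviction candidate.
        self.pages = OrderedDict()

    def read(self, page_id, read_once=False):
        """Read a page, returning True on a cache hit."""
        hit = page_id in self.pages
        if hit:
            if not read_once:
                # A normal read refreshes the page's recency.
                self.pages.move_to_end(page_id)
        else:
            if len(self.pages) >= self.capacity:
                self.pages.popitem(last=False)  # evict the front of the queue
            self.pages[page_id] = True
            if read_once:
                # Assumption: a read-once page is still cached, but it is
                # queued to be evicted before everything else.
                self.pages.move_to_end(page_id, last=False)
        return hit


def hot_pages_left(read_once):
    """Warm four hot pages, scan ten cold pages, count surviving hot pages."""
    cache = ToyPageCache(capacity=4)
    hot = [0, 1, 2, 3]
    for page in hot:
        cache.read(page)
    for page in range(100, 110):
        cache.read(page, read_once=read_once)
    return sum(page in cache.pages for page in hot)


print(hot_pages_left(read_once=False))  # 0: the scan displaces every hot page
print(hot_pages_left(read_once=True))   # 3: scan pages mostly evict each other
```

In this toy model the read-once scan looks like a clear win for other readers. The comments in this ticket report that the real cache did not behave this simply, and that the scanning cursor itself could become much slower, which the sketch does not capture.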
| Comment by Eric Milkie [ 24/Mar/20 ] |
|
Unfortunately, after some discussion with Execution and Storage Engines team members, I believe we'll have to abandon this change to use readOnce. The feature is too unstable for us to be comfortable that it will improve performance, and in some cases we have seen that it greatly hurts performance for the scanning cursor itself with no corresponding performance increase for other readers and writers. The WiredTiger storage engine cache management system is very complex, so it is hard to anticipate what the repercussions of using readOnce cursors will be for all the workloads we are interested in. |