- Type: Improvement
- Resolution: Done
- Priority: Major - P3
- Affects Version/s: None
- Component/s: Aggregation Framework, Storage, WiredTiger
- None
- Fully Compatible
- Quint Iteration 7
The initial implementation of $sample will perform a collection scan if it is the first stage in the pipeline. This could be dramatically improved if we exposed a way for storage engines to provide a random cursor that does something more efficient, such as a random walk on a B-tree, to return pseudo-random results.
This ticket tracks only the storage engine API and the WiredTiger implementation of getRandomCursor(). Work on integrating this into the aggregation pipeline will be tracked in SERVER-19182.
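For illustration, the random-walk idea can be sketched on a toy in-memory tree. This is a minimal sketch and not the WiredTiger or server implementation: the `Node` class and the `build_tree`, `random_walk`, and `sample` helpers are hypothetical names invented for this example, and a real B-tree would also carry keys and handle concurrent modification.

```python
import random
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Toy tree node (hypothetical): internal nodes hold children, leaves hold records."""
    children: List["Node"] = field(default_factory=list)
    records: List[dict] = field(default_factory=list)


def build_tree(num_records: int, fanout: int = 4, leaf_size: int = 4) -> Node:
    """Pack records into leaves, then group nodes bottom-up until one root remains."""
    level = [
        Node(records=[{"_id": i} for i in range(start, min(start + leaf_size, num_records))])
        for start in range(0, num_records, leaf_size)
    ]
    if not level:
        return Node()  # empty tree: a single leaf with no records
    while len(level) > 1:
        level = [Node(children=level[i:i + fanout]) for i in range(0, len(level), fanout)]
    return level[0]


def random_walk(root: Node, rng: random.Random) -> Optional[dict]:
    """Descend from the root, choosing a random child at each level, then a random record."""
    node = root
    while node.children:
        node = rng.choice(node.children)
    if not node.records:
        return None
    return rng.choice(node.records)


def sample(root: Node, n: int, seed: Optional[int] = None) -> List[dict]:
    """Return up to n records drawn by independent random walks (duplicates possible)."""
    rng = random.Random(seed)
    picked = (random_walk(root, rng) for _ in range(n))
    return [rec for rec in picked if rec is not None]


if __name__ == "__main__":
    tree = build_tree(100)
    print(sample(tree, 5, seed=42))
```

Each walk touches one node per tree level, so drawing n records costs roughly n times the tree height rather than a pass over the whole collection. In this sketch, records in sparsely filled nodes are more likely to be chosen than records in full ones, so the result is not uniformly distributed; that is one way to read "pseudo-random" above.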
- depends on
  - WT-2032 WT_CURSOR.next with random configuration and insert-list only trees (Closed)
- is depended on by
  - SERVER-17688 Add wiredTiger support to return more than one cursor for parallelCollectionScan (Closed)
  - SERVER-19182 Integrate storage engine optimizations into $sample stage (Closed)
- is related to
  - SERVER-533 Aggregation stage to randomly sample documents (Closed)