[SERVER-20639] Stage $sample requires sort option allowDiskUse:true Created: 25/Sep/15  Updated: 09/Jul/16  Resolved: 25/Sep/15

Status: Closed
Project: Core Server
Component/s: Aggregation Framework
Affects Version/s: 3.1.8
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Asya Kamsky Assignee: Unassigned
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
Backwards Compatibility: Fully Compatible
Operating System: ALL
Participants:

 Description   

I just tried to run

{ aggregate: "bigarray", pipeline: [ { $sample: { size: 1000 } } ], cursor: {} }

and got error:

Sort exceeded memory limit of 104857600 bytes, but did not opt in to external sorting. Aborting operation. Pass allowDiskUse:true to opt in.

It's not clear why aggregation sorting is involved in $sample. If that's expected, it needs to be prominently documented; if $sample shouldn't be using a sort, then this is a bug.
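For reference, the error text itself names the opt-in. A minimal sketch of the same command document with allowDiskUse added might look like the following; the variable name `command` is only illustrative, and actually sending it (for example with db.runCommand in the mongo shell) assumes a connection to a server that holds the "bigarray" collection.

```javascript
// Same command as above, with the opt-in the error message asks for.
// `command` is a hypothetical variable name used only for this sketch.
const command = {
  aggregate: "bigarray",
  pipeline: [{ $sample: { size: 1000 } }],
  cursor: {},
  // Lets the sort spill to disk instead of aborting when it
  // exceeds the in-memory limit (104857600 bytes in the error above).
  allowDiskUse: true,
};

// In the mongo shell this would be sent as: db.runCommand(command)
console.log(JSON.stringify(command));
```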



 Comments   
Comment by Charlie Swanson [ 25/Sep/15 ]

This is the expected behavior. To get n random documents, we do a collection scan, sort by an injected random value (which is later removed), and then select the top n. I'll close this as 'Works as Designed', but this does need better documentation, and the ticket has been marked accordingly.
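The approach described in this comment can be sketched in plain JavaScript over an in-memory array. This is only an illustration of the idea (inject a random key, sort, take the top n, drop the key), not the server's actual implementation; the function name `sampleBySort` and the sample documents are assumptions made for the example.

```javascript
// Hypothetical illustration of "sort by an injected random value, take the top n".
// Not MongoDB's code; it only mirrors the steps described in the comment.
function sampleBySort(docs, n) {
  // 1. "Collection scan": visit every document and attach a random sort key.
  const tagged = docs.map((doc) => ({ doc, key: Math.random() }));

  // 2. Sort the whole set by the injected key. This step needs memory
  //    proportional to the full input, which is why a sort limit can be hit.
  tagged.sort((a, b) => a.key - b.key);

  // 3. Select the top n and remove the injected key again.
  return tagged.slice(0, n).map((t) => t.doc);
}

// Usage: pick 3 of 10 made-up documents.
const docs = Array.from({ length: 10 }, (_, i) => ({ _id: i }));
const picked = sampleBySort(docs, 3);
console.log(picked.length);
```

Because the whole input is sorted before anything is returned, the cost depends on the collection size rather than on n, which is consistent with the memory-limit error reported above.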
