- Type: Bug
- Resolution: Won't Do
- Priority: Major - P3
- Affects Version/s: None
- Component/s: Distributed Query Execution
- ALL
- QE 2022-09-19, QE 2022-11-14
According to https://www.mongodb.com/docs/manual/reference/operator/aggregation/sample/
$sample behaves differently depending on the parameters passed: it uses either a random sort or a random cursor.
However, the behavior described in the docs applies only when $sample is executed on a mongod. Executing $sample through mongos against a sharded collection does not behave as documented, because:
- as a shard server, each mongod knows only the number of documents it stores itself
- mongos sends $sample, with the size entered by the user, to each shard server
This is where the problem arises. Execute $sample from mongos with a sample size that is under 5% of the total number of documents in the sharded collection. The random cursor method is expected, but each shard server compares the requested size against its own, smaller document count, so in practice the random sort method is used to do the sample on the shard servers.
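
The mismatch can be illustrated with a small arithmetic sketch. It only paraphrases the conditions listed in the docs (collection has more than 100 documents, sample size is less than 5% of the documents, and $sample is assumed to be the first pipeline stage); it is not the server's actual code. The function name, the shard counts, and the sample size are hypothetical values chosen for illustration.

```python
def expects_random_cursor(sample_size: int, doc_count: int) -> bool:
    """Paraphrase of the documented conditions for the random cursor method.

    Assumes $sample is the first stage of the pipeline (not modeled here).
    """
    return doc_count > 100 and sample_size < 0.05 * doc_count


# Hypothetical cluster: three shards holding 1000 documents each.
shard_counts = [1000, 1000, 1000]
sample_size = 100  # about 3.3% of the 3000 documents in the whole collection

# What a user expects, judging by the total count of the sharded collection.
cluster_view = expects_random_cursor(sample_size, sum(shard_counts))

# What each shard decides, since it only sees its own count and receives
# the same sample size from mongos (100 is 10% of 1000).
shard_views = [expects_random_cursor(sample_size, n) for n in shard_counts]

print(cluster_view)  # True
print(shard_views)   # [False, False, False]
```

With these numbers, the size is under 5% of the whole collection but 10% of each shard's documents, so every shard falls back to the random sort method.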