[SERVER-68623] Executing $sample from mongos on sharded collection has unexpected behavior Created: 08/Aug/22  Updated: 28/Sep/22  Resolved: 28/Sep/22

Status: Closed
Project: Core Server
Component/s: Distributed Query Execution
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: beat jean Assignee: Denis Grebennicov
Resolution: Won't Do Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Gantt Dependency
has to be done after DOCS-15635 Clarify $sample on sharded clusters. Backlog
Operating System: ALL
Sprint: QE 2022-09-19, QE 2022-11-14
Participants:

 Description   

According to https://www.mongodb.com/docs/manual/reference/operator/aggregation/sample/

$sample will behave differently depending on the parameters passed – random sort or random cursor. 

But the behavior described in this docs only applies to executing $sample from mongod. Executing $sample from mongos to a sharded collection does not behave as described in the docs. Because: 

  • as a shard svr, mongod only knows the number of documents it stores
  • mongos sends the $sample with size entered by the user to each shard svr

Here comes the problem. Execute $sample from mongos, and the sample size is 5% of the total number of documents in a sharded collection. It is expected to use the random cursor method, but in fact, the random sort method will be used to do the sample on the shard svr.

 



 Comments   
Comment by Sebastien Mendez [ 28/Sep/22 ]

Hi beatjean1314@gmail.com

We confirm that there is no bug to fix, so we're closing this SERVER ticket.
However we have created a ticket to clarify the documentation (DOCS-15635), feel free to follow it.

Regards,
Seb.

Comment by Denis Grebennicov [ 19/Sep/22 ]

You are right beatjean1314@gmail.com. I guess one should adjust the documentation saying that the decision making process (which sampling strategy to pick) is done based on local information of every corresponding shard.

Comment by beat jean [ 18/Sep/22 ]

Yes, I mean for mongos, numberOfRecords is 151 and sample size is 5. According to the docs https://www.mongodb.com/docs/manual/reference/operator/aggregation/sample/, all three conditions are met, then random cursor should be used for sampling, but shard B still uses random sorting for sampling.

Comment by Denis Grebennicov [ 14/Sep/22 ]

I just wrote a sample test to exercise different setups and explain outputs for $sample stage in sharded cluster for sharded collections.

Prior to executing $sample stage on every shard, one will check for the conditions for the random cursor. The numberOfRecords will be used based on the number of records stored on that particular mongod instance and not the total number of records on all shards. Meaning that if we have a 151 documents, where 101 are stored on shard A and 50 are stored on shard B, then shard A will perform sampling through the help of random cursor, while shard B will read all documents from the collection and perform random sorting.

Regardless of the work performed on the shard, the merged step will perform the random sorting of the results from both shards and taking only (limit) the number of documents specified in the $sample.size.

When the collection is not sharded, then $sample stage behaves as it would be in a regular replica-set/standalone setup. 

Does this answer the quesion chris.kelly@mongodb.com  beatjean1314@gmail.com?

Here you can see the explain output:

{
  "serverInfo" : {
    "host" : "ip-10-122-5-229",
    "port" : 20049,
    "version" : "6.2.0-alpha",
    "gitVersion" : "unknown"
  },
  "serverParameters" : {
    "internalQueryFacetBufferSizeBytes" : 104857600,
    "internalQueryFacetMaxOutputDocSizeBytes" : 104857600,
    "internalLookupStageIntermediateDocumentMaxSizeBytes" : 104857600,
    "internalDocumentSourceGroupMaxMemoryBytes" : 104857600,
    "internalQueryMaxBlockingSortMemoryUsageBytes" : 104857600,
    "internalQueryProhibitBlockingMergeOnMongoS" : 0,
    "internalQueryMaxAddToSetBytes" : 104857600,
    "internalDocumentSourceSetWindowFieldsMaxMemoryBytes" : 104857600
  },
  "mergeType" : "mongos",
  "splitPipeline" : {
    "shardsPart" : [
      {
      "$sample" : {
        "size" : NumberLong(5)
      }
    },
      {
      "$limit" : NumberLong(5)
    }
    ],
    "mergerPart" : [
      {
      "$mergeCursors" : {
        "lsid" : {
          "id" : UUID("8bec1e97-75d4-41e9-88d8-06b291aa1ad7"),
          "uid" : BinData(0,"47DEQpj8HBSa+/TImW+5JCeuQeRkm5NMpJWZG3hSuFU=")
        },
        "sort" : {
          "$rand" : -1
        },
        "compareWholeSortKey" : false,
        "tailableMode" : "normal",
        "nss" : "test.foo",
        "allowPartialResults" : false,
        "recordRemoteOpWaitTime" : false
      }
    },
      {
      "$limit" : NumberLong(5)
    }
    ]
  },
  "shards" : {
    "sample_testing-rs0" : {
      "host" : "ip-10-122-5-229:20040",
      "stages" : [
        {
        "$cursor" : {
          "queryPlanner" : {
            "namespace" : "test.foo",
            "indexFilterSet" : false,
            "maxIndexedOrSolutionsReached" : false,
            "maxIndexedAndSolutionsReached" : false,
            "maxScansToExplodeReached" : false,
            "winningPlan" : {
              "stage" : "TRIAL",
              "inputStage" : {
                "stage" : "SHARDING_FILTER",
                "inputStage" : {
                  "stage" : "COLLSCAN",
                  "direction" : "forward"
                }
              }
            },
            "rejectedPlans" : [ ]
          }
        }
      },
        {
        "$sample" : {
          "size" : NumberLong(5)
        }
      },
        {
        "$limit" : NumberLong(5)
      }
      ]
    },
    "sample_testing-rs1" : {
      "host" : "ip-10-122-5-229:20043",
      "stages" : [
        {
        "$cursor" : {
          "queryPlanner" : {
            "namespace" : "test.foo",
            "indexFilterSet" : false,
            "maxIndexedOrSolutionsReached" : false,
            "maxIndexedAndSolutionsReached" : false,
            "maxScansToExplodeReached" : false,
            "winningPlan" : {
              "stage" : "TRIAL",
              "inputStage" : {
                "stage" : "OR",
                "inputStages" : [
                  {
                  "stage" : "QUEUED_DATA"
                },
                  {
                  "stage" : "SHARDING_FILTER",
                  "inputStage" : {
                    "stage" : "MULTI_ITERATOR"
                  }
                }
                ]
              }
            },
            "rejectedPlans" : [ ]
          }
        }
      },
        {
        "$sampleFromRandomCursor" : {
          "size" : NumberLong(5)
        }
      },
        {
        "$limit" : NumberLong(5)
      }
      ]
    }
  },
  "command" : {
    "aggregate" : "foo",
    "pipeline" : [
      {
      "$sample" : {
        "size" : 5
      }
    }
    ],
    "cursor" : {    }
  },
  "ok" : 1,
  "$clusterTime" : {
    "clusterTime" : Timestamp(1663155376, 176),
    "signature" : {
      "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
      "keyId" : NumberLong(0)
    }
  },
  "operationTime" : Timestamp(1663155376, 174)
}

Generated at Thu Feb 08 06:11:18 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.