Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-22573

$sample should stream results when possible

    XMLWordPrintable

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Duplicate
    • Affects Version/s: 3.2.1
    • Fix Version/s: None
    • Component/s: Aggregation Framework
    • Labels:
      None

      Description

      Please confirm that the `$sample` aggregation operator streams results to the client immediately whenever possible. If not, consider this a request for improvement.

      For example:

      MongoDB Enterprise > db.serverStatus().storageEngine
      { "name" : "wiredTiger", "supportsCommittedReads" : true }
      MongoDB Enterprise > db.serverBuildInfo().version
      3.2.1
      MongoDB Enterprise > db.users.count()
      10000
      MongoDB Enterprise > db.users.aggregate([{$sample:{size: 5}}])
      

      Here the sampling implementation will use a WiredTiger random cursor to obtain each document. I wish to confirm that the first sample document identified is available to the client before the full sample set is computed.

      Please confirm for each of the sampling mechanism implemented in MongoDB 3.2 (with and without a query predicate), and also sharded cluster behavior.

      Use case: MongoDB Compass. We want to permit users to specify large sample sizes in Compass, then progressively display the results to users. Ideally Compass can begin to display an inferred schema after a very small number of documents are received from the server.

        Attachments

          Issue Links

            Activity

              People

              • Votes:
                0 Vote for this issue
                Watchers:
                8 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved: