Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-22573

$sample should stream results when possible

    XMLWordPrintableJSON

Details

    • Icon: Improvement Improvement
    • Resolution: Duplicate
    • Icon: Major - P3 Major - P3
    • None
    • 3.2.1
    • Aggregation Framework
    • None

    Description

      Please confirm that the `$sample` aggregation operator streams results to the client immediately whenever possible. If not, consider this a request for improvement.

      For example:

      MongoDB Enterprise > db.serverStatus().storageEngine
      { "name" : "wiredTiger", "supportsCommittedReads" : true }
      MongoDB Enterprise > db.serverBuildInfo().version
      3.2.1
      MongoDB Enterprise > db.users.count()
      10000
      MongoDB Enterprise > db.users.aggregate([{$sample:{size: 5}}])
      

      Here the sampling implementation will use a WiredTiger random cursor to obtain each document. I wish to confirm that the first sample document identified is available to the client before the full sample set is computed.

      Please confirm for each of the sampling mechanism implemented in MongoDB 3.2 (with and without a query predicate), and also sharded cluster behavior.

      Use case: MongoDB Compass. We want to permit users to specify large sample sizes in Compass, then progressively display the results to users. Ideally Compass can begin to display an inferred schema after a very small number of documents are received from the server.

      Attachments

        Activity

          People

            charlie.swanson@mongodb.com Charlie Swanson
            matt.kangas Matt Kangas
            Votes:
            0 Vote for this issue
            Watchers:
            8 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: