Documentation / DOCS-15802

[SERVER] Add detail to $sample page

    • Type: Task
    • Resolution: Unresolved
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: manual, Server
    • Labels:

      Two points are probably worth clarifying on the $sample page:

      • The 5% threshold is not configurable. It is believed to approximate the cutoff at which scanning the entire collection becomes faster than performing that many random I/Os.
      • If the requested sample size is not under the 5% threshold, it's worth saying that $sample performs a top-k sort (where k = sample size) on a generated random value. This top-k sort can spill to disk if the k documents together exceed 100MB, so allowDiskUse may need to be set.
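
To make the second point concrete for the page, here is a minimal Python sketch of the idea, not the server's implementation. The names `sample_top_k` and `uses_random_cursor` are hypothetical, the threshold check covers only the 5% rule mentioned above (the server may apply other conditions this sketch ignores), and an in-memory heap stands in for the sort, so nothing spills to disk here.

```python
import heapq
import random


def uses_random_cursor(sample_size, collection_count, threshold=0.05):
    """Hypothetical helper: True when the sample is under the 5% threshold.

    Only the fixed 5% rule from the ticket is modelled here.
    """
    return sample_size < threshold * collection_count


def sample_top_k(docs, k, rng=None):
    """Pick k documents by a top-k sort on a generated random value."""
    rng = rng or random.Random()
    # Tag every document with a random key. The index breaks ties so that
    # the documents themselves are never compared.
    keyed = ((rng.random(), i, doc) for i, doc in enumerate(docs))
    # nlargest scans the whole input once and keeps only the k best entries.
    top = heapq.nlargest(k, keyed)
    return [doc for _, _, doc in top]


docs = [{"_id": i} for i in range(1000)]
if uses_random_cursor(100, len(docs)):
    print("under threshold: random I/O path")
else:
    picked = sample_top_k(docs, 100, random.Random(0))
    print(len(picked), "documents picked by top-k sort")
```

In the sketch only k entries are held at a time, which mirrors why the memory cost depends on the size of the k documents rather than on the collection. With a driver such as PyMongo, the corresponding option would be passed as `allowDiskUse=True` to `aggregate()`.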

            kanchana.sekhar@mongodb.com Kanchana Sekhar
            charlie.swanson@mongodb.com Charlie Swanson

              1 year, 7 weeks, 1 day ago