• Type: Task
    • Resolution: Fixed
    • Priority: Major - P3
    • 7.0.0-rc0, 7.1.0-rc0
    • Affects Version/s: None
    • Component/s: None
    • Labels:
      None
    • Fully Compatible
    • v7.0
    • QI 2023-04-17

      T-digest's trade-off between accuracy and performance can be tuned through the compaction factor, the size of the merge buffer, and the details of the scaling function. We do not plan to expose any of these parameters to customers; instead we will tune them to reach a general "sweet spot". However, we might still introduce tuning knobs for them if they prove helpful for testing or if no definitive sweet spot exists.
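      For reference, a minimal sketch of how the compaction factor and the scaling function interact. It assumes that the k0/k1/k2 variants named in the benchmarks below correspond to the scaling functions of the same names in the t-digest paper; the function names, `kPi`, and the `fitsK1` helper are hypothetical and not taken from the server code.

```cpp
#include <cassert>
#include <cmath>

// Scaling functions as described in the t-digest paper. That the benchmarked
// k0/k1/k2 variants match these formulas exactly is an assumption.
// q is a quantile, delta is the compaction factor, n is the number of inputs.

constexpr double kPi = 3.14159265358979323846;

// k0: linear scale, every centroid may cover the same share of the data.
double k0(double q, double delta) {
    return delta / 2.0 * q;
}

// k1: arcsine scale, centroids get smaller towards q = 0 and q = 1.
double k1(double q, double delta) {
    return delta / (2.0 * kPi) * std::asin(2.0 * q - 1.0);
}

// k2: log-odds scale, normalized by the input count n.
double k2(double q, double delta, double n) {
    assert(q > 0.0 && q < 1.0);  // log(q / (1 - q)) diverges at the ends
    const double norm = 4.0 * std::log(n / delta) + 24.0;
    return delta / norm * std::log(q / (1.0 - q));
}

// Hypothetical helper: a centroid spanning the quantile range
// [qLeft, qRight] is within the size limit if the scale value grows
// by at most 1 across it.
bool fitsK1(double qLeft, double qRight, double delta) {
    return k1(qRight, delta) - k1(qLeft, delta) <= 1.0;
}
```

      Under these assumptions, with delta = 1000 a centroid covering 0.1% of the data fits in the middle of the distribution (`fitsK1(0.5, 0.501, 1000.0)`) but not at the tail (`fitsK1(0.0, 0.001, 1000.0)`), which is how the scaling function buys accuracy for extreme percentiles. A larger delta allows more centroids, giving better accuracy at a higher cost.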

      Micro-benchmarks from the initial implementation of t-digest (the non-expr tests use 1e6 inputs and the expr tests use 100 inputs, both drawn from a normal distribution):

       

      ------------------------------------------------------------------------------------------------------
      Benchmark                                                            Time             CPU   Iterations
      ------------------------------------------------------------------------------------------------------
      PercentileAlgoBenchmarkFixture/tdigest_k0_delta1000          627636909 ns    627617190 ns            1
      PercentileAlgoBenchmarkFixture/tdigest_k1_delta1000          682104826 ns    682094139 ns            1
      PercentileAlgoBenchmarkFixture/tdigest_k2_delta500           643034220 ns    643014710 ns            1
      PercentileAlgoBenchmarkFixture/tdigest_k2_delta1000          646875381 ns    646855993 ns            1
      PercentileAlgoBenchmarkFixture/tdigest_k2_delta5000          742438555 ns    742403578 ns            1
      PercentileAlgoBenchmarkFixture/tdigest_k2_delta1000_sorted   167359114 ns    167354307 ns            4
      PercentileAlgoBenchmarkFixture/tdigest_k2_delta1000_batched  638900042 ns    638880589 ns            1
      PercentileAlgoBenchmarkFixture/tdigest_expr_99_100                1684 ns         1684 ns       414402
      PercentileAlgoBenchmarkFixture/tdigest_expr_01_100                1507 ns         1507 ns       464590
      PercentileAlgoBenchmarkFixture/tdigest_expr_01_1000              33418 ns        33416 ns        20924
      PercentileAlgoBenchmarkFixture/sortAndRank_expr_100                632 ns          632 ns      1109422
      PercentileAlgoBenchmarkFixture/sortAndRank_expr_1000             22419 ns        22418 ns        31093

       

      Note: a $group with $avg and a null group key over a collection of 1e7 small documents takes ~4500 ms in SBE and ~7200 ms in classic. So even major differences in the runtime of t-digest itself are unlikely to affect query latency significantly. For the expressions, however, it might make sense not to use t-digest at all, since sortAndRank is faster in the benchmarks above for both 100 and 1000 inputs.
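      As an illustration of the non-t-digest alternative for expressions, here is a minimal sketch of an exact percentile computed by sorting and picking a rank. The function name `sortAndRankPercentile` and the nearest-rank convention (ceil(p * n), clamped to the valid range) are assumptions made for the example; the ticket does not say how the benchmarked sortAndRank computes its rank.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Exact percentile: sort a copy of the inputs and return the element at the
// nearest rank. p is in [0, 1]. The rank convention is an assumption.
double sortAndRankPercentile(std::vector<double> values, double p) {
    assert(!values.empty() && p >= 0.0 && p <= 1.0);
    std::sort(values.begin(), values.end());

    const std::size_t n = values.size();
    // 1-based nearest rank, converted to a 0-based index and clamped.
    const auto rank = static_cast<std::size_t>(std::ceil(p * static_cast<double>(n)));
    const std::size_t index = rank == 0 ? 0 : rank - 1;
    return values[std::min(index, n - 1)];
}
```

      For small inputs such as the 100-element arrays in the expr benchmarks, the O(n log n) sort is cheap and the result is exact, which is consistent with sortAndRank beating t-digest in those measurements.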

            Assignee:
            irina.yatsenko@mongodb.com Irina Yatsenko (Inactive)
            Reporter:
            irina.yatsenko@mongodb.com Irina Yatsenko (Inactive)