Implement partitioning by top-K frequency difference in MaxDiff algorithm


    • Type: Task
    • Resolution: Fixed
    • Priority: Major - P3
    • 8.1.0-rc0
    • Affects Version/s: None
    • Component/s: None
    • Query Optimization
    • Fully Compatible
    • 3

      Experiment results show that kAreaDiff is not always the best partitioning strategy, particularly when the variance in frequencies is low (in the extreme case, every value is unique).

      Currently, our MaxDiff implementation partitions buckets by selecting the top-K area differences (kAreaDiff). Similarly, introduce a variant, kFreqDiff, that selects the top-K frequency differences.
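
To make the difference between the two strategies concrete, here is a minimal Python sketch of top-K boundary selection. It is not the server implementation: it assumes the textbook MaxDiff definition (area = frequency × spread, where spread is the distance to the next distinct value), and the names `Strategy` and `maxdiff_boundaries`, the spread of 1 for the last value, and the tie-breaking rule are all hypothetical choices made for illustration.

```python
from collections import Counter
from enum import Enum


class Strategy(Enum):
    # Names mirror the ticket; the enum itself is a hypothetical stand-in.
    AREA_DIFF = "kAreaDiff"
    FREQ_DIFF = "kFreqDiff"


def maxdiff_boundaries(values, num_buckets, strategy):
    """Return indices into the sorted distinct values at which a bucket ends."""
    counts = sorted(Counter(values).items())  # [(value, frequency), ...]
    if len(counts) <= num_buckets:
        # Enough buckets to give every distinct value its own bucket.
        return list(range(len(counts)))

    metric = []
    for i, (value, freq) in enumerate(counts):
        if strategy is Strategy.FREQ_DIFF:
            metric.append(freq)
        else:
            # Assumption: the last value has no successor, so use spread 1.
            spread = counts[i + 1][0] - value if i + 1 < len(counts) else 1
            metric.append(freq * spread)

    # Difference in the metric between each value and its successor.
    diffs = [(abs(metric[i + 1] - metric[i]), i) for i in range(len(metric) - 1)]

    # Keep the K = num_buckets - 1 largest differences.
    # Assumption: ties are broken in favor of the lower index.
    top = sorted(diffs, key=lambda d: (-d[0], d[1]))[: num_buckets - 1]

    ends = sorted(i for _, i in top)
    ends.append(len(counts) - 1)  # the final bucket always ends at the last value
    return ends


# Distinct values 1, 2, 3, 10, 11 with frequencies 4, 1, 1, 1, 5.
sample = [1] * 4 + [2, 3, 10] + [11] * 5
print(maxdiff_boundaries(sample, 3, Strategy.FREQ_DIFF))
print(maxdiff_boundaries(sample, 3, Strategy.AREA_DIFF))
```

On this sample the two strategies place boundaries differently: the frequency-based variant splits around the two high-frequency values, while the area-based variant is dominated by the gap between 3 and 10. When every value is unique, all frequency differences are zero, so in this sketch the chosen boundaries depend entirely on the tie-breaking rule.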

            Assignee: Matt Olma
            Reporter: Chi-I Huang