Enable use of index uniqueness metadata and revisit join performance testing configs


    • Type: Improvement
    • Resolution: Unresolved
    • Priority: Minor - P4
    • Affects Version/s: None
    • Component/s: None
    • Query Optimization

      The join optimizer can use metadata from unique indexes to determine whether join fields hold unique data. If they do, NDV (number of distinct values) estimation can short-circuit and return the collection cardinality. This behavior is guarded by a query knob that is disabled by default.
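      The short-circuit described above can be sketched roughly as follows. This is a minimal illustration in Python, not the optimizer's actual code: `IndexInfo`, `CollectionStats`, `estimate_ndv`, and the `use_uniqueness_metadata` flag are hypothetical names, and the sketch ignores complications such as partial or sparse unique indexes and array-valued fields.

```python
from dataclasses import dataclass, field


@dataclass
class IndexInfo:
    """Hypothetical description of one index on a collection."""
    fields: tuple          # indexed field names, in key order
    unique: bool = False   # True if the index enforces uniqueness


@dataclass
class CollectionStats:
    """Hypothetical per-collection statistics seen by the estimator."""
    cardinality: int                              # number of documents
    indexes: list = field(default_factory=list)   # list of IndexInfo


def has_unique_index_on(stats, join_fields):
    # A unique index whose key is a non-empty subset of the join fields
    # implies that every combination of join-field values occurs at most once.
    wanted = set(join_fields)
    return any(
        ix.unique and ix.fields and set(ix.fields) <= wanted
        for ix in stats.indexes
    )


def estimate_ndv(stats, join_fields, fallback_estimator,
                 use_uniqueness_metadata=False):
    # The flag stands in for the query knob and is off by default,
    # matching the current behavior described in this ticket.
    if use_uniqueness_metadata and has_unique_index_on(stats, join_fields):
        # Unique data: the number of distinct values is the collection size.
        return stats.cardinality
    # Otherwise run the underlying NDV estimation algorithm.
    return fallback_estimator(join_fields)
```

      With the flag off, `fallback_estimator` always runs, which is what keeps the underlying estimation algorithm covered in tests that use unique indexes.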

      We disabled it by default out of concern that the short-circuiting would obscure the behavior of the underlying NDV estimation algorithm in many of our performance tests, since they make use of unique indexes. We didn't want to lose coverage of that important part of the code, since we cannot assume that joining fields in the wild will always have unique indexes (regardless of whether the underlying data is unique).

      However, we want to enable this behavior before release because we expect it to be beneficial. This ticket covers enabling the feature and revisiting our testing strategy so that we have coverage with the knob both on and off. For example, this could be a variant that runs with the behavior disabled, or a new TPC-H task that runs without unique indexes.
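      To make the coverage reasoning concrete, the small Python sketch below enumerates the four combinations of knob state and unique-index presence and marks which ones still run the underlying NDV estimation algorithm. The function names are hypothetical and the model is deliberately simplified: it assumes the short-circuit fires whenever the knob is on and the join fields have a unique index.

```python
from itertools import product


def exercises_ndv_estimator(knob_enabled, has_unique_indexes):
    # The estimator is skipped only when the knob is on AND the join
    # fields are covered by a unique index (the short-circuit case).
    return not (knob_enabled and has_unique_indexes)


def coverage_matrix():
    # Map (knob_enabled, has_unique_indexes) -> estimator still runs?
    return {
        (knob, unique): exercises_ndv_estimator(knob, unique)
        for knob, unique in product((True, False), repeat=2)
    }


if __name__ == "__main__":
    for (knob, unique), covered in sorted(coverage_matrix().items()):
        print(f"knob={knob!s:5} unique_indexes={unique!s:5} "
              f"estimator_runs={covered}")
```

      Under these assumptions, either proposed option (a variant with the knob off, or a task without unique indexes) lands in a cell where the estimator still runs, while the default configuration with the knob on exercises the short-circuit.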

            Assignee:
            Unassigned
            Reporter:
            Hana Pearlman
            Votes:
            0
            Watchers:
            3

              Created:
              Updated: