Simple strategy to estimate unique RID count for index-based CE (without row counts)


    • Type: Improvement
    • Resolution: Done
    • Priority: Major - P3
    • None
    • Affects Version/s: None
    • Component/s: None
    • Query Optimization
    • None
    • 3

      First, determine whether the number of keys in an index can be read or computed without WT-7408; SPM-4163 may deliver this. If not, implement a simple strategy to estimate the average number of index keys per document. For example, we could read index keys from the start cursor until we see a duplicate RID, and use the number of keys read before that duplicate to guess the fraction of duplicate RIDs in the range.
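
      A minimal sketch of the "scan until the first duplicate RID" idea, for discussion only. It assumes the index range can be iterated as (index_key, rid) pairs in index order and that an estimate of the number of keys in the range is already available. The function names, the `max_scan` cap, and the fraction formula are hypothetical choices for illustration, not an existing API or the chosen design.

```python
from typing import Hashable, Iterable, Tuple


def estimate_unique_rid_fraction(
    entries: Iterable[Tuple[Hashable, Hashable]], max_scan: int = 1000
) -> float:
    """Scan (index_key, rid) pairs until a RID repeats.

    Returns distinct_rids / entries_scanned at the first duplicate,
    or 1.0 if no duplicate is seen within max_scan entries.
    """
    seen = set()
    scanned = 0
    for _key, rid in entries:
        if scanned >= max_scan:
            break  # hypothetical cap so the probe stays cheap
        scanned += 1
        if rid in seen:
            # First duplicate: `scanned` entries produced len(seen) distinct RIDs.
            return len(seen) / scanned
        seen.add(rid)
    return 1.0  # no duplicate observed (or the range was empty)


def estimate_unique_rids(
    entries: Iterable[Tuple[Hashable, Hashable]],
    keys_in_range: float,
    max_scan: int = 1000,
) -> float:
    """Scale an (assumed known) key count by the sampled unique-RID fraction."""
    return keys_in_range * estimate_unique_rid_fraction(entries, max_scan)


# Toy range: RID "a" appears twice, so the scan stops at the fourth entry.
sample = [(1, "a"), (2, "b"), (3, "c"), (4, "a"), (5, "d")]
print(estimate_unique_rid_fraction(sample))
print(estimate_unique_rids(sample, keys_in_range=8))
```

      Stopping at the first duplicate keeps the probe cheap, but the resulting fraction depends heavily on where in the range the scan starts, which is one of the trade-offs the accuracy tests would need to expose.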

       

      This should include accuracy tests demonstrating the pros and cons of whichever approach we choose. For example, relying on the average number of index keys per document can produce poor results when most documents have many index keys and only a small number have just a few.
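
      To make the skew concern concrete, here is a small synthetic sketch. The data set (90 documents with 50 keys each, 10 documents with one key each), the key values, and the helper names are all made up for illustration; the estimator simply divides the number of keys in a range by the collection-wide average keys per document.

```python
from typing import List, Tuple

Entry = Tuple[int, int]  # (index_key, rid)


def avg_keys_per_doc(index: List[Entry]) -> float:
    """Collection-wide average number of index keys per document."""
    distinct_rids = {rid for _key, rid in index}
    return len(index) / len(distinct_rids)


def estimate_by_average(index: List[Entry], lo: int, hi: int) -> float:
    """Estimate unique RIDs in [lo, hi) as keys_in_range / average keys per doc."""
    keys_in_range = sum(1 for key, _rid in index if lo <= key < hi)
    return keys_in_range / avg_keys_per_doc(index)


def true_unique_rids(index: List[Entry], lo: int, hi: int) -> int:
    """Exact number of distinct RIDs with at least one key in [lo, hi)."""
    return len({rid for key, rid in index if lo <= key < hi})


# Synthetic skewed index: RIDs 0..89 each have keys 1000..1049,
# RIDs 90..99 each have a single key in 0..9.
index: List[Entry] = [(1000 + j, rid) for rid in range(90) for j in range(50)]
index += [(rid - 90, rid) for rid in range(90, 100)]
index.sort()

for lo, hi in [(0, 10), (1000, 1050)]:
    print((lo, hi), estimate_by_average(index, lo, hi), true_unique_rids(index, lo, hi))
```

      In this toy data the average is dominated by the many-key documents, so a range that only touches the single-key documents is underestimated by a large factor, while a range over the heavily shared keys lands much closer to the true count.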

              Assignee:
              Hana Pearlman
              Reporter:
              Hana Pearlman
              Votes:
              0
              Watchers:
              1
