Simple strategy to estimate unique RID count for index-based CE (without row counts)


    • Type: Improvement
    • Resolution: Unresolved
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: None
    • Query Optimization
    • 3

      We need to determine whether the number of keys in an index can be accessed or computed without WT-7408; SPM-4163 may deliver this. If not, implement a simple strategy to estimate the average number of index keys per document. For example, we could read index keys beginning at the start cursor until we see a duplicate RID, and use the number of keys read up to that point to guess the fraction of duplicate RIDs in the range.
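The "read until the first duplicate RID" idea could be prototyped roughly as follows. This is a minimal sketch of one possible interpretation, not an agreed design: the function names, the `max_scan` cap, and the assumption that the caller already has an estimate of the number of index keys in the range (`keys_in_range`) are all hypothetical and do not come from the server code.

```python
from typing import Hashable, Iterable, Tuple


def estimate_unique_rid_fraction(
    entries: Iterable[Tuple[object, Hashable]], max_scan: int = 1000
) -> float:
    """Guess the fraction of index keys in a range that carry a distinct RID.

    `entries` yields (index_key, rid) pairs in index order, beginning at the
    start of the range. Scanning stops at the first repeated RID or after
    `max_scan` entries (a hypothetical cap to bound the cost).
    """
    seen = set()
    scanned = 0
    for _key, rid in entries:
        if scanned >= max_scan:
            break
        scanned += 1
        if rid in seen:
            # The first `scanned` entries held `scanned - 1` distinct RIDs.
            return (scanned - 1) / scanned
        seen.add(rid)
    # No duplicate observed within the scanned prefix: assume all unique.
    return 1.0


def estimate_unique_rids(
    entries: Iterable[Tuple[object, Hashable]],
    keys_in_range: float,
    max_scan: int = 1000,
) -> float:
    """Scale an (assumed known) key count for the range by the guessed fraction."""
    return keys_in_range * estimate_unique_rid_fraction(entries, max_scan)


# Example: the third entry repeats RID "a", so the guess is 2/3 unique.
sample = [(1, "a"), (2, "b"), (3, "a"), (4, "c")]
fraction = estimate_unique_rid_fraction(sample)
```

Extrapolating from a single short prefix is very noisy, which is part of what the accuracy tests would need to quantify.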

       

      This should include accuracy tests demonstrating the pros and cons of whichever approach we choose. For example, relying on the average number of index keys per document can produce poor estimates when most documents have many index keys and only a small number have few.
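The skew problem described above can be reproduced with a small synthetic harness. This is an illustrative sketch only: the in-memory "index", the data distribution, and the helper names are assumptions made for the example, and the harness uses the true document count to compute the average, which the real estimator may not have.

```python
def build_index(docs):
    """Build a sorted list of (value, rid) entries from {rid: [values]}.

    Each distinct value in a document's array yields one index key,
    mimicking a multikey index.
    """
    return sorted((v, rid) for rid, values in docs.items() for v in set(values))


def true_unique_rids(index, lo, hi):
    """Exact number of distinct RIDs with a key in [lo, hi]."""
    return len({rid for v, rid in index if lo <= v <= hi})


def avg_keys_estimate(index, num_docs, lo, hi):
    """Estimate distinct RIDs as keys_in_range / average keys per document."""
    keys_in_range = sum(1 for v, _ in index if lo <= v <= hi)
    avg_keys_per_doc = len(index) / num_docs
    return keys_in_range / avg_keys_per_doc


# Skewed data: 90 documents with 10 keys each (values 0..9),
# and 10 documents with a single key each (values 190..199).
docs = {rid: list(range(10)) for rid in range(90)}
docs.update({rid: [100 + rid] for rid in range(90, 100)})
index = build_index(docs)

# Range covering only the single-key documents.
sparse_true = true_unique_rids(index, 190, 199)
sparse_est = avg_keys_estimate(index, len(docs), 190, 199)

# Range covering only the many-key documents.
dense_true = true_unique_rids(index, 0, 9)
dense_est = avg_keys_estimate(index, len(docs), 0, 9)
```

With this data the average is 9.1 keys per document. The estimate for the dense range is about 98.9 against a true count of 90, but for the sparse range it is about 1.1 against a true count of 10, roughly a ninefold underestimate.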

            Assignee:
            Unassigned
            Reporter:
            Hana Pearlman
