Improve bloom filter sizing

    • Type: Task
    • Resolution: Done
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: None

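      When an LSM merge creates a bloom filter, we don't currently account for duplicate items across the chunks being merged when sizing it. As a result the filter can be sized for more items than the merged chunk will actually contain. Cardinality estimation algorithms can estimate the number of distinct (and therefore duplicate) items, so it's worth investigating whether they can be used to size the filter more accurately.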

      See:
      http://www.datastax.com/dev/blog/improving-compaction-in-cassandra-with-cardinality-estimation
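
A rough sketch of the idea, not a proposed implementation: keep a small HyperLogLog sketch per chunk, merge the sketches of the merge inputs, and size the new bloom filter from the estimated distinct count instead of the summed item count. Everything here is an assumption made for illustration: the choice of HyperLogLog, the names `HyperLogLog`, `estimate_merged_items` and `bloom_bits`, the 64-bit BLAKE2b hash, and sizing by a target false positive rate using the standard formula m = -n ln(p) / (ln 2)^2.

```python
import hashlib
import math


class HyperLogLog:
    """Minimal HyperLogLog sketch with 2**p registers (illustrative only)."""

    def __init__(self, p=12):
        self.p = p
        self.m = 1 << p
        self.registers = [0] * self.m

    def add(self, key: bytes) -> None:
        h = int.from_bytes(hashlib.blake2b(key, digest_size=8).digest(), "big")
        idx = h >> (64 - self.p)                      # top p bits pick a register
        rest = h & ((1 << (64 - self.p)) - 1)         # remaining 64 - p bits
        rank = (64 - self.p) - rest.bit_length() + 1  # position of the first 1-bit
        if rank > self.registers[idx]:
            self.registers[idx] = rank

    def merge(self, other: "HyperLogLog") -> None:
        # Register-wise max yields the sketch of the union of both key sets.
        self.registers = [max(a, b) for a, b in zip(self.registers, other.registers)]

    def estimate(self) -> float:
        m = self.m
        alpha = 0.7213 / (1 + 1.079 / m)              # bias constant, valid for m >= 128
        raw = alpha * m * m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * m and zeros:
            return m * math.log(m / zeros)            # small-range (linear counting) correction
        return raw


def estimate_merged_items(chunk_sketches) -> float:
    """Estimate the distinct item count of a merge from its per-chunk sketches."""
    merged = HyperLogLog(chunk_sketches[0].p)
    for sketch in chunk_sketches:
        merged.merge(sketch)
    return merged.estimate()


def bloom_bits(n_items: float, fp_rate: float) -> int:
    """Bits needed for a bloom filter holding n_items at the target false positive rate."""
    return math.ceil(-n_items * math.log(fp_rate) / (math.log(2) ** 2))


# Two hypothetical input chunks of 10,000 items each that share half their keys.
chunk_a, chunk_b = HyperLogLog(), HyperLogLog()
for k in range(0, 10000):
    chunk_a.add(str(k).encode())
for k in range(5000, 15000):
    chunk_b.add(str(k).encode())

naive_bits = bloom_bits(20000, 0.01)                  # sized from the summed item counts
estimated_bits = bloom_bits(estimate_merged_items([chunk_a, chunk_b]), 0.01)
print(naive_bits, estimated_bits)
```

With 4096 registers the estimate has a standard error of roughly 1.6%, so a real implementation would probably want a small safety margin on top of the estimate so the filter isn't undersized.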

              Assignee:
              [DO NOT USE] Backlog - Storage Execution Team
              Reporter:
              Alexander Gorrod
              Votes:
              0
              Watchers:
              1

                Created:
                Updated:
                Resolved: