Improve bloom filter sizing

    • Type: Task
    • Resolution: Done
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: None

      When sizing the bloom filter created during an LSM merge, we don't currently account for duplicate items across the chunks being merged, so the filter may be sized for more items than the merged chunk actually holds. Cardinality-estimation algorithms can estimate the number of distinct items, which would give a tighter sizing input. It's worth investigating.

      See:
      http://www.datastax.com/dev/blog/improving-compaction-in-cassandra-with-cardinality-estimation
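
      A minimal sketch of the idea, not a proposed implementation: each input chunk keeps a small HyperLogLog sketch of its keys, the sketches are merged at merge time, and the merged estimate replaces the summed item count when sizing the bloom filter. The names `HyperLogLog`, `sketch_chunk` and `bloom_bits` are hypothetical, and the sizing formula is the textbook one for a target false-positive rate, assumed here rather than taken from the existing code.

```python
import hashlib
import math


class HyperLogLog:
    """Tiny HyperLogLog sketch (illustrative, not tuned for production)."""

    def __init__(self, precision=12):
        self.p = precision
        self.m = 1 << precision           # number of registers
        self.registers = [0] * self.m

    def add(self, key: bytes) -> None:
        # 64-bit hash taken from SHA-1; any well-mixed hash would do.
        h = int.from_bytes(hashlib.sha1(key).digest()[:8], "big")
        index = h >> (64 - self.p)                 # top p bits pick a register
        rest = h & ((1 << (64 - self.p)) - 1)      # remaining 64 - p bits
        # Rank = 1-based position of the leftmost 1 bit in `rest`.
        rank = (64 - self.p) - rest.bit_length() + 1
        if rank > self.registers[index]:
            self.registers[index] = rank

    def merge(self, other: "HyperLogLog") -> None:
        # The sketch of a union is the register-wise maximum.
        assert self.p == other.p, "sketches must use the same precision"
        self.registers = [max(a, b) for a, b in zip(self.registers, other.registers)]

    def estimate(self) -> float:
        alpha = 0.7213 / (1 + 1.079 / self.m)      # bias constant, valid for m >= 128
        raw = alpha * self.m * self.m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:
            # Small-range correction (linear counting).
            return self.m * math.log(self.m / zeros)
        return raw


def sketch_chunk(keys) -> HyperLogLog:
    """Build a sketch over one chunk's keys (hypothetical helper)."""
    hll = HyperLogLog()
    for key in keys:
        hll.add(key)
    return hll


def bloom_bits(n_items: float, fp_rate: float = 0.01) -> int:
    """Bits needed for n_items at the given false-positive rate."""
    return math.ceil(-n_items * math.log(fp_rate) / (math.log(2) ** 2))


if __name__ == "__main__":
    # Two chunks of 10,000 keys each that share 5,000 keys.
    chunk_a = sketch_chunk(str(i).encode() for i in range(10000))
    chunk_b = sketch_chunk(str(i).encode() for i in range(5000, 15000))
    chunk_a.merge(chunk_b)

    naive_count = 20000                  # summed counts, duplicates included
    distinct_estimate = chunk_a.estimate()
    print("bits sized from summed count:     ", bloom_bits(naive_count))
    print("bits sized from distinct estimate:", bloom_bits(distinct_estimate))
```

      With 4096 registers the estimate carries a standard error of roughly 1.6%, so a real change would probably want some headroom on top of the estimate before sizing the filter.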

            Assignee:
            [DO NOT USE] Backlog - Storage Execution Team
            Reporter:
            Alexander Gorrod
            Votes:
            0
            Watchers:
            1

              Created:
              Updated:
              Resolved: