Uploaded image for project: 'WiredTiger'
  1. WiredTiger
  2. WT-9971

Add non-uniform (pareto) key distribution to test/format

    • Type: Icon: Improvement Improvement
    • Resolution: Fixed
    • Priority: Icon: Major - P3 Major - P3
    • WT11.2.0, 7.0.0-rc0
    • Affects Version/s: None
    • Component/s: None
    • Labels:
    • 8
    • 2023-04-18 Leviosa Not Leviosa

      Currently test/format updates, reads, deletes keys according to a random uniform distribution.  That is, if the number of rows is a million, each of 0 < N < 1000000 has an equal chance.  However, having hot keys, and warm keys can be helpful in stressing the system.  And having different access patterns for reads/writes can be helpful.

      One simple approach would be to use a pareto distribution, which could have 0 be the hottest key, 1 be the second hottest, etc. with 999999 having the smallest chance of being used.  However, it's sometimes better to have the hot key's location (and position relative to other hot keys) somewhat randomly selected.  So perhaps another mode that does something like (for the million case): take 0-999999 and randomize those numbers into a million buckets b[0] through b[9999999], each with a unique value.  Then when choosing a key, pick a pareto number in the 0-1000000 range (chances are it's a small number).  Then for that number p, look in b[p] for the key number to be used.  This has the effect of b[0] being the hottest key, b[1] next, etc. but those keys being spread around.

      So we'd need test/format config values for reads and writes and the various distribution options.  Maybe read_dist=0 means reads use uniform key distribution, read_dist=N (with N positive) uses the pareto value that says "how hot?" or read_dist=-N uses pareto, but with randomized positioning of hot keys.

      See bench/wtperf for an example of pareto for key choice.  Maybe the pareto code belongs in the test/utility library?

      This idea is inspired by a discussion of WT-9870 with keith.smith@mongodb.com and sue.loverso@mongodb.com , we thought that pounding on hot keys might expose bugs we haven't seen.

            Assignee:
            luke.pearson@mongodb.com Luke Pearson
            Reporter:
            donald.anderson@mongodb.com Donald Anderson
            Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

              Created:
              Updated:
              Resolved: