-
Type: Improvement
-
Resolution: Won't Do
-
Priority: Major - P3
-
Affects Version/s: None
-
Component/s: None
A typical workload for MongoDB (or really any database) has:
- one set of threads inserting at one end of the tree, growing the tree as keys get ever larger
- another set of threads reading, mostly from the latest inserted keys
Workgen and wtperf cannot currently do this. They can do the first: inserting at the end of the key range is the default for inserting threads. For the second, workgen and wtperf both offer a choice of uniform or pareto distribution, but with pareto the clustering (the hot keys) is at the low end. That is, with pareto, key ids below 100 are much more likely than key ids between 100 and 200, and so on.
To get a more interesting workload, we'd like a pareto option that favors the highest keys. Given that we've inserted N keys, the internal pareto function currently returns a number k in the range [0, N), with 0 favored. With the new option turned on, we can use the same pareto function but use N - k as the key id instead of k. This maps [0, N) onto (0, N], so the most recently inserted keys become the most likely ones.
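The reversal can be sketched in a few lines of Python. This is only an illustration of the idea, not workgen's code: `skewed_low` is a hypothetical stand-in for the internal pareto function (built on the standard library's `random.paretovariate`), and the `alpha` value and the scaling by `n / 10` are arbitrary assumptions chosen to make the skew visible.

```python
import random

def skewed_low(n, rng, alpha=1.16):
    """Hypothetical stand-in for the internal pareto function.

    Returns k in [0, n), with values near 0 strongly favored.
    """
    # paretovariate() returns x >= 1, so x - 1 is >= 0 and clustered near 0.
    x = rng.paretovariate(alpha) - 1.0
    # Scale into the key range; the modulo folds the long tail back into [0, n).
    return int(x * n / 10.0) % n

def skewed_high(n, rng, alpha=1.16):
    """Reversed variant proposed in this ticket: returns N - k.

    The result lies in (0, n], with values near n strongly favored.
    """
    return n - skewed_low(n, rng, alpha)

# Small demonstration: count how often the top and bottom 10% of keys are hit.
rng = random.Random(42)
n = 1000
samples = [skewed_high(n, rng) for _ in range(10000)]
top = sum(1 for s in samples if s > n - 100)
bottom = sum(1 for s in samples if s <= 100)
```

In this sketch `top` comes out far larger than `bottom`, which is the "read mostly the latest inserted keys" pattern described above. Note that N - k yields ids in 1..N; if key ids were zero-based, N - 1 - k would be the corresponding mapping.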
This ticket is to implement this in workgen.