[SERVER-29219] Test and consider exposing different compression engines for MongoDB users Created: 15/May/17 Updated: 22/Nov/18 Resolved: 20/Nov/18 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Storage |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Alexander Gorrod | Assignee: | Brian Lane |
| Resolution: | Done | Votes: | 0 |
| Labels: | nonnyc, storage-engines | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Description |
|
The WiredTiger storage engine supports several compression engines that are not exposed via MongoDB. It would be interesting to know whether there is value in exposing additional compression engines. It would also be very valuable to create a test that measures the compression ratio versus CPU usage of different compression engines for a few interesting MongoDB workloads. The particular compression engines that might be interesting are LZ4 and Zstandard. A full set of compression libraries supported by the WiredTiger team is here: |
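As a rough illustration of the kind of measurement described above, here is a minimal Python sketch that reports compression ratio and CPU time for a payload. It is not the WiredTiger or MongoDB test harness. The names `Result`, `measure`, and `compare` are hypothetical, and zlib at two levels stands in for the engines under discussion because it ships with the Python standard library. LZ4, Zstandard, and snappy would need third-party bindings and are deliberately left out.

```python
import time
import zlib
from typing import Callable, Dict, NamedTuple


class Result(NamedTuple):
    ratio: float        # uncompressed bytes / compressed bytes
    cpu_seconds: float  # process CPU time spent inside the compressor


def measure(compress: Callable[[bytes], bytes], payload: bytes) -> Result:
    """Compress the payload once and record the size ratio and CPU time."""
    start = time.process_time()
    compressed = compress(payload)
    elapsed = time.process_time() - start
    return Result(len(payload) / len(compressed), elapsed)


def compare(payload: bytes) -> Dict[str, Result]:
    """Run every stand-in engine against the same payload."""
    # Assumption: zlib levels 1 and 9 are placeholders for "fast" and
    # "dense" engines; they are not the LZ4 or Zstandard libraries.
    engines: Dict[str, Callable[[bytes], bytes]] = {
        "none": lambda data: data,
        "zlib-1": lambda data: zlib.compress(data, 1),
        "zlib-9": lambda data: zlib.compress(data, 9),
    }
    return {name: measure(fn, payload) for name, fn in engines.items()}


if __name__ == "__main__":
    # Synthetic, highly repetitive JSON-like data; real data sets will differ.
    sample = b'{"user": "alice", "status": "active", "score": 42}\n' * 20000
    for name, result in compare(sample).items():
        print(f"{name:8s} ratio={result.ratio:8.2f} cpu={result.cpu_seconds:.4f}s")
```

A single pass over synthetic data only shows the shape of the comparison. Meaningful numbers would need representative MongoDB data sets and repeated runs.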
| Comments |
| Comment by Brian Lane [ 18/Sep/18 ] | ||||||||||
|
alexander.gorrod I think your proposed set of test suites looks like a good place to start, and we can always expand on this depending on what the results look like. Thanks. | ||||||||||
| Comment by Alexander Gorrod [ 11/Sep/18 ] | ||||||||||
|
I've been thinking about which workloads would be suitable for making this decision. I think the data sets used in the blog post about compression shortly after WiredTiger integration are probably a good list of data sets to measure compression ratios. The other relevant metric is CPU overhead of compression - we've seen in the past that

I'm going to suggest that the following suite of tests be used to decide if there is enough incremental benefit to a different compression scheme to warrant adding it as an option for MongoDB users:

Compression ratio tests: Use mongoimport to load a dataset. These tests should be run using the zlib, snappy, none, zstd and lz4 compression libraries.

CPU/performance tests: We should run the same set of compression libraries against these YCSB phases: load; 100% read; 95% read with 5% update; 100% update; 50% read with 50% update. Each uses 5 million 1 KB documents, and each workload executes for 20 million operations.

brian.lane@mongodb.com and asya: do you think the above set of results would deliver enough information to decide whether to support new compression libraries for MongoDB? | ||||||||||
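To make the size of the proposed CPU/performance suite concrete, the following Python sketch enumerates one run per compression library and YCSB phase. It assumes the phase list reads as five phases (load, 100% read, 95% read with 5% update, 100% update, 50% read with 50% update) and maps them onto the YCSB core workload properties `recordcount`, `operationcount`, `readproportion`, and `updateproportion`. The `build_matrix` helper and the dictionary layout are hypothetical, and leaving `operationcount` off the load phase is also an assumption, since a load inserts `recordcount` documents.

```python
from itertools import product
from typing import Dict, List

COMPRESSORS = ["zlib", "snappy", "none", "zstd", "lz4"]

# (phase name, read proportion, update proportion).
# The load phase only inserts documents, so both proportions are zero.
PHASES = [
    ("load", 0.0, 0.0),
    ("100% read", 1.0, 0.0),
    ("95% read, 5% update", 0.95, 0.05),
    ("100% update", 0.0, 1.0),
    ("50% read, 50% update", 0.5, 0.5),
]

RECORD_COUNT = 5_000_000      # 5 million documents of about 1 KB each
OPERATION_COUNT = 20_000_000  # operations per run phase


def build_matrix() -> List[Dict[str, object]]:
    """Return one entry per (compressor, phase) combination."""
    runs: List[Dict[str, object]] = []
    for compressor, (phase, read, update) in product(COMPRESSORS, PHASES):
        properties: Dict[str, object] = {"recordcount": RECORD_COUNT}
        if phase != "load":
            # Assumption: only run phases carry an operation count and mix.
            properties.update(
                operationcount=OPERATION_COUNT,
                readproportion=read,
                updateproportion=update,
            )
        runs.append(
            {"compressor": compressor, "phase": phase, "properties": properties}
        )
    return runs


if __name__ == "__main__":
    for run in build_matrix():
        print(run["compressor"], "|", run["phase"], "|", run["properties"])
```

Under these assumptions the suite comes to 25 runs, five phases for each of the five compression settings.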
| Comment by Asya Kamsky [ 19/May/17 ] | ||||||||||
|
We are definitely interested in doing this. Next step will be to scope and schedule this. | ||||||||||
| Comment by Nick Judson [ 15/May/17 ] | ||||||||||
|
Yes please. I've found LZ4 to be the best for my app. |