[SERVER-32595] LZ4 compression in MongoDB
Created: 09/Jan/18
Updated: 09/Jan/19
Resolved: 09/Jan/19

Status: Closed
Project: Core Server
Component/s: WiredTiger
Affects Version/s: None
Fix Version/s: None

Type: New Feature
Priority: Major - P3
Reporter: Takeshi Yoshimura
Assignee: Brian Lane
Resolution: Won't Fix
Votes: 2
Labels: storage-engines
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
related to SERVER-31555 support zstd for network compression Closed
related to SERVER-17741 LZ4 compressor for mongod Closed
related to WT-1751 Add LZ4 compression to WiredTiger sup... Closed
Participants:

 Description   

WiredTiger added LZ4 support more than two years ago, but MongoDB does not enable it. Why not? LZ4 has a better compression ratio and better speed than snappy.

I confirmed that my experimental lz4 patch works on YCSB benchmarks as well as the other compression codecs do. In my experiments, changing the journal and block compressors in WiredTiger from snappy to lz4 reduced the storage-size compression ratio from 28% to 15%. I used the YCSB fork at https://github.com/mongodb-labs/YCSB with compressibility=10.



 Comments   
Comment by Brian Lane [ 09/Jan/19 ]

Hi t.yoshimura,

For 4.2 we plan to expose Zstandard as a compressor option in mongod. The code changes are already present in master on GitHub if you want to pull the source and experiment with it. While WT does support LZ4, we have decided to expose Zstandard instead of LZ4 for now.

In my tests with different datasets, I did see that LZ4 was the fastest, but Zstandard was able to achieve better compression ratios with minimal CPU overhead. Different workloads could produce different results, but Zstandard did give better compression and was not much slower than LZ4.

I will close this issue as Won't Fix, but I invite you to take a look at Zstandard and let me know how your results with it compare to those from your patch.

Cheers!

-Brian

Comment by Ramon Fernandez Marina [ 13/Jan/18 ]

Thanks for your report t.yoshimura. We've sent this request to the Storage team for consideration, and we'll post updates on this ticket as they happen.

Regards,
Ramón.

Comment by Michael Cahill (Inactive) [ 09/Jan/18 ]

LZ4 (or ZStandard) might also be preferable to snappy (or zlib) for some users if it has comparable compression with less CPU. Both the newer compressors have a larger configuration space for trading off CPU for compression ratio, so it's reasonable to revisit.
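The CPU-versus-ratio trade-off mentioned above can be illustrated with a rough sketch. It uses Python's standard-library zlib at different levels purely as a stand-in, since LZ4, Zstandard, and snappy bindings are not in the standard library; it does not measure WiredTiger or any of those codecs. The sample data and the helper names `measure` and `make_sample` are hypothetical.

```python
import time
import zlib


def measure(data, level):
    """Return (compressed size as a fraction of the original, seconds spent)."""
    start = time.perf_counter()
    compressed = zlib.compress(data, level)
    elapsed = time.perf_counter() - start
    return len(compressed) / len(data), elapsed


def make_sample(records=20000):
    """Build a repetitive, document-like payload (hypothetical test data)."""
    return b"".join(
        b'{"_id": %d, "field0": "user%d", "status": "active"}\n' % (i, i % 100)
        for i in range(records)
    )


if __name__ == "__main__":
    sample = make_sample()
    # Higher levels spend more CPU searching for matches; whether that pays
    # off in a smaller output depends on the data.
    for level in (1, 6, 9):
        ratio, seconds = measure(sample, level)
        print(f"zlib level {level}: {ratio:.1%} of original, {seconds * 1000:.1f} ms")
```

Timings vary by machine and data set, so the numbers are only meaningful relative to each other on the same run.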
