[SERVER-3018] Compression of wire protocol Created: 29/Apr/11  Updated: 24/Apr/17  Resolved: 09/Aug/16

Status: Closed
Project: Core Server
Component/s: Networking, Security
Affects Version/s: None
Fix Version/s: 3.3.11

Type: New Feature Priority: Major - P3
Reporter: Robert Vanderwall Assignee: Jonathan Reams
Resolution: Done Votes: 77
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Documented
is documented by DOCS-9583 Docs for SERVER-3018: Compression of ... Closed
Related
related to SERVER-28442 Add support for lz4 wire protocol com... Closed
related to SERVER-25620 Compression of wire protocol for Clients Closed
related to SERVER-27310 Add support for zlib wire protocol co... Closed
is related to SERVER-524 Encryption of wire protocol with SSL Closed
Backwards Compatibility: Fully Compatible
Sprint: Platforms 2016-08-26
Participants:

 Description   

Edited description:
Implement compression of wire protocol communication independent of running SSL.

==================================
Old description:
Currently, the Mongo Wire protocol sends the data essentially in clear-text. This has two implications for my user scenario. First is that there's a lot of network traffic generated for queries. When reports are run and many fields of data are retrieved, I get the same field name over and over. Some compression here would speed up the delivery of the data. The query itself is lightning fast, but the transaction is slowed down by the movement of the massive amount of data.
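The repeated-field-name overhead described above is easy to demonstrate with any general-purpose compressor. A minimal illustration in Python, where a JSON document shape and zlib stand in for BSON and the Snappy compressor the server eventually adopted (field names are hypothetical):

```python
import json
import zlib

# Hypothetical result set: every document repeats the same long field
# names, much like a report query returning many rows.
docs = [
    {"customer_name": f"customer-{i}",
     "purchase_timestamp": i,
     "total_amount_usd": i * 10}
    for i in range(500)
]

raw = json.dumps(docs).encode("utf-8")
compressed = zlib.compress(raw)

# The repeated keys compress extremely well.
ratio = len(raw) / len(compressed)
```

The exact ratio depends on the data, but repeated field names are nearly free once any stream compressor is applied to the wire format.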

Second, the clear-text has security implications. Running SSL or some similar secure wire protocol could solve potentially both these issues.

Thanks!



 Comments   
Comment by Ramon Fernandez Marina [ 14/Aug/16 ]

sallgeud, I did a quick search of the SERVER backlog and couldn't find a ticket about adding compression to clients – care to create one? It will be a significant endeavor to add it to all drivers, but definitely a useful feature to have.

Comment by Chad Kreimendahl [ 14/Aug/16 ]

Is there a plan to do compression to clients at some point?

Comment by Jonathan Reams [ 09/Aug/16 ]

Wire protocol compression using Snappy has been implemented for MongoDB 3.3.11. Compression is off by default, but can be enabled in the server with --networkMessageCompressors=snappy. This feature is currently only supported for intra-cluster communication (e.g. mongod to mongod and mongod to mongos). More extensive documentation will be provided in the 3.4 documentation.
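A minimal sketch of enabling this on a cluster, using only the flag named in the comment above; `<other options>` is a placeholder for whatever the deployment already passes, and it is an assumption here that mongos takes the same flag for its cluster-facing links:

```shell
# Off by default; enable on every member so compression can be
# negotiated on intra-cluster links (mongod<->mongod, mongod<->mongos).
mongod --networkMessageCompressors=snappy <other options>

# Assumption: mongos accepts the same flag.
mongos --networkMessageCompressors=snappy <other options>
```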

Comment by Linar Savion [ 20/Apr/16 ]

This feature is crucial for WAN replication, as it can easily compress the oplog, reducing replication lag.

Comment by Chad Kreimendahl [ 20/Apr/16 ]

I'd love to see some consideration for other algorithms like lz4 (for speed). Some modes of lz4 outperform snappy by 3-10 times. The most recent vuln aside, I'd love to see it as a compression option for WiredTiger too, for those who want tons of speed with slightly less compression.

We're getting 8.5:1 compression with snappy, which is amazing... but we'd love to see twice the compression speed (including across the wire), even at 4:1 compression.
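The speed-versus-ratio trade-off Chad describes can be sketched with zlib's compression levels (zlib standing in for snappy and lz4, which are not in the Python standard library; the payload is a made-up repetitive message):

```python
import zlib

# Hypothetical, highly repetitive wire payload.
payload = b"status=ok&user=alice&ts=1660000000;" * 2000

fast = zlib.compress(payload, level=1)  # cheaper CPU, larger output
best = zlib.compress(payload, level=9)  # more CPU, smaller output
```

Different algorithms sit at different points on this curve, which is why letting operators choose the compressor (as SERVER-28442 and SERVER-27310 later did) matters.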

Everyone's data compresses differently, so more options are always good. I'm sure some would love even higher ratios, sacrificing CPU or responsiveness for saving space.

Comment by Thomas Holloway [ 20/Apr/16 ]

It's been a while since any reply on this issue. What's the status of getting this implemented?

Comment by Manohar Gudas [ 04/Dec/15 ]

I concur with Tomas Pecholt. Column names eat up a lot of storage at rest, memory during processing, and network bandwidth in transit. Compressing data in transit (not only column names but the values too) would free up network bandwidth. If you run your app in your own data center and MongoDB in AWS, then to support 10x the data volume you have to beef up network bandwidth and other networking gear as well. It becomes much more expensive.

Comment by Taylor Dondich [ 20/Oct/15 ]

Please. With cherry on top. Add compression to the wire.

Comment by Chad Kreimendahl [ 20/Oct/15 ]

Now that you're using snappy in WiredTiger, it may be a great fit for the wire protocol as well, since it was built with streaming in mind. I'm curious how much this could help replica sets recover from large oplogs faster.

It would be great if, when implemented, compression applied not only between server and client, but also between servers in replica sets and shards.

Comment by Tim Shelton [ 24/Apr/15 ]

I appreciate the clarification, Andreas. As you can see, the community is watching closely, and it is imperative that the basics of functionality (including wire compression) are covered.

Please prioritize this, as this is a vital requirement for competing in the "big data" market, especially when many of your customers are still stuck on 1gbps switches.

Tim

Comment by Taylor Dondich [ 24/Apr/15 ]

Thank you for re-opening. We at MaxCDN are more than happy to review any approaches to this ticket your team would like to take and test for you with our very large ingest volume.

Comment by Andreas Nilsson [ 24/Apr/15 ]

Thanks for your swift responses. I will re-open this ticket and clarify the description.

I closed it based on the description of the ticket as compression via OpenSSL. I will edit the description appropriately to reflect generic compression of the wire protocol.

It is not my opinion, nor the opinion of MongoDB, that compression via OpenSSL is a sufficient substitute for independent transport compression.

Kind regards,
Andreas Nilsson

Comment by Taylor Dondich [ 24/Apr/15 ]

I absolutely agree with Tim Shelton. This is a horrendous answer. Really? So you're saying we need to custom compile OpenSSL and have no control over the compression level? It's this kind of response that makes me lose faith in MongoDB and look elsewhere for more efficient solutions.

Comment by Tim Shelton [ 24/Apr/15 ]

I feel like you are skirting responsibility for the solution.

This is unacceptable, and I feel you should re-review this with your leadership. Compression should not be dependent on SSL; that is a completely absurd requirement, especially since Red Hat removed OpenSSL compression.

Are you serving the larger market?

You should know that MOST legacy Mongo installations do not have SSL enabled by default, and many run on Red Hat.

I am also a vendor and I would never avoid something so imperative as this.

Comment by Andreas Nilsson [ 24/Apr/15 ]

Mongo servers support SSL via OpenSSL. OpenSSL uses compression by default as long as the library is compiled with zlib or other compression support.

After POODLE, CRIME, etc., default distributions of OpenSSL are compiled without compression support. For instance, Red Hat disabled compression in OpenSSL by default in 2013: https://rhn.redhat.com/errata/RHSA-2013-0587.html

If you want to use wire compression with MongoDB, please make sure that both the client and server sides are compiled with compression support. I will close this ticket as "Works as Designed".
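Whether a given TLS stack will even offer compression can be checked from the client side. In Python, for instance, the default SSL context disables compression outright, reflecting the post-CRIME defaults discussed above:

```python
import ssl

# Python's default context sets OP_NO_COMPRESSION (CRIME mitigation),
# so TLS compression can never be negotiated with a peer.
ctx = ssl.create_default_context()
compression_disabled = bool(ctx.options & ssl.OP_NO_COMPRESSION)

# On an established connection, SSLSocket.compression() would likewise
# return None when no compression was negotiated.
```

This is why relying on OpenSSL-level compression is fragile: both ends must be built and configured to allow it, against the security defaults of modern distributions.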

Regards,
Andreas Nilsson

Comment by pelit mamani [ 15/Jun/14 ]

I second that. Aggregated, non-normalized documents are common practice with MongoDB, so it's reasonable to expect the wire protocol to handle them efficiently.
We've used various workarounds, mainly projection to extract smaller documents, and a distributed cache. Unfortunately that's expensive code to maintain and "micro-manage", a distraction from the focus on business logic and customer experience (obviously there are cases where this can't be avoided, but a lot of it could be saved by simple compression).

Comment by Taylor Dondich [ 13/Mar/14 ]

We have a LOT of data being replicated across data centers to secondaries. Compression would greatly reduce our network utilization. At this time, the lack of compression is preventing us from effectively implementing cross data-center replication for data redundancy.

Comment by Ruben Caro [ 08/Jan/14 ]

Network traffic volume itself can be a showstopper. It is for me on some projects.

Comment by Tomas Pecholt [ 22/Feb/12 ]

My 2¢: While this would be a big improvement over the current state, long column names still occupy a big part of memory and hard disk space. The amount of memory is essential for MongoDB performance, and even hard disk space is not free, so I suggest using column name compression on the server. Automatic decompression would be performed by the client when needed. Because of this issue we currently assign one-letter names to all columns. That works, but it doesn't look very descriptive when somebody uses RockMongo or similar software and tries to view/modify the db contents.
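The one-letter-name workaround described above can be sketched as a simple client-side mapping; the field names and mapping here are hypothetical, and this is exactly the kind of hand-rolled code that wire compression makes unnecessary:

```python
# Hypothetical mapping from short stored keys to descriptive names.
FIELD_MAP = {"n": "customer_name", "t": "purchase_timestamp",
             "a": "total_amount_usd"}
REVERSE_MAP = {v: k for k, v in FIELD_MAP.items()}

def shorten(doc):
    """Rename descriptive keys to one-letter keys before writing."""
    return {REVERSE_MAP.get(k, k): v for k, v in doc.items()}

def expand(doc):
    """Restore descriptive names after reading."""
    return {FIELD_MAP.get(k, k): v for k, v in doc.items()}

stored = shorten({"customer_name": "alice", "total_amount_usd": 42})
restored = expand(stored)
```

Every client and every ad-hoc tool (like the RockMongo viewer mentioned above) has to know this mapping, which is the maintenance burden the commenter is describing.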

Generated at Thu Feb 08 03:01:50 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.