[SERVER-26277] allow to have compressed data in RAM Created: 23/Sep/16  Updated: 06/Dec/22  Resolved: 30/Sep/16

Status: Closed
Project: Core Server
Component/s: Storage
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Trivial - P5
Reporter: marian badinka Assignee: Backlog - Storage Execution Team
Resolution: Won't Fix Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Assigned Teams:
Storage Execution
Participants:

 Description   

Current situation: with compression enabled, the data size is 5 GB while the storage size is 1 GB.
But when the data is loaded into RAM, it consumes the full 5 GB.

Proposal:
What if the data (working set) in RAM were kept compressed, the same as on disk? The working set in RAM would then be 1 GB instead of 5 GB.

With plenty of CPUs available, the overhead of online decompression would be minor.

This would of course not be suitable for very latency-sensitive workloads, but the majority of the apps in my company are mid-size, low-speed apps. With a memory-compression switch we could avoid many situations where the working set exceeds the cache and pages must be exchanged with disk, locking the DB.
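The idea in the proposal can be sketched minimally: keep pages compressed in memory and decompress on each access, trading CPU for RAM. This is an illustrative toy (class and method names are invented here), not WiredTiger's actual page-cache design.

```python
import zlib

class CompressedPageStore:
    """Toy sketch of the proposal: keep pages compressed in RAM and
    decompress on access. Names are illustrative, not WiredTiger APIs."""

    def __init__(self, level: int = 6):
        self.level = level
        self._pages = {}  # page_id -> compressed bytes

    def put(self, page_id, data: bytes) -> None:
        self._pages[page_id] = zlib.compress(data, self.level)

    def get(self, page_id) -> bytes:
        # Online decompression on every read: trades CPU for RAM.
        return zlib.decompress(self._pages[page_id])

    def ram_bytes(self) -> int:
        return sum(len(c) for c in self._pages.values())

store = CompressedPageStore()
page = b"user:42,session:abc," * 1024  # repetitive data compresses well
store.put(1, page)
assert store.get(1) == page           # reads return the original bytes
assert store.ram_bytes() < len(page)  # the in-RAM copy is smaller
```

The assertions show the trade-off at the heart of the request: correctness is preserved, the RAM footprint shrinks, and the cost is a decompression step on every read.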



 Comments   
Comment by Alexander Gorrod [ 30/Sep/16 ]

> We would set the WT cache to a modest size of only 1-2 GB, so the imported data would be actively flushed from cache to disk. But what about queries? They will most probably not have space in RAM, and every query will trigger loading pages from disk into cache. Or would the remaining 6-7 GB of RAM also be used to keep the working set in RAM?

I believe your question here is whether reads of recently written data come from filesystem cache, or need to read from disk. My understanding is that they will be read from filesystem cache, but you should definitely test your particular configuration.

> Would you consider introducing a feature to lock down specified collection data / "query results" in RAM? E.g. it would be nice to always have the "login_name" or "session" collection in RAM, even when heavy write/read requests cause the whole WT cache to be emptied/exchanged.

We do have an open feature request to give certain tables priority in cache: https://jira.mongodb.org/browse/WT-2452
At the moment we rely on the cache's least-recently-used (LRU) eviction algorithm to choose the best pages to evict, without hints. Once the functionality is added to WiredTiger, it will need to be exposed via MongoDB as well. It's on our to-do list, and I recommend you add a comment describing your use case on that ticket, to increase the likelihood that the work gets scheduled.
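The least-recently-used eviction mentioned above can be illustrated with a toy cache. This is only a sketch of the general LRU policy; WiredTiger's real eviction adds many heuristics on top of it.

```python
from collections import OrderedDict

class LRUCache:
    """Toy LRU cache illustrating the eviction policy mentioned above;
    not WiredTiger's implementation."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self._d = OrderedDict()

    def get(self, key):
        if key not in self._d:
            return None
        self._d.move_to_end(key)  # mark as most recently used
        return self._d[key]

    def put(self, key, value) -> None:
        if key in self._d:
            self._d.move_to_end(key)
        self._d[key] = value
        if len(self._d) > self.capacity:
            self._d.popitem(last=False)  # evict least recently used

cache = LRUCache(2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")     # touching "a" makes "b" the eviction candidate
cache.put("c", 3)  # capacity exceeded: "b" is evicted
assert cache.get("b") is None
assert cache.get("a") == 1
```

Without priority hints, a burst of writes or scans can push "hot" collections (like the `session` example above) out of such a cache, which is exactly what the WT-2452 feature request aims to address.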

Comment by marian badinka [ 29/Sep/16 ]

Alex,

Thanks a lot for your update.

What about another case: RAM of, for example, 8 GB.

We have an application with heavy writes and moderate reads.

We would set the WT cache to a modest size of only 1-2 GB, so the imported data would be actively flushed from cache to disk. But what about queries? They will most probably not have space in RAM, and every query will trigger loading pages from disk into cache. Or would the remaining 6-7 GB of RAM also be used to keep the working set in RAM?

Another related question: would you consider introducing a feature to lock down specified collection data / "query results" in RAM? E.g. it would be nice to always have the "login_name" or "session" collection in RAM, even when heavy write/read requests cause the whole WT cache to be emptied/exchanged.
To achieve this, we currently run two mongod processes on one server, where the first holds the sensitive always-in-RAM data and the second holds the rest of the data.

Anyway, thanks for the update; this can be closed.

Regards \ Marian

Comment by Alexander Gorrod [ 29/Sep/16 ]

marian.badinka@dhl.com thanks for the suggestion; it is something we have considered in the past. The functionality is not high on our priority list because we have observed that the operating system disk cache can provide most of the benefits that a solution implemented inside MongoDB would.

In cases where a data set fits in RAM when compressed but does not when uncompressed, the best performance can often be achieved by:

  • Ensuring the operating system disk buffer is enabled. On Linux the disk buffer is called the buffer cache: http://www.tldp.org/LDP/sag/html/buffer-cache.html
  • Configuring the WiredTiger cache size to be a relatively modest size (in your example I would suggest 1GB). Ideally the WiredTiger cache is still large enough to hold the internal structure and indexes.
  • Monitoring I/O statistics to ensure that read operations do not require reads from disk.

What happens is that WiredTiger will issue an operating system read for most read operations, but those reads are fast, since they are serviced from memory by the operating system.

Comment by Kelsey Schubert [ 23/Sep/16 ]

Hi marian.badinka@dhl.com,

Thank you for the improvement request. We have marked this ticket to be considered by the WiredTiger team and we will update the ticket after they have had an opportunity to discuss it.

Kind regards,
Thomas

Comment by marian badinka [ 23/Sep/16 ]

Not a CS ticket please, just a normal one.

Generated at Thu Feb 08 04:11:38 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.