Details

    • Type: New Feature
    • Status: Closed
    • Priority: Minor - P4
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.8.0-rc0
    • Component/s: Storage
    • Labels: None

      Description

      Compression is now supported by the new WiredTiger storage engine.
      Snappy disk compression is enabled by default, and additional compression options are configurable.
      See SERVER-15953 for more information.
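
      For anyone who wants to try the new options, a minimal sketch (assuming the 2.8.0-rc0 / 3.0-era mongod option names; verify against the documentation for your build):

          # start mongod on WiredTiger with zlib block compression instead of the snappy default
          mongod --storageEngine wiredTiger --dbpath /data/db \
              --wiredTigerCollectionBlockCompressor zlib \
              --wiredTigerJournalCompressor snappy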

      WAS:
      When storing textual data (and having more CPU than I/O capacity), it would be nice to have an option to store the data gzip-compressed on disk.

        Issue Links

          Activity

          jblackburn James Blackburn added a comment -

          We've got a large MongoDB instance running on top of ZFS on Linux using lz4 compression. It seems to work well - at least we haven't had any problems related to the filesystem.
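
          For reference, a minimal sketch of this kind of setup, assuming a dedicated dataset for the dbpath (pool/dataset names are hypothetical):

              # create a dataset for the MongoDB dbpath and enable lz4 compression on it
              zfs create tank/mongodb
              zfs set compression=lz4 tank/mongodb
              zfs get compression tank/mongodb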

          chengas123 Ben McCann added a comment -

          James, did you benchmark performance with compressed ZFS at all?

          jblackburn James Blackburn added a comment - edited

          We've done some benchmarking, yes.

          We're running ZFS on Linux 0.6.2 on RHEL6, and this setup is very new here. Throughput with an I/O-bound workload is near-on identical to a replica set backed by ext4, though this should be taken with a pinch of salt as:

          1. The application we're running is fairly data-intensive, so we already do in-app compression with lz4.
          2. The databases are very new and the traffic is mostly write-only.

          1) gives us end-to-end I/O savings, including network traffic and memory load on the MongoDB servers. With 2) it's not clear how, or whether, performance will degrade as the DB ages and ZFS's COW nature leads to fragmentation. Given that MongoDB's workload is essentially random read I/O anyway, I'm hoping it won't be too bad, but time will tell.

          As we already do compression in the app, ZFS gives us a compression factor of only ~1.1x on these MongoDB databases. For normal databases (e.g. the configdb) and home directories we get a 2x - 10x compression factor.
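
          A minimal sketch of how that factor can be read off per dataset (dataset names hypothetical):

              # report the achieved compression ratio for each dataset
              zfs get compressratio tank/mongodb
              zfs get compressratio tank/home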

          Edit: although the setup is new, we've put >8TB of data into it, and soak tested full I/O bound reads for a few days with nothing blowing up.

          chengas123 Ben McCann added a comment -

          Thanks for sharing! It's great to see numbers around it. To share some of our benchmarking, we've been testing TokuMX and have found that we get a 5x compression factor and 3x write throughput, which is a pretty awesome win.

          kaga agahd added a comment -

          Using TokuMX 1.4 we get a 10x compression factor and 5x write throughput.


            Dates

            • Created:
            • Updated:
            • Resolved:
            • Days since reply: 1 year, 3 weeks, 1 day ago
            • Date of 1st Reply: