  1. Core Server
  2. SERVER-13331

GridFS chunks collection should lower chunk size to 255k

    • Type: Improvement
    • Resolution: Done
    • Priority: Critical - P2
    • Fix Version/s: 2.4.10, 2.6.0-rc3
    • Affects Version/s: None
    • Component/s: GridFS
    • None

      Issue Status as of March 31, 2014

      ISSUE SUMMARY

      Starting with the 2.6 release, the usePowerOf2Sizes option is enabled by default, so the space allocated for a record is rounded up to the next power of two. This makes the current GridFS default chunk size of 256 KB a poor choice: the _id and foreign key fields of each chunk document push its size to just over 256 KB, which causes a 512 KB allocation with almost half of the space wasted.

      USER IMPACT

      In the 2.4 release cycle, usePowerOf2Sizes is not enabled by default, so the issue only affects users who have manually enabled usePowerOf2Sizes on their GridFS chunks collection.

      SOLUTION

      The fix is to reduce the default chunk size of GridFS documents to 255 KB. This leaves enough room for the extra fields that each chunk document still fits within a 256 KB allocation.
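
      The arithmetic behind the fix can be sketched in a few lines of Python. This is a simplified model, not the server's allocator: the function names are hypothetical, and the 100-byte per-chunk overhead is an illustrative assumption standing in for the _id, the foreign key, and other framing, whose exact size varies.

```python
KB = 1024

# Assumed overhead per chunk document (_id, foreign key, framing).
# The real figure varies; 100 bytes is only an illustrative guess.
ASSUMED_OVERHEAD_BYTES = 100


def power_of_2_allocation(record_bytes: int) -> int:
    """Round a record size up to the next power of two (simplified model)."""
    size = 1
    while size < record_bytes:
        size *= 2
    return size


def chunk_allocation(chunk_size_bytes: int) -> int:
    """Space allocated for one chunk document under the simplified model."""
    return power_of_2_allocation(chunk_size_bytes + ASSUMED_OVERHEAD_BYTES)


old_allocation = chunk_allocation(256 * KB)  # data alone already fills 256 KB
new_allocation = chunk_allocation(255 * KB)  # leaves about 1 KB for the extra fields

# Fraction of the allocation left unused with the old 256 KB default.
old_wasted = 1 - (256 * KB + ASSUMED_OVERHEAD_BYTES) / old_allocation
```

      Under these assumptions the 256 KB chunk lands in a 512 KB allocation, while the 255 KB chunk stays within 256 KB.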

      WORKAROUNDS

      Driver versions designed to be used with the 2.6 release will include this fix client-side.
      Alternatively, disabling usePowerOf2Sizes also prevents the space overhead but can affect space re-use efficiency, especially in situations where documents are frequently deleted and recreated. This can lead to extent fragmentation.
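
      As a rough illustration of the second workaround, the sketch below builds a collMod command document that turns usePowerOf2Sizes off. It assumes the default GridFS chunks collection name fs.chunks and a server version from the era this issue discusses; the helper name is hypothetical, and the snippet only constructs the document without contacting a server.

```python
def build_disable_power_of_2_command(collection_name: str = "fs.chunks") -> dict:
    """Build a collMod command document that disables usePowerOf2Sizes.

    The command name must be the first key; Python dicts keep insertion
    order, so "collMod" stays first.
    """
    return {"collMod": collection_name, "usePowerOf2Sizes": False}


command = build_disable_power_of_2_command()

# With a PyMongo Database handle (here assumed to be named `db`),
# the document could be sent with:
#     db.command(command)
```

      Whether this trade-off is worthwhile depends on the workload, since the space re-use concern described above still applies.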

      AFFECTED VERSIONS

      All recent production release versions up to 2.4.9 are affected.

      PATCHES

      The fix is included in the 2.4.10 production release and the 2.6.0-rc3 release candidate, which will evolve into the 2.6.0 production release.

      Original Description

      Now that the server uses power-of-2 allocation by default, a default GridFS chunk size of 256k means we will almost always be throwing away storage space. If the bindata field of a chunk occupies 256k (an exact power of 2), then the _id, the foreign key reference to the files collection, etc. take up additional space, which causes the document's allocated storage to be rounded up to 512k (the next power of 2). This would be a huge waste, since it would happen for every chunk of a given file.

      Instead, if we make the default chunk size 255k then we have an extra 1k to store the _id and other metadata so that when the document is saved we round up to 256k and not 512k upon persisting the document.

            Assignee: Tyler Brock (tyler@10gen.com)
            Reporter: Tyler Brock (tyler@10gen.com)
            Votes: 0
            Watchers: 5
