Core Server / SERVER-2724

splitVector counting off-by-one when calculating split points

    • Type: Improvement
    • Resolution: Done
    • Priority: Minor - P4
    • Fix Version/s: None
    • Affects Version/s: None
    • Component/s: None
    • Labels: None

      The splitVector function, in order to split a chunk, estimates the number of elements N whose total size in bytes is about 50% of the max chunk size. The function then counts off every *N + 1* keys (rather than every N) to determine split points. This includes an additional element in each split chunk, which, at worst, increases the chunk size to 75% of the maximum chunk size (since the maximum element size is 16MB and the maximum chunk size is 64MB).
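
      A minimal sketch of the counting loop (illustrative only, not the actual splitVector code; pickSplitPoints and its parameters are hypothetical stand-ins for the server's index scan):

      #include <cstdint>
      #include <vector>

      // Illustrative stand-in for splitVector's counting loop: keys are
      // represented only by their index, and element sizes by an average.
      std::vector<int64_t> pickSplitPoints(int64_t numKeys,
                                           int64_t avgObjSize,     // bytes per element
                                           int64_t maxChunkSize) { // bytes, e.g. 64MB
          // N = number of elements totalling ~50% of the max chunk size.
          const int64_t N = (maxChunkSize / 2) / avgObjSize;

          std::vector<int64_t> splitPoints;
          int64_t count = 0;
          for (int64_t i = 0; i < numKeys; ++i) {
              ++count;
              // Buggy: the comparison only fires once count has reached
              // N + 1, so a split point is emitted every N + 1 keys.
              if (count > N) {
                  splitPoints.push_back(i);
                  count = 0;
              }
              // Fixed version: if (count == N) { ... } emits every N keys.
          }
          return splitPoints;
      }

      Each chunk between consecutive split points therefore holds N + 1 elements, i.e. about 50% of the max chunk size plus one extra element, which is where the 75% worst case comes from.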

      Though the chunk boundaries are slightly skewed by this for now, all chunk splits that would normally work should still work, again because of the 64MB vs 16MB ratio. If our maximum document size ever increases (past 21 MB), however, this will no longer be the case, and, for example, chunks larger than the max chunk size can be created. Cats and dogs living together, mass hysteria, etc.
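
      A back-of-the-envelope check of that worst case, using only the 64MB max chunk size and 16MB max element size quoted above:

      #include <cstdio>

      int main() {
          const double maxChunkMB = 64.0; // maximum chunk size
          const double maxElemMB  = 16.0; // maximum element (BSON document) size

          // A split aims for 50% of the max chunk size per side; the
          // off-by-one adds up to one extra element on top of that.
          const double worstSplitMB = maxChunkMB / 2 + maxElemMB; // 32 + 16
          std::printf("worst-case split chunk: %.0fMB (%.0f%% of max)\n",
                      worstSplitMB, 100.0 * worstSplitMB / maxChunkMB);
          return 0;
      }

      Run as-is, this prints "worst-case split chunk: 48MB (75% of max)".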

      The biggest effect of this so far is that the maximum size of a collection that can be sharded depends on the size of the elements inside it. At the maximum element size, collections of up to 384GB can be sharded.

            Assignee: Unassigned
            Reporter: Greg Studer
            Votes: 0
            Watchers: 1
