  1. Core Server
  2. SERVER-38205

Optimize splitVector for the jumbo-chunk case



    • Type: Improvement
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.6.15, 4.0.7, 4.1.8
    • Component/s: Sharding
    • Labels:
    • Backwards Compatibility:
      Fully Compatible
    • Backport Requested:
      v4.0, v3.6
    • Sprint:
      Sharding 2018-12-31, Sharding 2019-01-14, Sharding 2019-01-28, Sharding 2019-02-11
    • Case:
    • Linked BF Score:


      If a chunk contains only a single shard key value (or very few distinct values), it is marked as jumbo and cannot be moved by the balancer. However, the autosplitter keeps trying to split this chunk periodically, even when there is only one unique key and the chunk can therefore never be split. There are several ways we could optimize for this case:

      1. In splitVector, we can do a forward lookup at the min key and a backward lookup at the max key. If the key just before the max key is the same as the key found at the min key, the entire chunk consists of a single unique key and we can skip scanning the chunk.
      2. In splitVector, while scanning, once we decide that a key X should be a split key, we can skip directly to the next unique key rather than scanning through the remaining documents for X.
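
      The two ideas above can be illustrated with a small sketch. This is not the server's splitVector code: it models the shard key index as a sorted Python list, and the names is_single_key_chunk, split_points and docs_per_split are hypothetical, introduced only for illustration. The chunk range is assumed to be [min_key, max_key).

```python
from bisect import bisect_left, bisect_right


def is_single_key_chunk(keys, min_key, max_key):
    """Optimization 1: compare the first key at/after min_key with the
    last key before max_key. `keys` is a sorted list standing in for
    the shard key index (an assumption of this sketch)."""
    lo = bisect_left(keys, min_key)   # forward lookup at the min key
    hi = bisect_left(keys, max_key)   # one past the last key below max
    if lo >= hi:
        return False                  # empty chunk, nothing to compare
    return keys[lo] == keys[hi - 1]   # same key at both ends: one unique key


def split_points(keys, min_key, max_key, docs_per_split):
    """Optimization 2: after choosing split key X, jump to the first
    key greater than X instead of walking every document for X."""
    lo = bisect_left(keys, min_key)
    hi = bisect_left(keys, max_key)
    if lo >= hi or keys[lo] == keys[hi - 1]:
        return []                     # empty or single-key chunk: no scan
    points = []
    i, count = lo, 0
    while i < hi:
        count += 1
        # A split key must differ from the chunk's first key.
        if count > docs_per_split and keys[i] != keys[lo]:
            points.append(keys[i])
            # Skip the remaining documents that share this key.
            i = bisect_right(keys, keys[i], i, hi)
            count = 0                 # simplification: skipped docs are not counted
            continue
        i += 1
    return points
```

      In this model the jumbo case returns immediately, and a long run of duplicates after a chosen split key costs one binary search rather than a linear walk.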




                • Created: