Core Server / SERVER-38205

Optimize splitVector for the jumbo-chunk case

    • Type: Improvement
    • Resolution: Fixed
    • Priority: Major - P3
    • Fix Version/s: 3.6.15, 4.0.7, 4.1.8
    • Affects Version/s: None
    • Component/s: Sharding
    • Labels:
    • Backwards Compatibility: Fully Compatible
    • Backport Requested: v4.0, v3.6
    • Sprint: Sharding 2018-12-31, Sharding 2019-01-14, Sharding 2019-01-28, Sharding 2019-02-11
    • 0

      If a chunk contains only a single shard key value (or very few distinct values), it will be marked as jumbo and cannot be moved by the balancer. However, the autosplitter keeps trying to split this chunk periodically, even when the chunk holds only one distinct key and therefore can never be split. There are several ways we could optimize for this case:

      1. In splitVector, we can do a forward lookup at the min key and a backward lookup at the max key. If the key immediately before the max key is the same as the key found at the min key, the entire chunk consists of a single distinct key, and we can skip scanning the chunk altogether.
      2. In splitVector, while scanning, once we decide that a key X should be a split key, we can seek directly to the next distinct key rather than scanning through the remaining documents for X.
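
The two ideas above can be sketched roughly as follows. This is a minimal illustration, not the server's implementation: it models the shard key index as a sorted Python list, treats the chunk as the half-open range [min_key, max_key), and uses a plain document count (`docs_per_chunk`) as the split threshold. The function names and the choice to restart the count after skipping a key are assumptions made for the example.

```python
from bisect import bisect_left, bisect_right


def is_single_key_chunk(keys, min_key, max_key):
    """Idea 1: two index lookups instead of a full scan.

    `keys` is a sorted list standing in for the shard key index
    (an assumption for this sketch); the chunk is [min_key, max_key).
    """
    lo = bisect_left(keys, min_key)  # first entry >= min_key
    hi = bisect_left(keys, max_key)  # first entry >= max_key (exclusive bound)
    if lo >= hi:
        return False  # empty chunk, nothing to compare
    # First key in the chunk equals the key just before max_key:
    # the chunk holds one distinct key and cannot be split.
    return keys[lo] == keys[hi - 1]


def split_points(keys, min_key, max_key, docs_per_chunk):
    """Idea 2: after choosing split key X, seek past the rest of X."""
    lo = bisect_left(keys, min_key)
    hi = bisect_left(keys, max_key)
    if lo >= hi or keys[lo] == keys[hi - 1]:
        return []  # empty or single-key chunk: skip the scan entirely

    splits = []
    first_key = keys[lo]
    count = 0
    i = lo
    while i < hi:
        key = keys[i]
        count += 1
        # Splitting at the chunk's first key would leave an empty left side.
        if count > docs_per_chunk and key != first_key:
            splits.append(key)
            # Jump to the first entry greater than `key` instead of
            # visiting every remaining document with the same key.
            i = bisect_right(keys, key, i, hi)
            count = 0  # simplification: restart the count after the skip
        else:
            i += 1
    return splits
```

With a real index, the `bisect_right` call would correspond to a seek to the next distinct key, so a long run of identical keys costs one seek rather than one step per document.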

            kevin.pulo@mongodb.com Kevin Pulo
            matthew.saltz@mongodb.com Matthew Saltz (Inactive)