Core Server / SERVER-55028

Improve the auto-splitter policy


    Details

    • Backwards Compatibility:
      Fully Compatible
    • Sprint:
      Sharding EMEA 2021-07-26, Sharding EMEA 2021-08-09, Sharding EMEA 2021-08-23, Sharding EMEA 2021-09-06

      Description

      The chunk splitter currently relies on the splitVector function, which can easily suggest always splitting a chunk at (maxChunkSize / 2).

      While the documentation states that a chunk gets partitioned when it reaches the maximum chunk size, the current implementation can force a split as soon as the current size is (maxChunkSize / 2 + ε). As a result, the effective maximum chunk size is (maxChunkSize / 2).

      In addition, some corner cases can produce a very large number of chunks relative to the number of documents.

      This ticket has two objectives to make the auto-splitter less aggressive:

      • As a precondition for splitting, wait for the chunk size to get closer to the maximum
      • Consider making bigger chunks, with a size closer to the maxChunkSize set by the user rather than half of it
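
      The effect of the split precondition can be illustrated with a toy model. The sketch below is not the splitVector implementation: should_split, size_at_first_split, and the trigger_fraction parameter are hypothetical names, and the 64 MB maximum and the 0.9 trigger are example values chosen only for illustration. It grows a chunk one document at a time and reports the size at which a split would first be triggered.

```python
MB = 1024 * 1024


def should_split(chunk_size: int, max_chunk_size: int, trigger_fraction: float) -> bool:
    """Hypothetical precondition: split only once the chunk exceeds
    trigger_fraction * max_chunk_size."""
    return chunk_size > trigger_fraction * max_chunk_size


def size_at_first_split(max_chunk_size: int, trigger_fraction: float, doc_size: int) -> int:
    """Grow a chunk by doc_size bytes per insert and return the size
    at which the precondition first allows a split."""
    size = 0
    while not should_split(size, max_chunk_size, trigger_fraction):
        size += doc_size
    return size


if __name__ == "__main__":
    max_chunk_size = 64 * MB  # example value, assumed for illustration

    # Behaviour described in the ticket: a split at maxChunkSize / 2 + epsilon.
    aggressive = size_at_first_split(max_chunk_size, 0.5, MB)

    # Less aggressive precondition; 0.9 is an assumed value, not from the ticket.
    relaxed = size_at_first_split(max_chunk_size, 0.9, MB)

    print(f"trigger at 0.5: first split at {aggressive // MB} MB")
    print(f"trigger at 0.9: first split at {relaxed // MB} MB")
```

      With 1 MB documents, the 0.5 trigger allows a split at 33 MB, just past half of the configured maximum, while the assumed 0.9 trigger waits until 58 MB. The model covers only the first objective (when to split); how the split points are then chosen is left out.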

      People

      • Assignee:
        Pierlauro Sciarelli (pierlauro.sciarelli)
      • Reporter:
        Pierlauro Sciarelli (pierlauro.sciarelli)