Core Server / SERVER-50959

Avoid copying data from the Sorter into InMemIterator

    • Type: Improvement
    • Resolution: Unresolved
    • Priority: Major - P3
    • None
    • Affects Version/s: None
    • Component/s: Storage
    • Labels:
    • Storage Execution

      SERVER-49829 introduced this optimization, but it is being reverted as part of SERVER-50920 to fix an issue with resumable index builds. It should be possible to re-add the optimization if some additional steps are taken for the resumable index build case. There are a few options:

      1. If the index build did not need to spill to disk, have the index build's BulkBuilder keep track of the keys that it has already retrieved from the InMemIterator. Then, if it is interrupted for shutdown during the bulk load, it can supply this (sorted) list of keys to the Sorter to supplement the sorted keys that are still in the Sorter, which will be written to disk as is already done.
      2. Have index builds always spill to disk at the beginning of the bulk load phase. This has the downside of spilling to disk even when we otherwise do not need to.
      3. Use an iterator in the InMemIterator instead of popping from the front for each element. This has the downside of still requiring the data to be copied when returning it from the InMemIterator.
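
      As a rough illustration of option 1, the sketch below uses hypothetical stand-in classes (InMemIteratorSketch and BulkBuilderSketch, with plain strings as keys); these are assumptions made for the example and not the server's actual Sorter, InMemIterator, or BulkBuilder interfaces. The iterator moves each key out instead of copying it, and the builder remembers the keys it has consumed so the full sorted run can be rebuilt if the load is interrupted.

```cpp
#include <cstddef>
#include <deque>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the in-memory iterator; not the server's class.
class InMemIteratorSketch {
public:
    explicit InMemIteratorSketch(std::deque<std::string> sorted)
        : _data(std::move(sorted)) {}

    bool more() const { return !_data.empty(); }

    // Destructive read: the front key is moved out rather than copied, so it
    // no longer exists inside the iterator afterwards.
    std::string next() {
        std::string out = std::move(_data.front());
        _data.pop_front();
        return out;
    }

    // Keys that have not been handed out yet (still in sorted order).
    const std::deque<std::string>& remaining() const { return _data; }

private:
    std::deque<std::string> _data;
};

// Hypothetical bulk builder following option 1: it keeps every key it has
// already retrieved so that nothing is lost on interruption.
class BulkBuilderSketch {
public:
    explicit BulkBuilderSketch(InMemIteratorSketch it) : _it(std::move(it)) {}

    // Loads at most `limit` keys and then stops, simulating an interruption.
    void loadSome(std::size_t limit) {
        for (std::size_t i = 0; i < limit && _it.more(); ++i) {
            std::string key = _it.next();
            // ... the key would be inserted into the index here ...
            _retrieved.push_back(std::move(key));
        }
    }

    std::size_t retrievedCount() const { return _retrieved.size(); }

    // On interruption: the retrieved keys followed by the keys still in the
    // iterator form the complete sorted run that needs to be persisted.
    std::vector<std::string> stateToPersist() const {
        std::vector<std::string> all(_retrieved.begin(), _retrieved.end());
        all.insert(all.end(), _it.remaining().begin(), _it.remaining().end());
        return all;
    }

private:
    InMemIteratorSketch _it;
    std::vector<std::string> _retrieved;
};
```

      In this sketch the copy happens only in stateToPersist(), that is, only on the resumable index build path. A caller that never needs to persist state just calls next() and pays for a move per key.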

      Resumable index builds will always need to copy the data at some point, but options 1 and 2 let other users of the Sorter avoid these otherwise unnecessary copies.

            backlog-server-execution [DO NOT USE] Backlog - Storage Execution Team
            gregory.noma@mongodb.com Gregory Noma