Core Server / SERVER-48067

Reduce memory consumption for unique index builds with large numbers of non-unique keys

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Fixed
    • Affects Version/s: 4.2.0
    • Fix Version/s: 4.2.9, 4.7.0, 4.4.2
    • Component/s: None
    • Labels:
      None
    • Backwards Compatibility:
      Fully Compatible
    • Backport Requested:
      v4.4, v4.2
    • Sprint:
      Execution Team 2020-07-27
    • Case:
    • Linked BF Score:
      0

      Description

      We use a vector to track every duplicate key inserted into an index. This happens during the phase in which we drain keys from the external sorter into the WiredTiger (WT) bulk inserter.

      Because hybrid index builds allow concurrent writes, we must keep tracking these duplicates until we temporarily stop writes and can see all writes to the table.

      If a collection has a large number of duplicate key violations, this vector can grow without bound. We can improve this behavior by batching writes, which reduces the memory impact.
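
      The batching idea can be illustrated with a minimal C++ sketch. This is not the server's implementation: `DuplicateKeyBatcher`, its `Sink` callback, and the byte-based limit are hypothetical names and choices made for illustration only. It shows how flushing once a byte limit is reached keeps the buffered duplicates bounded no matter how many violations occur.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper, not MongoDB's API: buffers duplicate keys and hands
// them to a caller-supplied sink whenever the buffered bytes reach a limit.
class DuplicateKeyBatcher {
public:
    using Sink = std::function<void(const std::vector<std::string>&)>;

    DuplicateKeyBatcher(std::size_t maxBatchBytes, Sink sink)
        : _maxBatchBytes(maxBatchBytes), _sink(std::move(sink)) {
        assert(_maxBatchBytes > 0);
    }

    // Record one duplicate key; flush once the batch reaches the byte limit.
    void add(std::string key) {
        _bufferedBytes += key.size();
        _batch.push_back(std::move(key));
        if (_bufferedBytes >= _maxBatchBytes) {
            flush();
        }
    }

    // Pass the buffered keys to the sink, then empty the buffer.
    void flush() {
        if (_batch.empty()) {
            return;
        }
        _sink(_batch);
        _batch.clear();
        _batch.shrink_to_fit();  // non-binding request to give memory back
        _bufferedBytes = 0;
    }

    std::size_t bufferedBytes() const {
        return _bufferedBytes;
    }

private:
    std::size_t _maxBatchBytes;
    Sink _sink;
    std::vector<std::string> _batch;
    std::size_t _bufferedBytes = 0;
};
```

      In this sketch a caller would construct the batcher with a limit and a sink that performs the write, call `add` for each duplicate, and call `flush` once at the end to write any remainder. Limiting by bytes rather than by key count is an assumption here; it tracks memory more directly when key sizes vary.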

      We should also consider using this ticket to address two related costs: the conversion of KeyString back to BSONObj to record duplicates, and the memory amplification caused by creating new vectors to copy key data.
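
      The copy amplification mentioned above can be illustrated with a small C++ sketch, under stated assumptions: `EncodedKey` is a hypothetical stand-in for an encoded key (it is not the server's KeyString type), and `DuplicateRecorder` is an invented name. The point is only that taking ownership of a batch by move avoids holding a second copy of the key bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an encoded index key; not MongoDB's KeyString.
using EncodedKey = std::string;

class DuplicateRecorder {
public:
    // Takes ownership of the caller's batch instead of copying the keys.
    void record(std::vector<EncodedKey>&& keys) {
        if (_keys.empty()) {
            // Steal the whole buffer: no per-key allocation or copy.
            _keys = std::move(keys);
        } else {
            // Move each key across; only the string handles are transferred.
            for (auto& key : keys) {
                _keys.push_back(std::move(key));
            }
        }
        keys.clear();  // leave the source in a known, empty state
    }

    std::size_t size() const {
        return _keys.size();
    }

private:
    std::vector<EncodedKey> _keys;
};
```

      A caller would invoke `recorder.record(std::move(batch))` and treat `batch` as empty afterwards. Copying the vector instead would briefly double the memory held for the same keys.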

            People

            Assignee:
            gregory.noma Gregory Noma
            Reporter:
            louis.williams Louis Williams
            Participants:
            Votes:
            0
            Watchers:
            24