  1. Core Server
  2. SERVER-48067

Reduce memory consumption for unique index builds with large numbers of non-unique keys

Details

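    • Type: Improvement
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Fixed
    • Affects Version/s: 4.2.0
    • Fix Version/s: 4.2.9, 4.7.0, 4.4.2
    • None
    • None
    • Backwards Compatibility: Fully Compatible
    • Backport Requested: v4.4, v4.2
    • Sprint: Execution Team 2020-07-27
    • 0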

    Description

      During a unique index build, we use a vector to track every duplicate key inserted into the index. This happens in the phase where we drain keys from the external sorter into the WiredTiger (WT) bulk inserter.

      Due to the nature of hybrid index builds, we must track these duplicates until we temporarily stop writes and can see all writes to the table.

      If a collection has a large number of duplicate key violations, this vector can grow without bound. We can improve this behavior by batching writes, which bounds the memory held at any one time.
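
      The batching idea can be sketched as follows. This is a minimal illustration, not the server's implementation: `BatchedDuplicateTracker`, `record_duplicates`, the `flush` callback, and the `max_batch_bytes` budget are all hypothetical names, and the callback merely stands in for whatever write makes the duplicates durable. Keys are modelled as raw bytes arriving in sorted order, as they would from a sorter.

```python
from typing import Callable, Iterable, List


class BatchedDuplicateTracker:
    """Buffers duplicate keys and hands them off once a byte budget is reached.

    Hypothetical sketch: memory held is bounded by roughly max_batch_bytes
    plus one key, instead of growing with the total number of duplicates.
    """

    def __init__(self, flush: Callable[[List[bytes]], None], max_batch_bytes: int) -> None:
        self._flush = flush                  # assumed to write the batch out
        self._max_batch_bytes = max_batch_bytes
        self._batch: List[bytes] = []
        self._batch_bytes = 0

    def add(self, key: bytes) -> None:
        self._batch.append(key)
        self._batch_bytes += len(key)
        if self._batch_bytes >= self._max_batch_bytes:
            self.flush()

    def flush(self) -> None:
        if self._batch:
            self._flush(self._batch)
            # Start a fresh list so the flushed batch is not mutated later.
            self._batch = []
            self._batch_bytes = 0


def record_duplicates(sorted_keys: Iterable[bytes], tracker: BatchedDuplicateTracker) -> None:
    """Scans keys in sorted order and records each repeat of the previous key."""
    previous = None
    for key in sorted_keys:
        if key == previous:
            tracker.add(key)
        previous = key
    tracker.flush()  # write out any partial final batch
```

      With a budget of 2 bytes and the keys `a, a, a, b, c, c`, the callback would be invoked twice: once with the two repeated `a` keys, and once at the end with the repeated `c`.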

      We should also consider using this ticket to address two related costs: converting each KeyString back to a BSONObj in order to record duplicates, and the memory amplification caused by creating new vectors to copy key data.

          People

            gregory.noma@mongodb.com Gregory Noma
            louis.williams@mongodb.com Louis Williams
            Votes: 0
            Watchers: 24
