Type: Improvement
Resolution: Fixed
Priority: Major - P3
Fix Version/s: 4.2.9, 4.7.0, 4.4.2
Affects Version/s: 4.2.0
Component/s: None
Labels: None
Backwards Compatibility: Fully Compatible
Backport Requested: v4.4, v4.2
Sprint: Execution Team 2020-07-27
Case:
Linked BF Score: 0
Confidence Status: None
Work Order: 3
CAR Domain/s: None
Aha! Reference: None
Tracking Level: None
Risk Status: None
Exec Notes: None
Goal Name(s): None
Goal Link: None

We use a vector to track all duplicate keys inserted into an index. Specifically, this happens in the phase where we dump keys from the external sorter into the WiredTiger (WT) bulk inserter.

Due to the nature of hybrid index builds, we must track these duplicates until we temporarily stop writes and can see all writes to the table.

If a collection has a large number of duplicate key violations, this vector can grow without bound. We can improve this behavior by batching writes to reduce the memory impact.
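The batching idea can be sketched roughly as follows. This is a minimal illustration, not the server's actual code: `DuplicateKeyBatcher`, its byte budget, and the flush callback are all hypothetical names, and keys are modeled as plain `std::string` byte buffers rather than real KeyString values. The point is only that buffered duplicates are handed off once a size threshold is reached, so memory stays bounded regardless of how many duplicates the collection contains.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper for illustration only; not part of the MongoDB codebase.
// Buffers duplicate keys and hands them to a caller-supplied callback in
// batches, so the buffer never grows far past `maxBatchBytes`.
class DuplicateKeyBatcher {
public:
    using FlushFn = std::function<void(const std::vector<std::string>&)>;

    DuplicateKeyBatcher(std::size_t maxBatchBytes, FlushFn flushFn)
        : _maxBatchBytes(maxBatchBytes), _flushFn(std::move(flushFn)) {
        assert(_maxBatchBytes > 0);
        assert(_flushFn);
    }

    // Records one duplicate key. The key is moved into the buffer rather
    // than copied. Flushes as soon as the byte budget is reached.
    void add(std::string key) {
        _bufferedBytes += key.size();
        _batch.push_back(std::move(key));
        if (_bufferedBytes >= _maxBatchBytes) {
            flush();
        }
    }

    // Writes out whatever is buffered. Call once more at the end of the
    // bulk load to drain the final partial batch.
    void flush() {
        if (_batch.empty()) {
            return;
        }
        _flushFn(_batch);
        _batch.clear();
        _bufferedBytes = 0;
    }

    std::size_t bufferedBytes() const {
        return _bufferedBytes;
    }

private:
    std::size_t _maxBatchBytes;
    FlushFn _flushFn;
    std::vector<std::string> _batch;
    std::size_t _bufferedBytes = 0;
};
```

In this sketch the callback stands in for whatever durable write records the duplicates. The caller would invoke `add()` for each duplicate seen during the bulk load and `flush()` once at the end.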

We should also consider using this ticket to address two related costs: the conversion of KeyString back to BSONObj to record duplicates, and the memory amplification caused by creating new vectors to copy key data.

Assignee: Gregory Noma
Reporter: Louis Williams
Participants: Asya Kamsky, Githook User, Gregory Noma, Louis Williams
Votes: 0
Watchers: 24

Created: May 08 2020 09:41:37 PM UTC
Updated: Oct 29 2023 10:08:27 PM UTC
Resolved: Jul 20 2020 07:37:10 PM UTC
Confidence Status Last Update: 14/Jul/20 5:19 PM
