We use an in-memory vector to track every duplicate key inserted into an index. This happens during the phase in which we dump keys from the external sorter into the WT bulk inserter.
Because hybrid index builds accept writes while the build is in progress, we must keep tracking these duplicates until we temporarily stop writes and can see every write to the table.
If a collection has a large number of duplicate key violations, this vector can grow without bound. We can reduce the memory impact by writing the tracked duplicates out in batches instead of accumulating all of them in the vector.
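
As a rough illustration of the batching idea, the following is a minimal sketch and not the server's actual code. The class name BatchedDuplicateKeyTracker, the std::string key type, and the flush callback are all hypothetical stand-ins; the real key type and the destination of flushed duplicates are assumptions left open here. The point is only that the buffer never holds more than a fixed number of keys.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical sketch: buffers duplicate keys and hands them to a sink in
// fixed-size batches, so at most maxBatchSize keys are held in memory.
class BatchedDuplicateKeyTracker {
public:
    using Key = std::string;  // assumption: stand-in for the real index key type
    using FlushFn = std::function<void(const std::vector<Key>&)>;

    // flush is a hypothetical sink, e.g. a write to some side storage.
    BatchedDuplicateKeyTracker(std::size_t maxBatchSize, FlushFn flush)
        : _maxBatchSize(maxBatchSize), _flush(std::move(flush)) {
        assert(_maxBatchSize > 0);
        _pending.reserve(_maxBatchSize);
    }

    // Buffers one duplicate key; writes the batch out once it is full.
    void recordDuplicate(Key key) {
        _pending.push_back(std::move(key));
        if (_pending.size() >= _maxBatchSize) {
            flush();
        }
    }

    // Writes out any buffered keys. Call once more after the last key so a
    // partially filled batch is not lost.
    void flush() {
        if (_pending.empty()) {
            return;
        }
        _flush(_pending);
        _pending.clear();  // keeps capacity, so the buffer is reused
        ++_batchesFlushed;
    }

    std::size_t pendingCount() const { return _pending.size(); }
    std::size_t batchesFlushed() const { return _batchesFlushed; }

private:
    std::size_t _maxBatchSize;
    FlushFn _flush;
    std::vector<Key> _pending;
    std::size_t _batchesFlushed = 0;
};
```

With this shape, memory use depends on the chosen batch size rather than on the number of duplicate key violations in the collection. The trade-off is one write per batch, so a larger batch size means fewer writes at the cost of a larger buffer.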