To reproduce, create a collection with a large number of documents (and no secondary indexes) and then run reIndex. This also affects initial sync, where the _id index is built in bulk. The index build uses >1 GB of memory, whereas it should use only 100 MB.
- during the reindex operation from A-B we see >1 GB of memory allocated outside the WT cache.
- the bulk of that is accounted for by stack239:
It appears that the memory is being allocated in BtreeKeyGeneratorV1::getKeysImpl, which uses a BSONObjBuilder with a default initial size and never reduces the size. As a result, more memory is used than the sorter accounts for, because the sorter takes into account only the object size, not the allocated buffer size, similar to
TBD whether this affects indexes other than _id. So far in my testing it does not seem to, and it is unclear why.
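
The accounting gap described above can be illustrated with a small standalone sketch. This is not the server code: KeyBuffer, Usage, simulate, and the 512-byte kInitialCapacity are hypothetical names and values chosen for illustration. The sketch only shows how summing object sizes can undercount when each key keeps a preallocated buffer that is never shrunk.

```cpp
#include <cstddef>
#include <vector>

// Assumed initial buffer capacity, for illustration only.
constexpr std::size_t kInitialCapacity = 512;

// Hypothetical stand-in for a builder that preallocates its buffer
// and never shrinks it after the key has been written.
struct KeyBuffer {
    std::vector<char> buf;

    KeyBuffer() { buf.reserve(kInitialCapacity); }

    void append(const char* data, std::size_t n) {
        buf.insert(buf.end(), data, data + n);
    }

    // Bytes of actual key data (what a size-based accounting would see).
    std::size_t objSize() const { return buf.size(); }

    // Bytes actually held by the underlying allocation.
    std::size_t allocatedSize() const { return buf.capacity(); }
};

struct Usage {
    std::size_t accounted;  // sum of object sizes
    std::size_t allocated;  // sum of buffer capacities
};

// Build numKeys small keys and compare accounted bytes with allocated bytes.
Usage simulate(std::size_t numKeys, std::size_t keyBytes) {
    const std::vector<char> payload(keyBytes, 'x');
    std::vector<KeyBuffer> keys;
    keys.reserve(numKeys);  // avoid moving KeyBuffers during growth

    Usage usage{0, 0};
    for (std::size_t i = 0; i < numKeys; ++i) {
        keys.emplace_back();
        keys.back().append(payload.data(), payload.size());
        usage.accounted += keys.back().objSize();
        usage.allocated += keys.back().allocatedSize();
    }
    return usage;
}
```

Under these assumptions, simulate(1000, 20) accounts for 20,000 bytes while holding at least 512,000 bytes in buffers, so a memory limit enforced on the accounted figure would be exceeded many times over before it triggers.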