[SERVER-6161] BSONObjExternalSorter can consume 16TB of heap space Created: 21/Jun/12  Updated: 15/Aug/12  Resolved: 21/Jun/12

Status: Closed
Project: Core Server
Component/s: Index Maintenance, Internal Code
Affects Version/s: 2.0.6, 2.1.1
Fix Version/s: None

Type: Bug
Priority: Major - P3
Reporter: Andy Schwerin
Assignee: Andy Schwerin
Resolution: Done
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Affects all 2.0 and 2.1 releases, maybe 1.8.


Operating System: ALL
Participants:

 Description   

The BSONObjExternalSorter performs in-memory sorts of sets of 1E6 BSON documents, writes each sorted set to a file, and then merges the files iteratively. Because a BSON document can be up to 16MB, and because BSONObjExternalSorter::add() stores a heap-allocated BSON object (rather than a datafile-backed BSON object), a single in-memory set can occupy up to 16 TB of heap space (1E6 documents at 16MB each), which would get the process killed by the OOM killer.
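The 16 TB figure is the number of documents per in-memory set multiplied by the maximum document size. A minimal sketch of that arithmetic follows; the function and constant names are hypothetical, and 16MB is assumed to mean 16 * 1024 * 1024 bytes.

```cpp
#include <cassert>
#include <cstdint>

// Worst-case heap bytes held by one in-memory sort set:
// (documents per set) x (maximum size of one document).
// Hypothetical helper for illustration; not part of the server code.
constexpr std::uint64_t worstCaseHeapBytes(std::uint64_t docsPerSet,
                                           std::uint64_t maxDocBytes) {
    return docsPerSet * maxDocBytes;
}

// 1E6 documents per set, as stated in the description.
constexpr std::uint64_t kDocsPerSet = 1000000;
// Assumption: "16MB" is 16 * 1024 * 1024 bytes.
constexpr std::uint64_t kMaxDocBytes = 16ull * 1024 * 1024;

// 16,777,216,000,000 bytes, i.e. roughly 16 TB.
static_assert(worstCaseHeapBytes(kDocsPerSet, kMaxDocBytes) == 16777216000000ull,
              "worst case is roughly 16 TB");
```

The worst case only arises if every document in a set is at the maximum size, which is why a repro is worth having before changing anything.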

Two possible fixes are to (1) use the version of the BSON object that is already in the data files, or (2) limit the number of bytes we copy to the heap, instead of the number of objects.

Option 1 has the advantage of copy minimization, allowing us to sort more data in memory (because the sort representation consists of small objects) and reducing heap fragmentation.

Option 2 has the advantage of being a smaller source code change.
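Option 2 could look roughly like the sketch below, which spills a sorted run once the copied bytes reach a budget rather than once an object count is reached. This is an illustration under stated assumptions, not the BSONObjExternalSorter code: the class ByteBoundedRunBuffer and its members are hypothetical, documents are modelled as std::string so that size() stands in for the BSON object size, and spilled runs are kept in a vector instead of being written to files.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the sorter's in-memory stage; not MongoDB code.
class ByteBoundedRunBuffer {
public:
    explicit ByteBoundedRunBuffer(std::size_t maxBytes) : _maxBytes(maxBytes) {}

    // Copy one document into the current run and account for its size.
    // Once the run reaches the byte budget, sort it and spill it.
    void add(const std::string& doc) {
        _bytes += doc.size();
        _current.push_back(doc);
        if (_bytes >= _maxBytes) {
            flush();
        }
    }

    // Sort the current run and move it to the spilled runs
    // (a real sorter would write it to a temporary file here).
    void flush() {
        if (_current.empty()) {
            return;
        }
        std::sort(_current.begin(), _current.end());
        _runs.push_back(std::move(_current));
        _current.clear();  // leave the moved-from vector in a known empty state
        _bytes = 0;
    }

    const std::vector<std::vector<std::string>>& runs() const { return _runs; }
    std::size_t bufferedBytes() const { return _bytes; }

private:
    std::size_t _maxBytes;                         // byte budget per run
    std::size_t _bytes = 0;                        // bytes copied into the current run
    std::vector<std::string> _current;             // current in-memory run
    std::vector<std::vector<std::string>> _runs;   // spilled, sorted runs
};
```

With this shape the buffered total can overshoot the budget by at most one document, so the heap held by a run is bounded by the budget plus the maximum document size, independent of how many documents are added.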

Before repairing, I would like to see a (standalone?) repro of this being a problem.



 Comments   
Comment by Andy Schwerin [ 21/Jun/12 ]

Further down in BSONObjExternalSorter::add is the code that limits the number of elements that will be put into a single file, and hence held in RAM. We still pay the heap-copying cost, but the heap size impact shouldn't exceed the high tens of megabytes.
