[SERVER-25075] Building 2dsphere index uses excessive memory Created: 14/Jul/16 Updated: 14/Dec/17 Resolved: 02/Aug/16 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Index Maintenance |
| Affects Version/s: | None |
| Fix Version/s: | 3.0.13, 3.2.9, 3.3.11 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Bruce Lucas (Inactive) | Assignee: | Siyuan Zhou |
| Resolution: | Done | Votes: | 0 |
| Labels: | code-only |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Backwards Compatibility: | Fully Compatible |
| Operating System: | ALL |
| Backport Completed: | |
| Sprint: | Repl 18 (08/05/16) |
| Participants: | |
| Description |
|
Create a collection with 24 million documents {loc: [0, 0]}. Then create an index with db.c.createIndex({loc:"2dsphere"}).
|
| Comments |
| Comment by Githook User [ 08/Sep/16 ] |
|
Author: Siyuan Zhou (visualzhou, siyuan.zhou@mongodb.com) Message: |
| Comment by Githook User [ 03/Aug/16 ] |
|
Author: Siyuan Zhou (visualzhou, siyuan.zhou@mongodb.com) Message: (cherry picked from commit 2743e906fef318763e753a67967d503b37fcdd07) |
| Comment by Githook User [ 02/Aug/16 ] |
|
Author: Siyuan Zhou (visualzhou, siyuan.zhou@mongodb.com) Message: |
| Comment by Bruce Lucas (Inactive) [ 28/Jul/16 ] |
|
I opened |
| Comment by Siyuan Zhou [ 28/Jul/16 ] |
|
Thanks, bruce.lucas@mongodb.com, for the detailed explanation. I can confirm this issue by printing out the object size and the size of its buffer. For a given index document in S2 index version 3, the object size is 15 bytes, but the default buffer size is 512 bytes (data) + 4 bytes (Holder's ref count).
Reserving only 15 bytes should fix the problem for S2 index version 3. For earlier versions, a simple copy() should work. Alternatively, we can reserve 40 bytes. If this turns out to be a bigger problem for other types of indexes, I'd suggest tracking the size of a SharedBuffer and adding it to BSONObj::memUsageForSorter(). The 4-byte offset for the ref count in SharedBuffer can also be addressed separately if we want. |
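To make the gap between accounted and actual memory concrete, here is a minimal toy model in Python. It is only a sketch: it assumes the sorter charges exactly the BSONObj size per key and that each key holds exactly one buffer, which ignores any other per-key overhead in the real sorter. The function name `fill_sorter`, the 1,000,000-byte limit, and the "15 + 4" right-sized buffer are all hypothetical values chosen for illustration, not taken from the server code.

```python
# Sizes quoted in the comment above.
OBJ_SIZE = 15            # BSONObj size of one S2 index version 3 key, in bytes
BUFFER_SIZE = 512 + 4    # default buffer (data) plus Holder's ref count, in bytes


def fill_sorter(accounted_limit_bytes, obj_size, buffer_size):
    """Toy model (assumption): the sorter charges obj_size per key and
    spills once the charged total reaches accounted_limit_bytes, while
    each key really occupies buffer_size bytes.

    Returns (keys_held, actual_bytes_held) at the moment of the spill.
    """
    keys_held = accounted_limit_bytes // obj_size
    return keys_held, keys_held * buffer_size


# Hypothetical accounting limit, chosen only to keep the numbers small.
limit = 1_000_000

keys, actual = fill_sorter(limit, OBJ_SIZE, BUFFER_SIZE)
print(f"default buffer: {keys} keys, {actual} bytes actually held")

# With a right-sized buffer (assumed here to be 15 data bytes + 4 ref count bytes).
keys_fixed, actual_fixed = fill_sorter(limit, OBJ_SIZE, OBJ_SIZE + 4)
print(f"reserved buffer: {keys_fixed} keys, {actual_fixed} bytes actually held")

print(f"overshoot factor with default buffer: {BUFFER_SIZE / OBJ_SIZE:.1f}x")
```

Under these assumptions the memory actually held scales with the buffer size rather than the object size, which is why reserving a buffer close to the key size, or including the buffer size in the accounting, would close the gap.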
| Comment by Daniel Pasette (Inactive) [ 18/Jul/16 ] |
|
From the looks of it, this issue is more general than stated, even without the bug Bruce pointed out here. By design, mongod could require up to 64*100MB of scratch space for index builds during an initial sync. Maybe we should be trimming the size of these buffers based on available resources. |
| Comment by Bruce Lucas (Inactive) [ 18/Jul/16 ] |
|
The pattern of bytes used indicates that the sorter was filled and then spilled to disk 9 times while processing the 24 M keys, so it took 24 M / 9 keys to fill it. Memory used by the sorter was about 1300 MiB, or about 1300 MiB / (24 M / 9) = 511 bytes per key, which is much larger than the actual key size but close to the default initial size of a BSONObjBuilder. So it appears that the problem is that S2CellIdToIndexKey constructs a document using a BSONObjBuilder with the default initial buffer size of 512 bytes and then uses that buffer as-is by calling BSONObjBuilder::obj, whereas the sorter accounts only for the BSONObj size, not the actual buffer size, when computing the amount of memory used. Verified that either of the following fixes the problem:
TBD whether there are other indexing code paths with the same issue. |
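The per-key estimate in the comment above can be reproduced with a few lines of Python. This is only a worked version of the arithmetic already stated there; the variable names are illustrative, and the 1300 MiB figure is the approximate value quoted in the comment.

```python
# Figures quoted in the comment above.
TOTAL_KEYS = 24_000_000   # documents (and therefore keys) in the collection
SPILLS = 9                # times the sorter filled and spilled to disk
SORTER_MEM_MIB = 1300     # approximate memory used by the sorter when full

# Keys needed to fill the sorter once.
keys_per_fill = TOTAL_KEYS / SPILLS

# Memory actually consumed per key, in bytes.
bytes_per_key = SORTER_MEM_MIB * 1024 * 1024 / keys_per_fill

print(f"keys per fill: {keys_per_fill:,.0f}")
print(f"bytes per key: {bytes_per_key:.0f}")
```

The result lands within a byte or two of 512, which is what points at the default BSONObjBuilder buffer rather than the key itself.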