Numerous reports of mongod using excessive memory. This has been traced back to a combination of factors:
1. TCMalloc does not release memory from its page heap
2. Fragmentation of spans due to varying allocation sizes
3. The current cache size limit is enforced on net memory use, excluding allocator overhead, so the configured limit is often significantly less than the total memory actually used. This is surprising and makes it difficult for users to tune appropriately
Issue #1 has a workaround: setting an environment variable (AGGRESSIVE_DECOMMIT), though this may have a performance impact. Further investigation is ongoing.
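For anyone trying the workaround from a launcher script, a minimal sketch of passing the variable to a child process is shown here. The helper name `env_with_flag` and the value "1" are illustrative assumptions. The variable name is copied from this ticket as written; the exact spelling (for instance a TCMALLOC_ prefix) and the accepted values should be confirmed against the allocator build in use.

```python
import os


def env_with_flag(name, value="1", base=None):
    """Return a copy of the environment with one extra variable set.

    `base` defaults to the current process environment. The input mapping
    is never modified, so the caller's environment stays untouched.
    """
    env = dict(os.environ if base is None else base)
    env[name] = value
    return env


# Hypothetical usage (not run here; the config path is a placeholder):
#   import subprocess
#   env = env_with_flag("AGGRESSIVE_DECOMMIT")  # name as written in this ticket
#   subprocess.Popen(["mongod", "--config", "/path/to/mongod.conf"], env=env)
```

Setting the variable only for the mongod child process keeps the change scoped to that process, which makes it easier to compare memory use and performance with and without the workaround.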
Issue #2 has fixes in place in v3.3.5.
Issue #3 will likely be addressed by making the WiredTiger engine aware of memory allocation overhead, and tuning cache usage accordingly. (Need reference to WT ticket)
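To illustrate the gap described in issue #3, here is a rough arithmetic sketch of how a cache limit relates to total memory once allocator overhead is counted. The function names and the 25% overhead figure are hypothetical, chosen only for illustration; this is not how WiredTiger computes or will compute its limits, and real overhead depends on workload and fragmentation.

```python
GIB = 1024 ** 3


def estimated_total_bytes(cache_bytes, overhead_pct):
    """Estimate memory used for a cache of `cache_bytes` of net data,
    assuming the allocator adds `overhead_pct` percent on top."""
    if overhead_pct < 0:
        raise ValueError("overhead_pct must be non-negative")
    return cache_bytes * (100 + overhead_pct) // 100


def cache_limit_for_budget(budget_bytes, overhead_pct):
    """Largest net cache size that fits in `budget_bytes` once the
    assumed allocator overhead is included."""
    if overhead_pct < 0:
        raise ValueError("overhead_pct must be non-negative")
    return budget_bytes * 100 // (100 + overhead_pct)


# With an assumed 25% overhead, an 8 GiB cache limit accounts for about
# 10 GiB of memory, so a 10 GiB budget supports only an 8 GiB cache.
total = estimated_total_bytes(8 * GIB, 25)
limit = cache_limit_for_budget(10 * GIB, 25)
```

Under these assumptions, a user who sets the cache to 10 GiB expecting 10 GiB of memory use would see about 12.5 GiB, which is the kind of mismatch the reports describe.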
Regression tests for memory usage are being tracked here:
While loading data into mongo, each of the 3 primaries crashed with memory allocation issues. As data keeps loading, new primaries are elected, but eventually they appear to go down as well. Some nodes have recovered and come back up, but new ones keep going down. Logs and diagnostics are attached.