[SERVER-16546] Mongod memory grows until killed with wiredTiger repl set with small oplog Created: 15/Dec/14 Updated: 31/Dec/14 Resolved: 23/Dec/14 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Replication, Storage |
| Affects Version/s: | 2.8.0-rc2 |
| Fix Version/s: | 2.8.0-rc4 |
| Type: | Bug | Priority: | Critical - P2 |
| Reporter: | David Daly | Assignee: | Alexander Gorrod |
| Resolution: | Done | Votes: | 0 |
| Labels: | 28qa |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Backwards Compatibility: | Fully Compatible |
| Steps To Reproduce: | YCSB load of 400M documents with 32 threads. mongod was started with the options below. It is a single-node replica set. Ignore the string "shard" in the replica set name – that was left over from an earlier experiment.
|
| Description |
|
When running the YCSB load phase against a single-node replica set with a small oplog, mongod's memory grows steadily until the process is killed.
Environment: MCI build for githash e067fff4e3f3079d070ec168f32c24db9a51a944; mongod on the Amazon EC2 Linux AMI; c3.4xlarge server instance; m3.medium instance for the YCSB client; data on SSD; 32 GB of RAM. mongod uses up to 30 GB before being killed or descheduled. |
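One way to confirm the steady growth described above is to sample the resident set size of the mongod process on Linux and check that it keeps rising. The following is a minimal sketch, not part of the original report: the helper names (parse_vmrss_kib, grows_consistently, read_rss_kib) are hypothetical, and it assumes a Linux host where /proc/<pid>/status exposes a VmRSS line.

```python
import re


def parse_vmrss_kib(status_text):
    """Return VmRSS in KiB from /proc/<pid>/status content, or None if absent."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def grows_consistently(samples_kib, min_samples=3):
    """True if there are at least min_samples and every sample exceeds the previous one."""
    if len(samples_kib) < min_samples:
        return False
    return all(later > earlier
               for earlier, later in zip(samples_kib, samples_kib[1:]))


def read_rss_kib(pid):
    """Read the current resident set size of a process (Linux only; hypothetical helper)."""
    with open(f"/proc/{pid}/status") as fh:
        return parse_vmrss_kib(fh.read())
```

Calling read_rss_kib with the mongod pid at a fixed interval during the load and passing the collected values to grows_consistently gives a rough yes/no signal. A strict "every sample larger" test is deliberately simple; a real check would likely tolerate small dips.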
| Comments |
| Comment by Mark Benvenuto [ 15/Dec/14 ] |
|
Summary: __evict_server/__evict_has_work needs to run when it owns the oldest __split_oldest_gen.
Repro:
When you see high counts in the "awaiting" counters like the following, you know you have hit the problem:
In my repro, I see the following in the debugger:
| Comment by David Daly [ 15/Dec/14 ] |
|
Attaching logs and mms graphs. |