[SERVER-44740] huge oplog configuration causes memory use to grow without bound Created: 19/Nov/19 Updated: 10/Apr/20 Resolved: 09/Apr/20 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Storage |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Keith Bostic (Inactive) | Assignee: | Rachelle Palmer |
| Resolution: | Cannot Reproduce | Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Attachments: |
| Issue Links: |
| Operating System: | ALL |
| Sprint: | Execution Team 2019-12-16 |
| Participants: |
| Case: | (copied to CRM) |
| Description |
|
In a customer case, configuring the oplog to 3 TB resulted in memory usage growing over time, apparently without bound. Reducing the oplog size resolved the customer issue, but some testing around huge oplog configurations seems warranted. EDIT: |
| Comments |
| Comment by Bruce Lucas (Inactive) [ 20/Dec/19 ] |
|
So yeah, no sign of the leak in this repro - "allocated minus wt cache" remains steady. Maybe smaller entries and/or 2-node repl set for another try? |
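The "allocated minus wt cache" figure in the comment above can be sketched as follows. This assumes the figure means the tcmalloc `current_allocated_bytes` value minus the WiredTiger `bytes currently in the cache` value, both taken from `serverStatus` output; the comment does not spell that out, so treat it as an assumption. The function names and the `(seconds, status)` sample layout are hypothetical, chosen only for illustration.

```python
from typing import Mapping, Sequence, Tuple


def allocated_minus_cache(status: Mapping) -> int:
    """Heap bytes allocated outside the WiredTiger cache.

    `status` is assumed to be a serverStatus document already loaded
    as a dict (for example, one decoded sample).
    """
    allocated = status["tcmalloc"]["generic"]["current_allocated_bytes"]
    cached = status["wiredTiger"]["cache"]["bytes currently in the cache"]
    return allocated - cached


def growth_per_hour(samples: Sequence[Tuple[float, Mapping]]) -> float:
    """Average change in 'allocated minus cache', in bytes per hour.

    `samples` is a time-ordered sequence of (seconds, serverStatus dict)
    pairs. Only the first and last samples are compared, so this is a
    coarse trend estimate, not a leak detector.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    (t0, first), (t1, last) = samples[0], samples[-1]
    if t1 <= t0:
        raise ValueError("samples must be ordered by increasing time")
    delta = allocated_minus_cache(last) - allocated_minus_cache(first)
    return delta * 3600 / (t1 - t0)
```

A value that stays near zero over a multi-day run would match the "remains steady" observation, while a persistently positive value would point to growth outside the cache.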
| Comment by Eric Milkie [ 20/Dec/19 ] |
|
diag.zip |
| Comment by Eric Milkie [ 16/Dec/19 ] |
|
Thanks Bruce. This was indeed compiled locally from the 3.4.23 tag. I ran a single node replica set, which I figured would have the same behavior in the oplog, except for the read load (standalone would not have an oplog at all). |
| Comment by Bruce Lucas (Inactive) [ 16/Dec/19 ] |
|
Thanks Eric. There does seem to be a very slight increase in allocated minus cache; will be interested to see results after running for a few days. Can you attach latest ftdc data? I spotted a couple of differences between this and the customer issue, significance unknown:
|
| Comment by Eric Milkie [ 13/Dec/19 ] |
|
diag.zip |
| Comment by Eric Milkie [ 12/Dec/19 ] |
|
That's a good idea, I'll rerun and collect that for you. |
| Comment by Bruce Lucas (Inactive) [ 12/Dec/19 ] |
|
milkie, do you have the ftdc data from this run? I'd be interested in comparing it with the data from the customer that hit this issue. |
| Comment by Eric Milkie [ 12/Dec/19 ] |
|
I set up a 3.4.23 server and ran it with a 3 GB oplog, then set up several shell workloads to fill up the oplog. I ran it for a couple days and ran the VTune memory allocation analyzer on it. Unfortunately, I was unable to reproduce any heap memory growth. |
| Comment by Bruce Lucas (Inactive) [ 20/Nov/19 ] |
|
Is this a duplicate of |