[SERVER-20174] WT snapshot threads significantly impact performance Created: 28/Aug/15 Updated: 06/Dec/22 Resolved: 28/Sep/17 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | WiredTiger |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Martin Bligh | Assignee: | Backlog - Storage Execution Team |
| Resolution: | Done | Votes: | 1 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Attachments: | |
| Issue Links: | |
| Assigned Teams: | Storage Execution |
| Operating System: | ALL |
| Participants: | |
| Description |
|
Under heavy insert load on a 2-node replica set, WT eviction appears to hang on the secondary. Per Michael, this seems to be related to a snapshot pinning content in memory so that it cannot be evicted.
|
| Comments |
| Comment by Eric Milkie [ 28/Sep/17 ] | ||
|
Snapshot threads are being removed by | ||
| Comment by Martin Bligh [ 26/Oct/15 ] | ||
|
I'll attach a compounded script "benchSet" - if you run it for longer than about 10s without disabling snapshots, it hangs. | ||
| Comment by Michael Cahill (Inactive) [ 15/Oct/15 ] | ||
|
keith.bostic, it would be good to understand what is causing those slow operations on the primary as well. I'm hoping that this is no longer reproducing because snapshots are no longer keeping transaction IDs pinned... | ||
| Comment by Keith Bostic (Inactive) [ 14/Oct/15 ] | ||
|
martin.bligh, I got back to this one today; I did my testing on an AWS c3.4xlarge system, so 16 vCPUs. I continue to see dropouts on the primary (about 200 out of a total of 6000 reporting periods), but I'm only seeing two dropouts on the secondary in that same period. As I understand it, the concern with this ticket was the secondary dropouts, which I'm no longer seeing. Can you verify whether you're still seeing dropouts on the secondary? michael.cahill, should we be looking at dropouts on the primary, too, or are they currently expected in this test? | ||
| Comment by Keith Bostic (Inactive) [ 17/Sep/15 ] | ||
|
martin.bligh, just to let you know, using the wiredtiger develop branch, I ran this job to completion – lots of stalls and periods of low inserts, but it never stopped. I'm out of pocket for the next couple of days, but I'll dig in deeper and understand the load in more detail soon. | ||
| Comment by Michael Cahill (Inactive) [ 01/Sep/15 ] | ||
|
BTW, keith.bostic, this is a case where the lookaside changes will be exercised, so it would be good to see what happens with MongoDB master + WT develop, once develop is stable. | ||
| Comment by Martin Bligh [ 31/Aug/15 ] | ||
|
Yup, I think that workaround avoids the issue. | ||
| Comment by Michael Cahill (Inactive) [ 31/Aug/15 ] | ||
|
keith.bostic, I have scripts from martin.bligh to run this workload – I'll attach them here. Can you please take a look? martin.bligh, did you try dan@10gen.com's suggestion of running mongod --setParameter=enableReplSnapshotThread=false? It would be good to know whether that worked around the issue. In one window:
And in another:
|
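For readers who want to try the workaround discussed in the comments, here is a minimal sketch of how the command line for each replica-set member might be assembled with the snapshot thread disabled. It is not the script attached to this ticket. The only detail taken from the thread is the enableReplSnapshotThread parameter; the helper name mongod_argv, the data paths, the ports, and the replica set name rs0 are assumptions made for illustration, and the sketch passes the parameter in the space-separated --setParameter name=value form.

```python
import shlex


def mongod_argv(dbpath, port, repl_set, snapshot_thread=False):
    """Build a mongod argument list for one replica-set member.

    Hypothetical helper: only the enableReplSnapshotThread parameter
    comes from the ticket; paths, ports, and set name are assumptions.
    """
    flag = "true" if snapshot_thread else "false"
    return [
        "mongod",
        "--storageEngine", "wiredTiger",
        "--replSet", repl_set,
        "--dbpath", dbpath,
        "--port", str(port),
        # Workaround from the comments: turn off the replication snapshot thread.
        "--setParameter", f"enableReplSnapshotThread={flag}",
    ]


if __name__ == "__main__":
    # Assumed layout: two members on one host, matching the 2-node replica set.
    members = [("/data/rs0-a", 27017), ("/data/rs0-b", 27018)]
    for dbpath, port in members:
        print(shlex.join(mongod_argv(dbpath, port, "rs0")))
```

Running the script only prints the two command lines; it does not start any server, so each printed line can be pasted into its own terminal window.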