[SERVER-26001] Insert workload stalled at 96% cache utilization Created: 07/Sep/16  Updated: 26/Oct/16  Resolved: 17/Oct/16

Status: Closed
Project: Core Server
Component/s: WiredTiger
Affects Version/s: 3.2.9
Fix Version/s: None

Type: Bug Priority: Critical - P2
Reporter: Bruce Lucas (Inactive) Assignee: David Hows
Resolution: Duplicate Votes: 2
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: PNG File david-screenshot.png, File diagnostic.tar, File repro-32-insert.sh, Text File stacks1.txt, Text File stacks2.txt, Text File stacks3.txt, PNG File stuck.png
Issue Links:
Duplicate
duplicates SERVER-25974 Application threads stall for extende... Closed
Backwards Compatibility: Fully Compatible
Sprint: Storage 2016-10-31
Participants:

 Description   

Note: this may be the same underlying issue as SERVER-25974, but some of the metrics appear to be different, and this ticket has a specific, simple repro that is not clearly tied to the customer issue seen in SERVER-25974, so I'm opening it as a separate ticket for now, until/unless we can demonstrate that they are the same issue.

The insert repro workload from SERVER-20306, also attached to this ticket as repro-32-insert.sh, gets stuck with cache utilization at about 96%:

  • problems start at A; by B the workload is pretty much completely stuck
  • ftdc stalls ("ftdc samples/s") suggest that application threads are sometimes getting stuck for extended periods doing evictions
  • application threads seem to be starved for work to do:
    • "pages walked for eviction" has gone up but "pages seen by eviction walks" has gone down
    • application threads are often finding the queue empty
    • the number of pages evicted by application threads is not high

I've also attached stack traces captured during the stuck period, although I don't think they give much insight.
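
For illustration only, the following pymongo sketch approximates an insert load of this shape. The authoritative workload is the attached repro-32-insert.sh; the collection names, document size, and counts below are assumptions, not values taken from that script. It is intended to run against a mongod started with --syncdelay 0 (checkpoints disabled), the configuration noted in the comments.

    # Hypothetical approximation of the attached repro workload (not the
    # actual script): bulk-insert padded documents into two collections
    # until well past the WiredTiger cache size.
    from pymongo import MongoClient

    client = MongoClient("localhost", 27017)
    db = client["test"]

    PAD = "x" * 4096        # per-document payload; size is illustrative
    BATCH = 1000            # documents per bulk insert
    BATCHES = 5000          # roughly 20GB per collection, enough to exceed the cache

    for coll_name in ("ping", "pong"):          # two load stages; names assumed
        coll = db[coll_name]
        for batch in range(BATCHES):
            docs = [{"_id": batch * BATCH + i, "pad": PAD} for i in range(BATCH)]
            coll.insert_many(docs, ordered=False)
            if batch % 100 == 0:
                print(coll_name, batch * BATCH, "documents inserted")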



 Comments   
Comment by David Hows [ 09/Sep/16 ]

> My interpretation of the graph you uploaded is that the behavior is better with the more recent WiredTiger code.

Correct

> It appears to me as though loading the first table finishes at around 3:35, where there is a small blip in insert operations. Loading the second table finishes at around 3:53. There is no evidence of dropouts during the run.

Correct

> Whereas when run without the most recent version of WiredTiger, the test was reliably failing to finish the load phase.

Correct

Comment by Alexander Gorrod [ 08/Sep/16 ]

Thanks david.hows. My interpretation of the graph you uploaded is that the behavior is better with the more recent WiredTiger code. It appears to me as though loading the first table finishes at around 3:35, where there is a small blip in insert operations. Loading the second table finishes at around 3:53. There is no evidence of dropouts during the run. Whereas when run without the most recent version of WiredTiger, the test was reliably failing to finish the load phase.

Does that match your reading of the timeseries?

Comment by David Hows [ 08/Sep/16 ]

I've attached a screenshot of a timeseries of a run combining MongoDB 3.2.9 and WT Develop.

Comment by Michael Cahill (Inactive) [ 08/Sep/16 ]

A little more detail: I think what we're seeing with 3.2.9 is a special case of what alexander.gorrod saw in WT-2894, where eviction caused stalls between checkpoints. Because the configuration here disables checkpoints with --syncdelay 0, if the server gets into this state it never recovers, or takes an extremely long time to do so.

In particular, this is caused by a bug in the selection of eviction candidates: the eviction server normally does not retry eviction of a page if the transaction state hasn't changed since the last time it tried. This leads to a busy wait in which all active transactions are blocked waiting for eviction, and eviction is blocked waiting for something to change in the transaction state.

This issue is resolved in WiredTiger's develop branch, and we plan to include those changes in 3.2.10.
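
To make the failure mode concrete, below is a toy model of that busy wait; this is illustrative Python, not WiredTiger code, and all names in it are invented. The "eviction server" skips its work whenever the transaction state has not changed since its last attempt, while the "application threads" are blocked waiting for eviction to free cache space before they can change the transaction state, so neither side ever makes progress.

    # Toy model (illustrative only, not WiredTiger code) of the busy wait:
    # eviction skips pages when the transaction state is unchanged, while
    # application threads wait on eviction before they can change that state.
    import threading
    import time

    lock = threading.Lock()
    txn_generation = 0        # bumped whenever transaction state changes
    last_seen_generation = 0  # what the eviction server observed last time
    cache_full = True         # application threads stall while this is True
    pages_evicted = 0
    deadline = time.time() + 2.0   # let the demonstration run for two seconds

    def eviction_server():
        global last_seen_generation, cache_full, pages_evicted
        while time.time() < deadline:
            with lock:
                if txn_generation == last_seen_generation:
                    continue                  # nothing changed: skip the page
                last_seen_generation = txn_generation
                pages_evicted += 1
                cache_full = False            # made room in the cache

    def application_thread():
        global txn_generation, cache_full
        while time.time() < deadline:
            with lock:
                if cache_full:
                    continue                  # blocked waiting on eviction
                txn_generation += 1           # this would unblock eviction...
                cache_full = True             # ...but we never get here

    threads = [threading.Thread(target=eviction_server)]
    threads += [threading.Thread(target=application_thread) for _ in range(4)]
    for t in threads: t.start()
    for t in threads: t.join()
    print("pages evicted:", pages_evicted)    # prints 0: neither side progresses

Run as-is this prints "pages evicted: 0"; any change that lets the eviction server reconsider such pages (as the develop-branch fix presumably does) breaks the cycle.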

Comment by David Hows [ 08/Sep/16 ]

I've run this and it reproduced readily for me on 3.2.9: during the first load stage (ping) I saw bulk inserts taking over 60 seconds rather than the normal few hundred milliseconds. This occurred at around the 9.1GB-inserted mark, which is also pretty indicative.

I've also made three full runs of this on v3.2 with the tip of WT develop; these did not reproduce the delays at all, so I believe the issue is already resolved in WT.

Comment by Bruce Lucas (Inactive) [ 07/Sep/16 ]

That repro doesn't seem to be completely consistent. I've run it a few times with a few variations on size and number of collections, and it only got completely stuck like that twice. The other times there would be significant stalls of maybe a minute or so, but it recovered and finished the run.
