SERVER-26001 (Core Server)

Insert workload stalled at 96% cache utilization

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical - P2
    • Resolution: Duplicate
    • Affects Version/s: 3.2.9
    • Fix Version/s: None
    • Component/s: WiredTiger
    • Labels: None
    • Backwards Compatibility: Fully Compatible
    • Sprint: Storage 2016-10-31

      Description

      Note: this may be the same underlying issue as SERVER-25974. However, some of the metrics appear to differ, and this ticket has a specific, simple repro that is not clearly tied to the customer issue seen on SERVER-25974, so I am opening it as a separate ticket until we can demonstrate that the two are the same issue.

      The insert repro workload from SERVER-20306, also attached to this ticket as repro-32-insert.sh, gets stuck with cache utilization at about 96%:

      • problems start at A; the workload is almost completely stuck at B
      • ftdc stalls ("ftdc samples/s") suggest that application threads are sometimes getting stuck for extended periods doing evictions
      • application threads seem to be starved for work to do:
        • "pages walked for eviction" has gone up but "pages seen by eviction walks" has gone down
        • application threads are often finding the queue empty
        • the number of pages evicted by application threads is not high
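
The observations above can be checked numerically from successive snapshots of the WiredTiger cache statistics. The following is a minimal sketch, not the tooling used for this ticket: the helper names are hypothetical, the snapshot values are invented for illustration, and the key names "bytes currently in the cache" and "maximum bytes configured" are assumed to match what serverStatus reports under wiredTiger.cache in your version. The two eviction counters are the ones named in the list above.

```python
# Hypothetical helpers; all snapshot values below are invented for illustration
# and are NOT taken from the diagnostic data attached to this ticket.

WALKED = "pages walked for eviction"
SEEN = "pages seen by eviction walks"


def cache_utilization(cache):
    """Fraction of the configured cache that is currently in use."""
    return cache["bytes currently in the cache"] / cache["maximum bytes configured"]


def counter_deltas(before, after, names):
    """Change in each cumulative counter between two snapshots."""
    return {name: after[name] - before[name] for name in names}


def walks_finding_less(prev_interval, curr_interval):
    """True when eviction walked more pages than in the previous interval
    but saw fewer of them, i.e. the walks are doing more work for less result."""
    return (curr_interval[WALKED] > prev_interval[WALKED]
            and curr_interval[SEEN] < prev_interval[SEEN])


# Three made-up snapshots of the cache section, taken at equal intervals.
s0 = {"bytes currently in the cache": 800, "maximum bytes configured": 1000,
      WALKED: 1000, SEEN: 900}
s1 = {"bytes currently in the cache": 900, "maximum bytes configured": 1000,
      WALKED: 2000, SEEN: 1800}
s2 = {"bytes currently in the cache": 960, "maximum bytes configured": 1000,
      WALKED: 4000, SEEN: 2000}

healthy = counter_deltas(s0, s1, [WALKED, SEEN])
stuck = counter_deltas(s1, s2, [WALKED, SEEN])

print("cache utilization:", cache_utilization(s2))
print("walks finding less:", walks_finding_less(healthy, stuck))
```

Comparing per-interval deltas rather than raw values matters because these counters are cumulative, so a raw value never goes down even when the rate does.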

      I've also attached stack traces captured during the stuck period, although I don't think they give much insight.

        Attachments

        1. david-screenshot.png
          311 kB
        2. diagnostic.tar
          500 kB
        3. repro-32-insert.sh
          1 kB
        4. stacks1.txt
          168 kB
        5. stacks2.txt
          183 kB
        6. stacks3.txt
          183 kB
        7. stuck.png
          236 kB

              People

              • Votes: 2
              • Watchers: 23
