  1. Core Server
  2. SERVER-26001

Insert workload stalled at 96% cache utilization

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Critical - P2
    • None
    • Affects Version/s: 3.2.9
    • Component/s: WiredTiger
    • Labels:
      None
    • Fully Compatible
    • Storage 2016-10-31

      Note: this may be the same underlying issue as SERVER-25974, but some of the metrics appear to differ, and this ticket has a simple, specific repro that is not clearly tied to the customer issue seen on SERVER-25974. Opening it as a separate ticket for now, unless and until we can demonstrate that they are the same issue.

      The insert repro workload from SERVER-20306, also attached to this ticket as repro-32-insert.sh, gets stuck with cache utilization at about 96%:

      • problems start at A, and the workload is almost completely stuck by B
      • ftdc stalls ("ftdc samples/s") suggest that application threads are sometimes getting stuck for extended periods doing evictions
      • application threads seem to be starved for work to do:
        • "pages walked for eviction" has gone up but "pages seen by eviction walks" has gone down
        • application threads are often finding the queue empty
        • pages evicted by application threads is not high
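
One way to check the observations above outside of the ftdc charts is to diff two snapshots of the WiredTiger cache statistics. The following is a minimal sketch, not the tooling used for this ticket. It assumes each snapshot is a plain dict shaped like the "cache" section of serverStatus, with counter names as quoted in this ticket plus "bytes currently in the cache" and "maximum bytes configured". The sample numbers are invented for illustration, and the helper names (cache_utilization, counter_deltas, walk_yield) are hypothetical.

```python
# Sketch: compare two snapshots of WiredTiger cache statistics.
# ASSUMPTION: snapshots are dicts keyed like serverStatus()["wiredTiger"]["cache"].
# The sample values below are made up; they are NOT data from this ticket.

EVICTION_COUNTERS = [
    "pages walked for eviction",
    "pages seen by eviction walks",
    "pages evicted by application threads",
]


def cache_utilization(cache):
    """Return the fraction of the configured cache that is currently in use."""
    return cache["bytes currently in the cache"] / cache["maximum bytes configured"]


def counter_deltas(before, after, names):
    """Return how much each cumulative counter grew between two snapshots."""
    return {name: after[name] - before[name] for name in names}


def walk_yield(deltas):
    """Ratio of pages seen to pages walked in the interval (hypothetical heuristic).

    A falling ratio would match the pattern described above: more walking,
    fewer pages seen by the walks.
    """
    walked = deltas["pages walked for eviction"]
    return deltas["pages seen by eviction walks"] / walked if walked else 0.0


# Invented sample snapshots taken some interval apart.
before = {
    "bytes currently in the cache": 900_000_000,
    "maximum bytes configured": 1_000_000_000,
    "pages walked for eviction": 10_000,
    "pages seen by eviction walks": 8_000,
    "pages evicted by application threads": 100,
}
after = {
    "bytes currently in the cache": 960_000_000,
    "maximum bytes configured": 1_000_000_000,
    "pages walked for eviction": 60_000,
    "pages seen by eviction walks": 9_000,
    "pages evicted by application threads": 120,
}

deltas = counter_deltas(before, after, EVICTION_COUNTERS)
print(f"cache utilization: {cache_utilization(after):.0%}")
print(f"walk yield in interval: {walk_yield(deltas):.3f}")
for name, delta in deltas.items():
    print(f"{name}: +{delta}")
```

In a live deployment the two dicts would come from successive serverStatus calls; the exact key names should be checked against the server version in use, since they are assumed here.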

      I've also attached stack traces captured during the stuck period, although I don't think they give much insight.

        1. david-screenshot.png
          311 kB
        2. diagnostic.tar
          500 kB
        3. repro-32-insert.sh
          1 kB
        4. stacks1.txt
          168 kB
        5. stacks2.txt
          183 kB
        6. stacks3.txt
          183 kB
        7. stuck.png
          236 kB

            Assignee:
            david.hows David Hows
            Reporter:
            bruce.lucas@mongodb.com Bruce Lucas (Inactive)
            Votes:
            2
            Watchers:
            22

              Created:
              Updated:
              Resolved: