  1. Core Server
  2. SERVER-27262

WiredTiger cache usage stays higher than normal, so the eviction thread never sleeps


    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Gone away
    • Affects Version/s: 3.2.9
    • Fix Version/s: None
    • Component/s: WiredTiger
    • Labels:
      None
    • Operating System:
      ALL

      Description

      In a two-shard MongoDB cluster, one primary's WiredTiger cache usage stays at about 90%.
      After examining the stack traces, the eviction thread never sleeps and consumes one CPU core all the time.

      # top
      top - 14:47:31 up 87 days,  3:19,  1 user,  load average: 1.23, 1.26, 1.22
      Tasks: 683 total,   1 running, 682 sleeping,   0 stopped,   0 zombie
      %Cpu0  :  1.0 us,  1.0 sy,  0.0 ni, 98.1 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
      ...
      %Cpu10 :  1.0 us,  0.0 sy,  0.0 ni, 99.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
      %Cpu11 :100.0 us,  0.0 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st <== 
      %Cpu12 :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
      ...
      
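To pick out the saturated core from per-CPU `top` output like the lines above, a small parser can help. This is only a sketch: the regular expression assumes the `%CpuN : ... us, ... sy` layout shown in this report, and the `busy_cores` name and the 90% threshold are illustrative choices, not anything prescribed by `top` or MongoDB.

```python
import re

# Matches per-CPU lines as printed by top, e.g. "%Cpu11 :100.0 us,  0.0 sy, ..."
CPU_LINE = re.compile(r"%Cpu(\d+)\s*:\s*([\d.]+)\s+us,\s*([\d.]+)\s+sy")


def busy_cores(top_output, threshold=90.0):
    """Return (core, user + system %) for every core at or above the threshold."""
    busy = []
    for line in top_output.splitlines():
        m = CPU_LINE.search(line)
        if m:
            core = int(m.group(1))
            used = float(m.group(2)) + float(m.group(3))
            if used >= threshold:
                busy.append((core, used))
    return busy


# Per-CPU lines copied from the top output in this report.
sample = """\
%Cpu0  :  1.0 us,  1.0 sy,  0.0 ni, 98.1 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu10 :  1.0 us,  0.0 sy,  0.0 ni, 99.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu11 :100.0 us,  0.0 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu12 :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
"""

print(busy_cores(sample))  # [(11, 100.0)]
```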

      We run a lot of read queries (about 27000 docs x 30 times) on both primaries.
      One primary is fine, but the other still consumes one CPU core evicting pages.

      It looks like cache usage never drops to a stable level (around 80~85%), so the eviction thread never stops scanning pages. I don't know why cache usage never drops to a stable level.
      WiredTiger statistics report that a lot of blocks are read into the WiredTiger cache (270MB per 10 sec).
      The weird thing is that there are no disk reads, no major faults, and not many minor faults on either primary. All system metrics (except CPU) are almost the same as on the other (stable) primary.
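The two figures quoted above (cache fill percentage and read-in rate) can be derived from two samples of the `wiredTiger.cache` section of `serverStatus`. The sketch below assumes the samples are already available as plain dictionaries; the helper names are hypothetical, the 80% value is the assumed WiredTiger default for `eviction_target`, and the sample numbers are made up to mirror this report (about 90% full, 270MB read in over 10 seconds), not taken from the attached diagnostics.

```python
# Stat names as they appear under serverStatus().wiredTiger.cache.
USED = "bytes currently in the cache"
MAX = "maximum bytes configured"
READ = "bytes read into cache"

# Assumption: WiredTiger's default eviction_target is 80% of the cache.
EVICTION_TARGET_PCT = 80.0

MIB = 1024 * 1024


def cache_fill_pct(cache):
    """Percentage of the configured cache that is currently in use."""
    return 100.0 * cache[USED] / cache[MAX]


def read_in_mb_per_sec(before, after, interval_sec):
    """Average rate at which data was read into the cache between two samples."""
    return (after[READ] - before[READ]) / interval_sec / MIB


# Made-up samples taken 10 seconds apart (10 GiB cache, 9 GiB in use).
before = {USED: 9 * 1024 * MIB, MAX: 10 * 1024 * MIB, READ: 5000 * MIB}
after = {USED: 9 * 1024 * MIB, MAX: 10 * 1024 * MIB, READ: 5270 * MIB}

fill = cache_fill_pct(after)
rate = read_in_mb_per_sec(before, after, 10)
print(f"cache fill: {fill:.1f}% (above target: {fill > EVICTION_TARGET_PCT})")
print(f"read into cache: {rate:.1f} MB/s")
```

A fill percentage that stays above the eviction target across many samples, as described here, would be consistent with the eviction thread never going idle.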

      According to the stack traces, one thread is in "__tree_walk_internal()"; actually there are 2 such threads, and they take turns consuming one CPU core.
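One way to check which threads are inside `__tree_walk_internal()` across repeated stack dumps is to scan each dump for that frame. The sketch below assumes gdb-style output (a `Thread N (...)` header followed by `#0`, `#1`, ... frame lines); the layout of the attached stack.tar.gz is not known here, and the sample dump and the `threads_in_function` helper are purely illustrative.

```python
import re

# Assumed gdb-style thread header, e.g. "Thread 3 (Thread 0x7f01 (LWP 1003)):"
THREAD_HEADER = re.compile(r"^Thread (\d+) ")


def threads_in_function(dump, func):
    """Return the thread numbers whose stack contains a frame mentioning func."""
    hits = []
    current = None
    for line in dump.splitlines():
        m = THREAD_HEADER.match(line)
        if m:
            current = int(m.group(1))
        elif current is not None and func in line and current not in hits:
            hits.append(current)
    return hits


# Made-up dump for illustration only; not taken from the attachments.
sample_dump = """\
Thread 3 (Thread 0x7f01 (LWP 1003)):
#0  __tree_walk_internal () from mongod
Thread 2 (Thread 0x7f02 (LWP 1002)):
#0  pthread_cond_timedwait () from libpthread.so.0
Thread 1 (Thread 0x7f03 (LWP 1001)):
#0  __tree_walk_internal () from mongod
"""

print(threads_in_function(sample_dump, "__tree_walk_internal"))  # [3, 1]
```

Running this over several dumps taken a few seconds apart would show whether the same two threads alternate in that function.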

        Attachments

        1. current-op.json
          8 kB
        2. diagnostics-metrics.tar.gz
          27.12 MB
        3. iostat.txt
          16 kB
        4. mongostat.txt
          10 kB
        5. pagefault-sarB.txt
          8 kB
        6. stack.tar.gz
          65 kB
        7. vmstat.txt
          6 kB
        8. wiredtiger-cache-metrics-1min-delta.xlsx
          13 kB
        9. wiredtiger-cacheusage.png
          22 kB


            People

            Assignee:
            kelsey.schubert Kelsey T Schubert
            Reporter:
            matt.lee Matt SeongUck Lee