Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-16997

wiredtiger's "bytes currently in the cache" exceeds "maximum bytes configured" and eviction threads are deadlocked

    • Type: Icon: Bug Bug
    • Resolution: Done
    • Priority: Icon: Major - P3 Major - P3
    • 3.0.0-rc8
    • Affects Version/s: None
    • Component/s: Storage
    • Labels:
      None
    • Fully Compatible
    • ALL
    • 0

      not sure how to reproduce, it happened on dan's test server after running a heavy workload.

      dan:PRIMARY> db.serverStatus().wiredTiger.cache
      {
      	"tracked dirty bytes in the cache" : 108113698,
      	"bytes currently in the cache" : NumberLong("2305843128198539"),
      	"maximum bytes configured" : 2147483648,
      	"bytes read into cache" : 74306876990,
      	"bytes written from cache" : 55231308853,
      	"pages evicted by application threads" : 100627896064,
      	"checkpoint blocked page eviction" : 1930,
      	"unmodified pages evicted" : 812357,
      	"page split during eviction deepened the tree" : 4,
      	"modified pages evicted" : 1753265,
      	"pages selected for eviction unable to be evicted" : 1651488,
      	"pages evicted because they exceeded the in-memory maximum" : 1619,
      	"pages evicted because they had chains of deleted items" : 253949,
      	"failed eviction of pages that exceeded the in-memory maximum" : 3363,
      	"hazard pointer blocked page eviction" : 3565,
      	"internal pages evicted" : 385,
      	"maximum page size at eviction" : 147633879,
      	"eviction server candidate queue empty when topping up" : 28707,
      	"eviction server candidate queue not empty when topping up" : 255279,
      	"eviction server evicting pages" : 185,
      	"eviction server populating queue, but not evicting pages" : 283801,
      	"eviction server unable to reach eviction goal" : 1,
      	"pages split during eviction" : 6618,
      	"pages walked for eviction" : 159135276,
      	"in-memory page splits" : 545,
      	"tracked dirty pages in the cache" : 588,
      	"pages currently held in the cache" : 1145,
      	"pages read into cache" : 2401943,
      	"pages written from cache" : 1840049
      }
      

      monogstat shows that the cache is totally full of dirty bytes:

        2723  7960   5648     *0       0     0|0    72.8  100.0       0  3.4G 2.8G   0|0  8|84   16m    14m  103 dan  PRI 13:28:10
        3779 10952   7308     *0       0     1|0    89.3   99.9       0  3.4G 2.8G   0|0 23|66   21m    19m  103 dan  PRI 13:28:12
      insert query update delete getmore command % dirty % used flushes vsize  res qr|qw ar|aw netIn netOut conn set repl     time
        1787  5054   3349     *0       0     0|0   100.1  100.9       0  3.4G 2.8G   0|0 28|72   10m     9m  103 dan  PRI 13:28:14
          *0    *0     *0     *0       0     1|0   100.1  100.9       0  3.4G 2.8G   0|0 27|73  160b     9k  103 dan  PRI 13:28:16
          *0    *0     *0     *0       0     0|0   100.1  100.9       0  3.4G 2.8G   0|0 27|73    1k     9k  103 dan  PRI 13:28:18
          *0    *0     *0     *0       0     0|0   100.1  100.9       0  3.4G 2.8G   0|0 27|73   39b     8k  103 dan  PRI 13:28:20
           7    12     12     *0       0     1|0   100.2  100.9       0  3.4G 2.8G   0|0 20|80   31k    35k  103 dan  PRI 13:28:22
           6    20     13     *0       0     0|0   100.2  101.0       0  3.4G 2.8G   0|0 22|78   39k    42k  103 dan  PRI 13:28:24
          *0     7      4     *0       0     1|0   100.2  101.0       0  3.4G 2.8G   0|0 20|80    9k    22k  103 dan  PRI 13:28:26
          *0    *0     *0     *0       0     0|0   100.2  101.0       0  3.4G 2.8G   0|0 21|79  921b     8k  103 dan  PRI 13:28:28
          *0    *0     *0     *0       0     0|0   100.2  101.0       0  3.4G 2.8G   0|0 21|79   39b     8k  103 dan  PRI 13:28:30
          *0    *0     *0     *0       0     1|0   100.2  101.0       0  3.4G 2.8G   0|0 21|79   95b     8k  103 dan  PRI 13:28:32
      

        1. gdb.txt
          1.56 MB
        2. gdb2.txt
          1.06 MB
        3. Screen Shot 2015-01-23 at 2.27.22 pm.png
          Screen Shot 2015-01-23 at 2.27.22 pm.png
          112 kB
        4. cache_accounting.png
          cache_accounting.png
          163 kB
        5. capped_test.js
          0.8 kB

            Assignee:
            alexander.gorrod@mongodb.com Alexander Gorrod
            Reporter:
            mikeo@mongodb.com Michael O'Brien
            Votes:
            0 Vote for this issue
            Watchers:
            14 Start watching this issue

              Created:
              Updated:
              Resolved: