  Core Server / SERVER-28589

Reduce how often threads wait before flushing sizeStorer with WiredTiger

Details

    • Type: Improvement
    • Resolution: Won't Fix
    • Priority: Major - P3
    • None
    • None
    • Storage, WiredTiger
    • Fully Compatible
    • v3.4
    • Storage 2017-11-13

    Description

      With the current implementation of the size storer, there is one thread in:

      Thread 410 (Thread 0x7fb7892de700 (LWP 2741)):
      #0  0x00007fb794c2fa82 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
      #1  0x0000000001a6813b in __wt_cond_wait_signal ()
      #2  0x0000000001a3e79c in __wt_cache_eviction_worker ()
      #3  0x0000000001a9ab16 in __session_begin_transaction ()
      #4  0x00000000010a8775 in mongo::WiredTigerSizeStorer::syncCache(bool) ()
      #5  0x00000000010908a6 in mongo::WiredTigerKVEngine::syncSizeInfo(bool) const ()
      #6  0x0000000001091282 in mongo::WiredTigerKVEngine::haveDropsQueued() const ()
      #7  0x00000000010a7299 in mongo::WiredTigerSessionCache::releaseSession(mongo::WiredTigerSession*) ()
      #8  0x00000000010a779d in mongo::WiredTigerSessionCache::waitUntilDurable(bool) ()
      #9  0x0000000001094bcb in mongo::WiredTigerKVEngine::WiredTigerJournalFlusher::run() ()
      #10 0x00000000012b6c70 in mongo::BackgroundJob::jobBody() ()
      

      That is, it holds the WiredTigerSizeStorer cursor lock and is blocked waiting for space in the WiredTiger cache.

      There are 34 other threads waiting for the WiredTigerSizeStorer cursor lock (NB: this call stack is from a 3.2 mongod):

      Thread 322 (Thread 0x7fb768a65700 (LWP 6178)):
      #0  0x00007fb794c31f4d in __lll_lock_wait () from /lib64/libpthread.so.0
      #1  0x00007fb794c2dd02 in _L_lock_791 () from /lib64/libpthread.so.0
      #2  0x00007fb794c2dc08 in pthread_mutex_lock () from /lib64/libpthread.so.0
      #3  0x00000000010a8431 in mongo::WiredTigerSizeStorer::syncCache(bool) ()
      #4  0x00000000010908a6 in mongo::WiredTigerKVEngine::syncSizeInfo(bool) const ()
      #5  0x0000000001091282 in mongo::WiredTigerKVEngine::haveDropsQueued() const ()
      #6  0x00000000010a7299 in mongo::WiredTigerSessionCache::releaseSession(mongo::WiredTigerSession*) ()
      #7  0x00000000010a4d57 in mongo::WiredTigerRecoveryUnit::~WiredTigerRecoveryUnit() ()
      #8  0x00000000010a4de1 in mongo::WiredTigerRecoveryUnit::~WiredTigerRecoveryUnit() ()
      #9  0x0000000000d20640 in mongo::OperationContextImpl::~OperationContextImpl() ()
      #10 0x0000000000d206d1 in mongo::OperationContextImpl::~OperationContextImpl() ()
      #11 0x00000000009b76e7 in mongo::MyMessageHandler::process(mongo::Message&, mongo::AbstractMessagingPort*) ()
      

      So we are essentially serializing threads through the size storer sync. The thread holding the lock is waiting for space in the cache, which may block for an extended time.

      Ideally, in this case the 34 threads waiting for the lock would skip flushing the size storer; they don't need it to be flushed for correctness.

          People

            xiangyu.yao@mongodb.com Xiangyu Yao (Inactive)
            alexander.gorrod@mongodb.com Alexander Gorrod