Details
Improvement
Resolution: Won't Fix
Major - P3
Fully Compatible
v3.4
Storage 2017-11-13
Description
With the current implementation of the size storer, threads can end up serialized behind the size storer sync.
In the stacks captured here, one thread is in:
Thread 410 (Thread 0x7fb7892de700 (LWP 2741)):
#0 0x00007fb794c2fa82 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x0000000001a6813b in __wt_cond_wait_signal ()
#2 0x0000000001a3e79c in __wt_cache_eviction_worker ()
#3 0x0000000001a9ab16 in __session_begin_transaction ()
#4 0x00000000010a8775 in mongo::WiredTigerSizeStorer::syncCache(bool) ()
#5 0x00000000010908a6 in mongo::WiredTigerKVEngine::syncSizeInfo(bool) const ()
#6 0x0000000001091282 in mongo::WiredTigerKVEngine::haveDropsQueued() const ()
#7 0x00000000010a7299 in mongo::WiredTigerSessionCache::releaseSession(mongo::WiredTigerSession*) ()
#8 0x00000000010a779d in mongo::WiredTigerSessionCache::waitUntilDurable(bool) ()
#9 0x0000000001094bcb in mongo::WiredTigerKVEngine::WiredTigerJournalFlusher::run() ()
#10 0x00000000012b6c70 in mongo::BackgroundJob::jobBody() ()
That is, it holds the WiredTigerSizeStorer cursor lock and is blocked waiting for space in the WiredTiger cache.
There are 34 other threads waiting for the WiredTigerSizeStorer cursor lock:
(NB: This call stack is from a 3.2 mongod)
Thread 322 (Thread 0x7fb768a65700 (LWP 6178)):
#0 0x00007fb794c31f4d in __lll_lock_wait () from /lib64/libpthread.so.0
#1 0x00007fb794c2dd02 in _L_lock_791 () from /lib64/libpthread.so.0
#2 0x00007fb794c2dc08 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3 0x00000000010a8431 in mongo::WiredTigerSizeStorer::syncCache(bool) ()
#4 0x00000000010908a6 in mongo::WiredTigerKVEngine::syncSizeInfo(bool) const ()
#5 0x0000000001091282 in mongo::WiredTigerKVEngine::haveDropsQueued() const ()
#6 0x00000000010a7299 in mongo::WiredTigerSessionCache::releaseSession(mongo::WiredTigerSession*) ()
#7 0x00000000010a4d57 in mongo::WiredTigerRecoveryUnit::~WiredTigerRecoveryUnit() ()
#8 0x00000000010a4de1 in mongo::WiredTigerRecoveryUnit::~WiredTigerRecoveryUnit() ()
#9 0x0000000000d20640 in mongo::OperationContextImpl::~OperationContextImpl() ()
#10 0x0000000000d206d1 in mongo::OperationContextImpl::~OperationContextImpl() ()
#11 0x00000000009b76e7 in mongo::MyMessageHandler::process(mongo::Message&, mongo::AbstractMessagingPort*) ()
So we are essentially serializing threads through the size storer sync. The thread holding the lock is waiting for space in the cache, and that wait may last for an extended time.
Ideally, in this case the 34 threads waiting for the lock would skip flushing the size storer, since they don't need it to be flushed for correctness.
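One way to picture the suggested behaviour is a try-lock on the cursor lock, so that opportunistic callers return immediately instead of queueing behind a flush that is already in progress. The following is only a minimal sketch under stated assumptions: SizeStorerSketch, cursorMutex, flushes, mustFlush and contendedSyncIsSkipped are hypothetical names introduced for illustration, and none of this is the actual mongo::WiredTigerSizeStorer code.

```cpp
#include <mutex>
#include <thread>

// Hypothetical stand-in for the size storer. The names and structure are
// illustrative and do not mirror mongo::WiredTigerSizeStorer.
struct SizeStorerSketch {
    std::mutex cursorMutex;  // plays the role of the cursor lock
    int flushes = 0;         // guarded by cursorMutex

    // Returns true if this call flushed, false if the flush was skipped
    // because another thread already held the cursor lock.
    bool syncCache(bool mustFlush) {
        std::unique_lock<std::mutex> lk(cursorMutex, std::defer_lock);
        if (mustFlush) {
            lk.lock();  // callers that need the flush still wait their turn
        } else if (!lk.try_lock()) {
            return false;  // opportunistic caller: someone else is already flushing
        }
        ++flushes;  // placeholder for the real write, which may block on cache space
        return true;
    }
};

// Demonstrates the contended case: while one thread holds the cursor lock,
// an opportunistic sync from a second thread returns without flushing.
bool contendedSyncIsSkipped() {
    SizeStorerSketch storer;
    bool flushed = true;
    std::unique_lock<std::mutex> holder(storer.cursorMutex);
    std::thread other([&] { flushed = storer.syncCache(false); });
    other.join();
    return !flushed && storer.flushes == 0;
}
```

In this sketch, a caller on the session-release path would pass mustFlush = false and carry on when the call returns false, while a caller that genuinely needs the size information written would pass true and accept the wait.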