Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-16065

Long flush pauses in MMAPv1

    • Fully Compatible
    • ALL

      Reducing contention on GroupCommit mutex (SERVER-15950) exposed a problem in flushing code.
      The attached VTune lock profiler analysis shows that most of the actual lock wait contention (time-wise) is because of long flush times, as visible from the following callstacks:

      Waiting Call Stack
      1 of 7: 92.8% (1258.626s of 1355.788s)
      
      mongod!boost::condition_variable::timed_wait+0xf4 - condition_variable.hpp:74
      mongod!timed_wait<boost::date_time::subsecond_duration<boost::posix_time::time_duration,
      1000ll> >+0x296 - condition_variable_fwd.hpp:72
      mongod!mongo::CondVarLockGrantNotification::wait+0xa8 - lock_state.cpp:270
      mongod!mongo::LockerImpl<(bool)1>::lock+0x210 - lock_state.cpp:454
      mongod!mongo::LockerImpl<(bool)1>::endWriteUnitOfWork+0xfc - lock_state.cpp:407
      mongod!mongo::WriteUnitOfWork::commit+0x3b - operation_context.h:176
      mongod!singleInsert+0x1bf - batch_executor.cpp:1110
      mongod!insertOne+0x1a0 - batch_executor.cpp:1038
      mongod!mongo::WriteBatchExecutor::execOneInsert+0x5f - batch_executor.cpp:1063
      mongod!mongo::WriteBatchExecutor::execInserts+0x22c - batch_executor.cpp:860
      mongod!mongo::WriteBatchExecutor::bulkExecute+0x33 - batch_executor.cpp:754
      mongod!mongo::WriteBatchExecutor::executeBatch+0x3a4 - batch_executor.cpp:263
      mongod!mongo::WriteCmd::run+0x168 - write_commands.cpp:143
      mongod!mongo::_execCommand+0x33 - dbcommands.cpp:1160
      mongod!mongo::Command::execCommand+0xc50 - dbcommands.cpp:1374
      mongod!mongo::_runCommands+0x1ef - dbcommands.cpp:1450
      mongod!runCommands+0x23 - new_find.cpp:131
      mongod!mongo::newRunQuery+0xff9 - new_find.cpp:551
      mongod!receivedQuery+0x1f2 - instance.cpp:220
      mongod!mongo::assembleResponse+0x9d0 - instance.cpp:393
      mongod!mongo::MyMessageHandler::process+0xdf - db.cpp:185
      mongod!mongo::PortMessageServer::handleIncomingMsg+0x420 -
      message_server_port.cpp:234
      libpthread-2.18.so!start_thread+0xc2 - [Unknown]:[Unknown]
      libc-2.18.so!__clone+0x6c - [Unknown]:[Unknown]
      
      Waiting Call Stack
      2 of 7: 7.1% (95.964s of 1355.788s)
      
      mongod!boost::condition_variable::timed_wait+0xf4 - condition_variable.hpp:74
      mongod!timed_wait<boost::date_time::subsecond_duration<boost::posix_time::time_duration,
      1000ll> >+0x296 - condition_variable_fwd.hpp:72
      mongod!mongo::CondVarLockGrantNotification::wait+0xa8 - lock_state.cpp:270
      mongod!mongo::LockerImpl<(bool)1>::lock+0x210 - lock_state.cpp:454
      mongod!mongo::LockerImpl<(bool)1>::lockGlobal+0x1ad - lock_state.cpp:338
      mongod!mongo::Lock::DBLock::lockDB+0x2c - d_concurrency.cpp:227
      mongod!mongo::Lock::DBLock::DBLock+0xcd - d_concurrency.cpp:217
      mongod!mongo::WriteBatchExecutor::ExecInsertsState::_lockAndCheckImpl+0x16d
      - batch_executor.cpp:954
      mongod!lockAndCheck+0x101 - batch_executor.cpp:1009
      mongod!insertOne+0x8f - batch_executor.cpp:1037
      mongod!mongo::WriteBatchExecutor::execOneInsert+0x5f - batch_executor.cpp:1063
      mongod!mongo::WriteBatchExecutor::execInserts+0x22c - batch_executor.cpp:860
      mongod!mongo::WriteBatchExecutor::bulkExecute+0x33 - batch_executor.cpp:754
      mongod!mongo::WriteBatchExecutor::executeBatch+0x3a4 - batch_executor.cpp:263
      mongod!mongo::WriteCmd::run+0x168 - write_commands.cpp:143
      mongod!mongo::_execCommand+0x33 - dbcommands.cpp:1160
      mongod!mongo::Command::execCommand+0xc50 - dbcommands.cpp:1374
      mongod!mongo::_runCommands+0x1ef - dbcommands.cpp:1450
      mongod!runCommands+0x23 - new_find.cpp:131
      mongod!mongo::newRunQuery+0xff9 - new_find.cpp:551
      mongod!receivedQuery+0x1f2 - instance.cpp:220
      mongod!mongo::assembleResponse+0x9d0 - instance.cpp:393
      mongod!mongo::MyMessageHandler::process+0xdf - db.cpp:185
      mongod!mongo::PortMessageServer::handleIncomingMsg+0x420 -
      message_server_port.cpp:234
      libpthread-2.18.so!start_thread+0xc2 - [Unknown]:[Unknown]
      libc-2.18.so!__clone+0x6c - [Unknown]:[Unknown]
      

      This is not an actual lock contention, but just long flush times. kaloian.manassiev suggests this could be made better by flushing more often and some other small optimizations.

            Assignee:
            kaloian.manassiev@mongodb.com Kaloian Manassiev
            Reporter:
            davide.italiano Davide Italiano
            Votes:
            0 Vote for this issue
            Watchers:
            8 Start watching this issue

              Created:
              Updated:
              Resolved: