Core Server / SERVER-16562

Extremely slow shutdown for WiredTiger (non-LSM)

    • Type: Bug
    • Resolution: Done
    • Priority: Major - P3
    • Fix Version/s: 2.8.0-rc4
    • Affects Version/s: 2.8.0-rc2
    • Component/s: Concurrency, Storage
    • Labels: None
    • Backwards Compatibility: Fully Compatible
    • Operating System: ALL

      command line for starting mongod:
      bin.tcm/mongod --config /data/mysql/mongo/mongo.conf --storageEngine wiredTiger

      configuration file (indentation reconstructed; the second journal block is
      assumed to nest under mmapv1):
      processManagement:
        fork: true
      systemLog:
        destination: file
        path: /data/mysql/mongo/log
        logAppend: true
      storage:
        dbPath: /data/mysql/mongo/data
        journal:
          enabled: true
        mmapv1:
          syncPeriodSecs: 60
          journal:
            commitIntervalMs: 100

      The test case is the insert benchmark, modified to insert 2B rows (200M rows per thread, 10 insert threads) with a 2000-byte char field.
      export MAX_ROWS=200000000
      export LENGTH_CHAR_FIELDS=2000
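
      The two exported variables above combine with the thread count to give the 2B-row total; a quick sanity check in shell (NUM_THREADS is a name assumed here for illustration, only MAX_ROWS and LENGTH_CHAR_FIELDS appear in the report):

```shell
# Benchmark sizing from the report; NUM_THREADS is a hypothetical
# variable name, not necessarily one the benchmark itself uses.
export MAX_ROWS=200000000          # rows per insert thread
export LENGTH_CHAR_FIELDS=2000     # bytes in the char field per document
NUM_THREADS=10

TOTAL_ROWS=$(( MAX_ROWS * NUM_THREADS ))
echo "total rows: $TOTAL_ROWS"     # 2000000000, i.e. 2B
```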


      I loaded 2B docs via the insert benchmark to create a 300G database last night. The load stopped more than 8 hours ago. This morning I ran "kill $( pidof mongod )" to shut down mongod, and the shutdown took 12 minutes.
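
      The shutdown sequence above can be sketched as follows. Plain `kill` sends SIGTERM, which mongod handles as a clean-shutdown request; the sketch times how long the process takes to exit. A background `sleep` stands in for mongod so the example is self-contained:

```shell
# Stand-in for mongod; in the report the target was $(pidof mongod).
sleep 300 &
pid=$!

start=$(date +%s)
kill "$pid"                      # SIGTERM, same as `kill $(pidof mongod)`
# `wait` works because the stand-in is a child of this shell; for a real
# mongod you would poll `kill -0 "$pid"` in a loop until it exits.
wait "$pid" 2>/dev/null || true
echo "shutdown took $(( $(date +%s) - start ))s"
```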

      This reproduces with the default (non-LSM) B-tree engine for WiredTiger; it does not reproduce with the LSM engine. This is the stack for the thread doing all of the work at shutdown. My host has very fast PCIe flash for storage.

      Looks like a long checkpoint to me:
      __free_update_list,__free_skip_list,__free_skip_array,__free_page_row_leaf,__wt_page_out,__wt_ref_out,__wt_rec_page_clean_update,__evict_file,__wt_cache_op,__wt_checkpoint_close,__wt_conn_btree_sync_and_close,__wt_conn_dhandle_discard_single,__wt_conn_dhandle_discard,__wt_connection_close,__conn_close,mongo::WiredTigerKVEngine::cleanShutdown,shutdownServer,mongo::exitCleanly,mongo::(anonymous,boost::(anonymous,start_thread,clone

      During the shutdown this error message was printed a few times:

      2014-12-16T07:59:02.232-0800 I - [TTLMonitor] LockerId 411893 has been waiting to acquire lock for more than 30 seconds. MongoDB will print the lock manager state and the stack of the thread that has been waiting, for diagnostic purposes. This message does not necessary imply that the server is experiencing an outage, but might be an indication of an overload.
      2014-12-16T07:59:02.232-0800 I - [TTLMonitor] Dumping LockManager @ 0x1ad5d20
      2014-12-16T07:59:02.232-0800 I - [TTLMonitor] Lock @ 0x2903580:

      {2305843009213693953: Global, 1}

      GRANTED:
      LockRequest 0 @ 0x7F9C8B743350: Mode = X; ConvertMode = NONE; EnqueueAtFront = 1; CompatibleFirst = 1;

      PENDING:
      LockRequest 411893 @ 0x2BC8A00: Mode = IS; ConvertMode = NONE; EnqueueAtFront = 0; CompatibleFirst = 0;
      2014-12-16T07:59:02.232-0800 I - [TTLMonitor]
      2014-12-16T07:59:02.235-0800 I - [TTLMonitor]
      0xf06559 0x99c74a 0x9a41cc 0x99d44b 0x993520 0x994ed0 0xd40012 0xd41392 0xe98ec0 0xf528f4 0x7f9c92c48fa8 0x7f9c91d2a5ad
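
      Since the report says the warning was printed "a few times", a quick way to count the occurrences is to grep the mongod log (the config above logs to /data/mysql/mongo/log). A self-contained sketch, with fabricated sample lines standing in for the real log file:

```shell
# Stand-in log; the real path from the config above is /data/mysql/mongo/log.
# The second line below is a fabricated sample, not from the report.
log=$(mktemp)
cat > "$log" <<'EOF'
2014-12-16T07:59:02.232-0800 I - [TTLMonitor] LockerId 411893 has been waiting to acquire lock for more than 30 seconds.
2014-12-16T08:01:12.501-0800 I - [TTLMonitor] LockerId 411901 has been waiting to acquire lock for more than 30 seconds.
EOF
count=$(grep -c "has been waiting to acquire lock" "$log")
echo "lock-wait warnings: $count"   # prints: lock-wait warnings: 2
rm -f "$log"
```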

            Assignee: Mark Benvenuto (mark.benvenuto@mongodb.com)
            Reporter: Mark Callaghan (mdcallag)
            Votes: 0
            Watchers: 15
