  1. Core Server
  2. SERVER-17293

Server crash setting wiredTigerEngineRuntimeConfig:"eviction=(threads_max=8)"

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Fixed
    • Affects Version/s: 3.0.0-rc8
    • Fix Version/s: 3.0.1, 3.1.0
    • Component/s: Storage
    • Labels:
    • Backwards Compatibility:
      Fully Compatible
    • Operating System:
      ALL
    • Backport Completed:

      Description

      Tried to set the wiredTigerEngineRuntimeConfig parameter to raise the maximum number of eviction threads to 8 on a secondary that was running under some load, and the server crashed with a segmentation fault. Will work on a more controlled repro.

      Server came up afterwards without issue.

      copy/paste of shell:

      [jmorales@mms-db-2.ny1 ~/queues.20150214.wt.secondary/eviction-threads-8]$ /var/lib/mongodb-mms-automation/mongodb-linux-x86_64-3.0.0-rc9-pre-2/bin/mongo --port 27800
      MongoDB shell version: 3.0.0-rc9-pre-
      connecting to: 127.0.0.1:27800/test
      queues:SECONDARY> db.adminCommand({setParameter:1, wiredTigerEngineRuntimeConfig:"eviction=(threads_max=8)"})
      2015-02-15T00:03:45.238+0000 I NETWORK  DBClientCursor::init call() failed
      2015-02-15T00:03:45.247+0000 E QUERY    Error: error doing query: failed
          at DBQuery._exec (src/mongo/shell/query.js:83:36)
          at DBQuery.hasNext (src/mongo/shell/query.js:240:10)
          at DBCollection.findOne (src/mongo/shell/collection.js:186:19)
          at DB.runCommand (src/mongo/shell/db.js:58:41)
          at DB.adminCommand (src/mongo/shell/db.js:66:41)
          at (shell):1:4 at src/mongo/shell/query.js:83

      mongod log:

      2015-02-15T00:03:34.192+0000 I STORAGE  [conn8644] Reconfiguring WiredTiger storage engine with config string: "eviction=(threads_max=8)"
      2015-02-15T00:03:35.041+0000 I COMMAND  [conn8596] command local.$cmd command: collStats { collStats: "oplog.rs", scale: undefined } keyUpdates:0 writeConflicts:0 numYields:0 reslen:4351 locks:{} 325ms
      2015-02-15T00:03:36.410+0000 I COMMAND  [conn8596] command local.$cmd command: collStats { collStats: "oplog.rs", scale: undefined } keyUpdates:0 writeConflicts:0 numYields:0 reslen:4351 locks:{} 366ms
      2015-02-15T00:03:36.455+0000 I COMMAND  [conn8498] command admin.$cmd command: serverStatus { serverStatus: 1, oplog: 1 } keyUpdates:0 writeConflicts:0 numYields:0 reslen:15355 locks:{} 176ms
      2015-02-15T00:03:37.370+0000 F -        Invalid access at address: 0
      2015-02-15T00:03:37.377+0000 F -        Got signal: 11 (Segmentation fault).
       
       0xf41659 0xf40cd2 0xf4102e 0x3f6ea0f710 0x1324060 0x3f6ea079d1 0x3f6e6e8b6d
      ----- BEGIN BACKTRACE -----
      {"backtrace":[{"b":"400000","o":"B41659"},{"b":"400000","o":"B40CD2"},{"b":"400000","o":"B4102E"},{"b":"3F6EA00000","o":"F710"},{"b":"400000","o":"F24060"},{"b":"3F6EA00000","o":"79D1"},{"b":"3F6E600000","o":"E8B6D"}],"processInfo":{ "mongodbVersion" : "3.0.0-rc9-pre-", "gitVersion" : "79492d9cc1885d74b31b5fe24194dbc227096d6e", "uname" : { "sysname" : "Linux", "release" : "2.6.32-431.5.1.el6.x86_64", "version" : "#1 SMP Wed Feb 12 00:41:43 UTC 2014", "machine" : "x86_64" }, "somap" : [ { "elfType" : 2, "b" : "400000" }, { "b" : "7FFF340EF000", "elfType" : 3 }, { "path" : "/lib64/libpthread.so.0", "elfType" : 3 }, { "path" : "/lib64/librt.so.1", "elfType" : 3 }, { "path" : "/lib64/libdl.so.2", "elfType" : 3 }, { "path" : "/usr/lib64/libstdc++.so.6", "elfType" : 3 }, { "path" : "/lib64/libm.so.6", "elfType" : 3 }, { "path" : "/lib64/libgcc_s.so.1", "elfType" : 3 }, { "path" : "/lib64/libc.so.6", "elfType" : 3 }, { "path" : "/lib64/ld-linux-x86-64.so.2", "elfType" : 3 } ] }}
       mongod(_ZN5mongo15printStackTraceERSo+0x29) [0xf41659]
       mongod(+0xB40CD2) [0xf40cd2]
       mongod(+0xB4102E) [0xf4102e]
       libpthread.so.0(+0xF710) [0x3f6ea0f710]
       mongod(+0xF24060) [0x1324060]
       libpthread.so.0(+0x79D1) [0x3f6ea079d1]
       libc.so.6(clone+0x6D) [0x3f6e6e8b6d]
      -----  END BACKTRACE  -----
      2015-02-15T00:03:46.026+0000 I CONTROL  ***** SERVER RESTARTED *****
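
      For anyone reading the backtrace above: each JSON frame pairs a "b" value with an "o" value, and adding the two reproduces the raw addresses printed on the line before BEGIN BACKTRACE. The sketch below checks that arithmetic in C with the values from this log. Reading "b" as a module load base and "o" as an offset into that module is an assumption that fits the numbers, not something the log states; the struct and function names are made up for illustration.

      ```c
      #include <assert.h>
      #include <stdint.h>
      #include <stdlib.h>

      /* One frame from the JSON backtrace; both fields are hex strings without "0x". */
      struct frame {
          const char *base;   /* "b": assumed to be the module load base */
          const char *offset; /* "o": assumed to be the offset inside that module */
      };

      /* The seven frames, copied from the log in this ticket. */
      const struct frame frames[] = {
          {"400000",     "B41659"},
          {"400000",     "B40CD2"},
          {"400000",     "B4102E"},
          {"3F6EA00000", "F710"},
          {"400000",     "F24060"},
          {"3F6EA00000", "79D1"},
          {"3F6E600000", "E8B6D"},
      };

      /* Parse both hex strings and return base + offset. */
      uint64_t frame_address(const struct frame *f)
      {
          assert(f != NULL);
          return strtoull(f->base, NULL, 16) + strtoull(f->offset, NULL, 16);
      }
      ```

      Calling frame_address(&frames[4]) gives 0x1324060, the unsymbolized mongod frame that sits directly above the signal handler frames in the trace.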

        Activity

        pasette Dan Pasette added a comment -

        Eitan Klein also repro'd this last night after doing a long stress test run. Looks like the evict_worker thread is being initialized with a NULL session pointer.

         

        __evict_worker(void *arg)
        {
                WT_CACHE *cache;
                WT_CONNECTION_IMPL *conn;
                WT_DECL_RET;
                WT_EVICT_WORKER *worker;
                WT_SESSION_IMPL *session;

                worker = arg;
                session = worker->session;
                conn = S2C(session);
                cache = conn->cache; <===

        0:078> k
        Child-SP RetAddr Call Site
        0000003e`3a1afb30 00007ffe`859915cd mongod!__evict_worker+0x1a [c:\data\mci\shell\src\src\third_party\wiredtiger\src\evict\evict_lru.c @ 355]
        0000003e`3a1afb60 00007ffe`877e43d1 KERNEL32!BaseThreadInitThunk+0xd
        0000003e`3a1afb90 00000000`00000000 ntdll!RtlUserThreadStart+0x1d

        pasette Dan Pasette added a comment - - edited

        Eitan Klein also repro'd this last night after doing a long stress test run against RC8.
        Looks like the evict_worker thread is being initialized with a NULL session pointer.

        __evict_worker(void *arg)
        {
                WT_CACHE *cache;
                WT_CONNECTION_IMPL *conn;
                WT_DECL_RET;
                WT_EVICT_WORKER *worker;
                WT_SESSION_IMPL *session;
         
                worker = arg;
                session = worker->session;
                conn = S2C(session);
                cache = conn->cache;  <== null pointer
         
        Child-SP          RetAddr           Call Site
        0000003e`3a1afb30 00007ffe`859915cd mongod!__evict_worker+0x1a [c:\data\mci\shell\src\src\third_party\wiredtiger\src\evict\evict_lru.c @ 355]
        0000003e`3a1afb60 00007ffe`877e43d1 KERNEL32!BaseThreadInitThunk+0xd
        0000003e`3a1afb90 00000000`00000000 ntdll!RtlUserThreadStart+0x1d
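
        The diagnosis above comes down to a worker thread reading worker->session before a valid session has been stored there. The following is a generic pthreads sketch of that failure mode and one way to guard against it. It is not WiredTiger code and not the change made in the fix; every type and function name in it is hypothetical. It assumes the guard belongs in the code that starts the thread: store the session first, and return an error instead of spawning a thread that would dereference NULL.

        ```c
        #include <assert.h>
        #include <errno.h>
        #include <pthread.h>
        #include <stddef.h>

        /* Hypothetical stand-ins for a session and an eviction worker. */
        struct session {
            int cache_pages;
        };

        struct worker {
            struct session *session; /* must be set before the thread starts */
            pthread_t tid;
            int result;
        };

        /* Thread entry point, shaped like the excerpt above: arg -> worker -> session. */
        static void *worker_run(void *arg)
        {
            struct worker *w = arg;

            /* worker_start() never creates a thread with a NULL session. */
            assert(w->session != NULL);
            w->result = w->session->cache_pages;
            return NULL;
        }

        /* Start a worker. Returns EINVAL rather than launching a thread that would crash. */
        int worker_start(struct worker *w, struct session *s)
        {
            if (w == NULL || s == NULL)
                return EINVAL;
            w->session = s; /* assign before pthread_create so the thread sees it */
            w->result = 0;
            return pthread_create(&w->tid, NULL, worker_run, w);
        }

        /* Wait for the worker to finish. */
        int worker_join(struct worker *w)
        {
            return pthread_join(w->tid, NULL);
        }
        ```

        With this ordering, a caller that has no session yet gets an error code back from worker_start, where the excerpt above would hit an invalid access at address 0 inside the new thread.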

        michael.cahill Michael Cahill added a comment -

        A fix is in review at https://github.com/wiredtiger/wiredtiger/pull/1680
