Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-19854

Mongod failed to open connection, remained in hung state, when running WT, during recovery

    • ALL
    • Hide

      Command to run this test:

      powertest.sh -s <ipaddr>  -c 20 -v 3 -D -A -N 10 -l 3  -F <mongo root> -R -i 10000000
      
      Show
      Command to run this test: powertest.sh -s <ipaddr> -c 20 -v 3 -D -A -N 10 -l 3 -F <mongo root> -R -i 10000000

      The powercycle test was applied to WiredTiger, with a seed of 10M documents. After 1 loop of start/crash/start, the connection was not made available, with mongod still active, during recovery phase.

      Server log:

      2015-08-10T15:59:41.879-0400 I CONTROL  [main] ***** SERVER RESTARTED *****
      2015-08-10T15:59:41.955-0400 W -        [initandlisten] Detected unclean shutdown - /home/jonathan/pt-317-wt/data/wiredTiger/mongod.lock is not empty.
      2015-08-10T15:59:41.955-0400 W STORAGE  [initandlisten] Recovering data from the last clean checkpoint.
      2015-08-10T15:59:41.955-0400 I STORAGE  [initandlisten] wiredtiger_open config: create,cache_size=1G,session_max=20000,eviction=(threads_max=4),statistics=(fast),log=(enabled=true,archive=true,path=journal,compressor=snappy),file_manager=(close_idle_time=100000),checkpoint=(wait=60,log_size=2GB),statistics_log=(wait=0),
      

      Mongod threads:

      jonathan@CAP-JA-Ubuntu14:~$ sudo gdb -p 1817 -batch -ex "thread apply all bt"
      [New LWP 7290]
      [New LWP 1819]
      [New LWP 1818]
      [Thread debugging using libthread_db enabled]
      Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
      0x00007f35c294acc3 in pread64 () at ../sysdeps/unix/syscall-template.S:81
      81	../sysdeps/unix/syscall-template.S: No such file or directory.
      
      Thread 4 (Thread 0x7f35c22ac700 (LWP 1818)):
      #0  0x00007f35c294b0d1 in do_sigwait (sig=0x7f35c22ab8fc, set=<optimized out>) at ../nptl/sysdeps/unix/sysv/linux/../../../../../sysdeps/unix/sysv/linux/sigwait.c:60
      #1  __sigwait (set=0x24bf980 <mongo::(anonymous namespace)::asyncSignals>, sig=0x7f35c22ab8fc) at ../nptl/sysdeps/unix/sysv/linux/../../../../../sysdeps/unix/sysv/linux/sigwait.c:97
      #2  0x00000000011d86b6 in mongo::(anonymous namespace)::signalProcessingThread() ()
      #3  0x00000000019cf600 in execute_native_thread_routine ()
      #4  0x00007f35c2943182 in start_thread (arg=0x7f35c22ac700) at pthread_create.c:312
      #5  0x00007f35c267047d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
      
      Thread 3 (Thread 0x7f35c1aab700 (LWP 1819)):
      #0  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
      #1  0x000000000192e4d6 in __wt_cond_wait ()
      #2  0x0000000001912fb6 in __evict_server ()
      #3  0x00007f35c2943182 in start_thread (arg=0x7f35c1aab700) at pthread_create.c:312
      #4  0x00007f35c267047d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
      
      Thread 2 (Thread 0x7f35c12aa700 (LWP 7290)):
      #0  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
      #1  0x000000000192e4d6 in __wt_cond_wait ()
      #2  0x0000000001912d22 in __evict_worker ()
      #3  0x00007f35c2943182 in start_thread (arg=0x7f35c12aa700) at pthread_create.c:312
      #4  0x00007f35c267047d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
      
      Thread 1 (Thread 0x7f35c398fcc0 (LWP 1817)):
      #0  0x00007f35c294acc3 in pread64 () at ../sysdeps/unix/syscall-template.S:81
      #1  0x000000000192f886 in __wt_read ()
      #2  0x00000000018ad72e in __wt_block_read_off ()
      #3  0x00000000018ad9c4 in __wt_bm_read ()
      #4  0x00000000018c7f56 in __wt_bt_read ()
      #5  0x00000000018cce88 in __wt_cache_read ()
      #6  0x00000000018c9f2b in __wt_page_in_func ()
      #7  0x00000000018e6209 in __wt_row_search ()
      #8  0x00000000018bfb36 in __wt_btcur_remove ()
      #9  0x00000000018fd96b in __curfile_remove ()
      #10 0x0000000001968a90 in __txn_log_recover ()
      #11 0x0000000001918337 in __wt_log_scan ()
      #12 0x0000000001969556 in __wt_txn_recover ()
      #13 0x00000000018f5896 in __wt_connection_workers ()
      #14 0x00000000018ef4ae in wiredtiger_open ()
      #15 0x0000000000f79157 in mongo::WiredTigerKVEngine::WiredTigerKVEngine(std::string const&, std::string const&, bool, bool) ()
      #16 0x0000000000f7653e in mongo::(anonymous namespace)::WiredTigerFactory::create(mongo::StorageGlobalParams const&, mongo::StorageEngineLockFile const&) const ()
      #17 0x0000000000e9f339 in mongo::ServiceContextMongoD::initializeGlobalStorageEngine() ()
      #18 0x00000000008d73b0 in mongo::initAndListen(int) ()
      #19 0x00000000008dc044 in main ()
      

            Assignee:
            Unassigned Unassigned
            Reporter:
            jonathan.abrahams Jonathan Abrahams
            Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

              Created:
              Updated:
              Resolved: