I was able to reproduce the medium-multi-lsm "hang" on the AWS HDD instance on the first attempt. It is essentially a livelock. I started the test, it ran for 70 seconds, inserting about 21M of 50M entries. Then it got 0 inserts for the next 20 minutes (while I was off doing something else). Taking it into the debugger, I see:
Several threads waiting on the schema lock
Thread holding the schema lock waiting on the dhandle lock
Eviction thread holding the dhandle lock walking the tree
Thread 14 (Thread 0x7fe27ab29700 (LWP 8175)): #0 __wt_tree_walk (session=0xe11090, refp=0x7fe240074648, flags=29) at ../src/btree/bt_walk.c:205 WT-1 0x000000000045d28f in __evict_walk_file (session=0xe11090, slotp=0x7fe27ab28d7c, flags=2) at ../src/btree/bt_evict.c:999 WT-2 0x000000000045cd89 in __evict_walk (session=0xe11090, entriesp=0x7fe27ab28dec, flags=2) at ../src/btree/bt_evict.c:906 WT-3 0x000000000045c7bb in __evict_lru (session=0xe11090, flags=2) at ../src/btree/bt_evict.c:733 WT-4 0x000000000045c18b in __evict_pass (session=0xe11090) at ../src/btree/bt_evict.c:483 WT-5 0x000000000045b391 in __evict_server (arg=0xe11090) at ../src/btree/bt_evict.c:162
I set a breakpoint up in the eviction stack, and running with breakpoints in the debugger allowed everything to make progress. If I disabled the breakpoint, it ran another 55 seconds until it hit the exact same situation. Again, stepping cleared it up and eventually it was able to finish the populate phase and complete the test.
Although in earlier email I said I thought I saw it before the merge, I am not able to reproduce it with earlier changesets (I tried master and 83cfd87, the merge of the fallocate refactoring 2 days ago). I must've been mistaken.
My theory: The cache gets full and eviction starts running. It ends up continuously looping in *evict_pass. It is able to evict a little, so WT_EVICT_NO_PROGRESS is not set. The dhandle_lock is held for the entire duration of *evict_walk. I suspect that once we get into this state, the eviction thread releases and re-acquires the dhandle lock so that every time the thread waiting for it gets to run, it is held.
Here are the other stacks:
Thread 8 (Thread 0x7fe277b23700 (LWP 8181)): #0 0x00007fe27ba28265 in __lll_lock_wait () from /lib64/libpthread.so.0 WT-1 0x00007fe27ba23dc1 in _L_lock_816 () from /lib64/libpthread.so.0 WT-2 0x00007fe27ba23cc7 in pthread_mutex_lock () from /lib64/libpthread.so.0 WT-3 0x0000000000445af4 in __wt_spin_lock (session=0xe12210, t=0xe0f8a0) at ../src/include/mutex.i:174 WT-4 0x00000000004466db in __wt_session_get_btree (session=0xe12210, uri=0x7fe2384d9480 "file:test01-000017.lsm", checkpoint=0x7fe238028c50 "WiredTigerCheckpoint.1", cfg=0x7fe277b22da0, flags=0) at ../src/session/session_dhandle.c:396 WT-5 0x000000000044625b in __wt_session_get_btree_ckpt (session=0xe12210, uri=0x7fe2384d9480 "file:test01-000017.lsm", cfg=0x7fe277b22da0, flags=0) at ../src/session/session_dhandle.c:264 WT-6 0x0000000000491577 in __wt_curfile_open (session=0xe12210, uri=0x7fe2384d9480 "file:test01-000017.lsm", owner=0x0, cfg=0x7fe277b22da0, cursorp=0x7fe277b22e38) at ../src/cursor/cur_file.c:460 WT-7 0x00000000004415c1 in __wt_open_cursor (session=0xe12210, uri=0x7fe2384d9480 "file:test01-000017.lsm", owner=0x0, cfg=0x7fe277b22da0, cursorp=0x7fe277b22e38) at ../src/session/session_api.c:275 WT-8 0x00000000004a7c75 in __wt_lsm_merge (session=0xe12210, lsm_tree=0xeeb410, id=2) at ../src/lsm/lsm_merge.c:433 Thread 7 (Thread 0x7fe277322700 (LWP 8182)): #0 0x00007fe27ba28265 in __lll_lock_wait () from /lib64/libpthread.so.0 WT-1 0x00007fe27ba23dc1 in _L_lock_816 () from /lib64/libpthread.so.0 WT-2 0x00007fe27ba23cc7 in pthread_mutex_lock () from /lib64/libpthread.so.0 WT-3 0x00000000004a9c6e in __wt_spin_lock (session=0xe12490, t=0xe0f960) at ../src/include/mutex.i:174 WT-4 0x00000000004aa7bb in __wt_lsm_checkpoint_chunk (session=0xe12490, lsm_tree=0xe4e200, chunk=0x7fe240069130) at ../src/lsm/lsm_work_unit.c:290 WT-5 0x0000000000429242 in __lsm_worker_general_op (session=0xe12490, cookie=0xe10568, completed=0x7fe277321ee0) at ../src/lsm/lsm_worker.c:64 Thread 6 (Thread 0x7fe276b21700 (LWP 8183)): #0 0x00007fe27ba28265 in __lll_lock_wait () from /lib64/libpthread.so.0 WT-1 0x00007fe27ba23dc1 in _L_lock_816 () from /lib64/libpthread.so.0 WT-2 0x00007fe27ba23cc7 in pthread_mutex_lock () from /lib64/libpthread.so.0 WT-3 0x0000000000445af4 in __wt_spin_lock (session=0xe12710, t=0xe0f8a0) at ../src/include/mutex.i:174 WT-4 0x00000000004466db in __wt_session_get_btree (session=0xe12710, uri=0x7fe2540d13b0 "file:test00-000016.lsm", checkpoint=0x7fe2540d0b30 "WiredTigerCheckpoint.1", cfg=0x7fe276b20da0, flags=0) at ../src/session/session_dhandle.c:396 WT-5 0x000000000044625b in __wt_session_get_btree_ckpt (session=0xe12710, uri=0x7fe2540d13b0 "file:test00-000016.lsm", cfg=0x7fe276b20da0, flags=0) at ../src/session/session_dhandle.c:264 WT-6 0x0000000000491577 in __wt_curfile_open (session=0xe12710, uri=0x7fe2540d13b0 "file:test00-000016.lsm", owner=0x0, cfg=0x7fe276b20da0, cursorp=0x7fe276b20e38) at ../src/cursor/cur_file.c:460 WT-7 0x00000000004415c1 in __wt_open_cursor (session=0xe12710, uri=0x7fe2540d13b0 "file:test00-000016.lsm", owner=0x0, cfg=0x7fe276b20da0, cursorp=0x7fe276b20e38) at ../src/session/session_api.c:275 WT-8 0x00000000004a7c75 in __wt_lsm_merge (session=0xe12710, lsm_tree=0xe4e200, id=4) at ../src/lsm/lsm_merge.c:433 Thread 5 (Thread 0x7fe276320700 (LWP 8184)): #0 0x00007fe27ba28265 in __lll_lock_wait () from /lib64/libpthread.so.0 WT-1 0x00007fe27ba23dc1 in _L_lock_816 () from /lib64/libpthread.so.0 WT-2 0x00007fe27ba23cc7 in pthread_mutex_lock () from /lib64/libpthread.so.0 WT-3 0x0000000000445af4 in __wt_spin_lock (session=0xe12990, t=0xe0f8a0) at ../src/include/mutex.i:174 WT-4 0x00000000004466db in __wt_session_get_btree (session=0xe12990, uri=0x7fe240010130 "file:test03-000021.lsm", checkpoint=0x0, cfg=0x0, flags=0) at ../src/session/session_dhandle.c:396 WT-5 0x000000000044625b in __wt_session_get_btree_ckpt (session=0xe12990, uri=0x7fe240010130 "file:test03-000021.lsm", cfg=0x0, flags=0) at ../src/session/session_dhandle.c:264 WT-6 0x000000000043fc3a in __wt_schema_worker (session=0xe12990, uri=0x7fe240010130 "file:test03-000021.lsm", file_func=0x44fd38 <__wt_checkpoint>, name_func=0x0, cfg=0x0, open_flags=0) at ../src/schema/schema_worker.c:57 WT-7 0x00000000004aa7fd in __wt_lsm_checkpoint_chunk (session=0xe12990, lsm_tree=0xeef010, chunk=0x7fe24002ae70) at ../src/lsm/lsm_work_unit.c:290 Thread 4 (Thread 0x7fe275b1f700 (LWP 8185)): #0 0x00007fe27ba28265 in __lll_lock_wait () from /lib64/libpthread.so.0 WT-1 0x00007fe27ba23dc1 in _L_lock_816 () from /lib64/libpthread.so.0 WT-2 0x00007fe27ba23cc7 in pthread_mutex_lock () from /lib64/libpthread.so.0 WT-3 0x0000000000445af4 in __wt_spin_lock (session=0xe12c10, t=0xe0f8a0) at ../src/include/mutex.i:174 WT-4 0x00000000004466db in __wt_session_get_btree (session=0xe12c10, uri=0x7fe24cc250e0 "file:test03-000016.lsm", checkpoint=0x7fe24cc248a0 "WiredTigerCheckpoint.1", cfg=0x7fe275b1eda0, flags=0) at ../src/session/session_dhandle.c:396 WT-5 0x000000000044625b in __wt_session_get_btree_ckpt (session=0xe12c10, uri=0x7fe24cc250e0 "file:test03-000016.lsm", cfg=0x7fe275b1eda0, flags=0) at ../src/session/session_dhandle.c:264 WT-6 0x0000000000491577 in __wt_curfile_open (session=0xe12c10, uri=0x7fe24cc250e0 "file:test03-000016.lsm", owner=0x0, cfg=0x7fe275b1eda0, cursorp=0x7fe275b1ee38) at ../src/cursor/cur_file.c:460 WT-7 0x00000000004415c1 in __wt_open_cursor (session=0xe12c10, uri=0x7fe24cc250e0 "file:test03-000016.lsm", owner=0x0, cfg=0x7fe275b1eda0, cursorp=0x7fe275b1ee38) at ../src/session/session_api.c:275 WT-8 0x00000000004a7c75 in __wt_lsm_merge (session=0xe12c10, lsm_tree=0xeef010, id=6) at ../src/lsm/lsm_merge.c:433 Thread 3 (Thread 0x7fe27531e700 (LWP 8186)): #0 0x00007fe27ba28265 in __lll_lock_wait () from /lib64/libpthread.so.0 WT-1 0x00007fe27ba23dc1 in _L_lock_816 () from /lib64/libpthread.so.0 WT-2 0x00007fe27ba23cc7 in pthread_mutex_lock () from /lib64/libpthread.so.0 WT-3 0x00000000004a9c6e in __wt_spin_lock (session=0xe12e90, t=0xe0f960) at ../src/include/mutex.i:174 WT-4 0x00000000004aa7bb in __wt_lsm_checkpoint_chunk (session=0xe12e90, lsm_tree=0xeed2f0, chunk=0x7fe2400799b0) at ../src/lsm/lsm_work_unit.c:290 Thread 2 (Thread 0x7fe274b1d700 (LWP 8187)): #0 0x00007fe27ba28265 in __lll_lock_wait () from /lib64/libpthread.so.0 WT-1 0x00007fe27ba23dc1 in _L_lock_816 () from /lib64/libpthread.so.0 WT-2 0x00007fe27ba23cc7 in pthread_mutex_lock () from /lib64/libpthread.so.0 WT-3 0x00000000004a049c in __wt_spin_lock (session=0xe11a90, t=0xe0f960) at ../src/include/mutex.i:174 WT-4 0x00000000004a13de in __clsm_enter (clsm=0x7fe268001ae0, reset=0, update=1) at ../src/lsm/lsm_cursor.c:204 WT-5 0x00000000004a540a in __clsm_insert (cursor=0x7fe268001ae0) at ../src/lsm/lsm_cursor.c:1330 WT-6 0x0000000000406cc9 in populate_thread (arg=0xed3050) at ../../../bench/wtperf/wtperf.c:835
- is related to
-
WT-1400 Fix some issues with the recent changes to handle locking
- Closed
- related to
-
WT-1 placeholder WT-1
- Closed
-
WT-2 What does metadata look like?
- Closed
-
WT-3 What file formats are required?
- Closed
-
WT-4 Flexible cursor traversals
- Closed
-
WT-5 How does pget work: is it necessary?
- Closed
-
WT-6 Complex schema example
- Closed
-
WT-7 Do we need the handle->err/errx methods?
- Closed
-
WT-8 Do we need table load, bulk-load and/or dump methods?
- Closed