- Type: Bug
- Resolution: Won't Fix
- Priority: Minor - P4
- None
- Affects Version/s: None
- Component/s: LSM
- 5
A run of test/format hung while running compact and alter commands concurrently on an LSM tree. The two relevant call stacks are:
Thread 3 (Thread 0x7f7776f7f700 (LWP 15008)):
#0 __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
#1 0x00007f7790afad02 in _L_lock_791 () from /lib64/libpthread.so.0
#2 0x00007f7790afac08 in __GI___pthread_mutex_lock (mutex=0x62c000000800) at pthread_mutex_lock.c:64
#3 0x0000000000629750 in __wt_spin_lock (session=0x7f779156c0c0, t=0x62c000000800) at ../src/include/mutex.i:173
#4 0x000000000061a3c5 in __lsm_tree_close (session=0x7f779156c0c0, lsm_tree=0x615000000800, final=false) at ../src/lsm/lsm_tree.c:135
#5 0x000000000061fb08 in __lsm_tree_find (session=0x7f779156c0c0, uri=0x6190001e0600 "lsm:wt", exclusive=true, treep=0x7f7776f7dea0) at ../src/lsm/lsm_tree.c:430
#6 0x000000000061e587 in __wt_lsm_tree_get (session=0x7f779156c0c0, uri=0x6190001e0600 "lsm:wt", exclusive=true, treep=0x7f7776f7dea0) at ../src/lsm/lsm_tree.c:578
#7 0x0000000000628bf7 in __wt_lsm_tree_worker (session=0x7f779156c0c0, uri=0x6190001e0600 "lsm:wt", file_func=0xacf630 <__alter_file>, name_func=0x0, cfg=0x7f7776f7ec40, open_flags=336) at ../src/lsm/lsm_tree.c:1386
#8 0x0000000000acf574 in __schema_alter (session=0x7f779156c0c0, uri=0x6190001e0600 "lsm:wt", newcfg=0x7f7776f7ec40) at ../src/schema/schema_alter.c:208
#9 0x0000000000acfe44 in __alter_tree (session=0x7f779156c0c0, name=0x6020000039f0 "colgroup:wt", newcfg=0x7f7776f7ec40) at ../src/schema/schema_alter.c:116
#10 0x0000000000ad08b1 in __alter_table (session=0x7f779156c0c0, uri=0x611000000040 "table:wt", newcfg=0x7f7776f7ec40) at ../src/schema/schema_alter.c:166
#11 0x0000000000acf606 in __schema_alter (session=0x7f779156c0c0, uri=0x611000000040 "table:wt", newcfg=0x7f7776f7ec40) at ../src/schema/schema_alter.c:211
#12 0x0000000000acf281 in __wt_schema_alter (session=0x7f779156c0c0, uri=0x611000000040 "table:wt", newcfg=0x7f7776f7ec40) at ../src/schema/schema_alter.c:227
#13 0x00000000007048cd in __session_alter (wt_session=0x7f779156c0c0, uri=0x611000000040 "table:wt", config=0x7f7776f7ed60 "access_pattern_hint=random") at ../src/session/session_api.c:689
The alter command is doing:
WT_WITHOUT_LOCKS(session, __wt_lsm_manager_clear_tree(session, lsm_tree));
Given the stack trace, the alter thread must be in the lock re-acquisition phase of the WT_WITHOUT_LOCKS macro.
The compact thread is blocked waiting for a read lock on the data handle:
Thread 2 (Thread 0x7f777573a700 (LWP 15011)):
#0 pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
#1 0x0000000000660246 in __wt_cond_wait_signal (session=0x7f779156e280, cond=0x60c0000025c0, usecs=10000, run_func=0x76a8b0 <__read_blocked>, signalled=0x7f7775738780) at ../src/os_posix/os_mtx_cond.c:122
#2 0x000000000076a636 in __wt_cond_wait (session=0x7f779156e280, cond=0x60c0000025c0, usecs=10000, run_func=0x76a8b0 <__read_blocked>) at ../src/include/misc.i:19
#3 0x0000000000769fe0 in __wt_readlock (session=0x7f779156e280, l=0x616000003380) at ../src/support/mtx_rw.c:257
#4 0x000000000074c963 in __wt_session_lock_dhandle (session=0x7f779156e280, flags=0, is_deadp=0x7f7775738ee0) at ../src/session/session_dhandle.c:183
#5 0x000000000074fd17 in __wt_session_get_dhandle (session=0x7f779156e280, uri=0x611000000040 "table:wt", checkpoint=0x0, cfg=0x0, flags=0) at ../src/session/session_dhandle.c:510
#6 0x00000000006dc357 in __wt_schema_get_table_uri (session=0x7f779156e280, uri=0x611000000040 "table:wt", ok_incomplete=false, flags=0, tablep=0x7f77757392a0) at ../src/schema/schema_list.c:28
#7 0x00000000006f57cf in __wt_schema_worker (session=0x7f779156e280, uri=0x611000000040 "table:wt", file_func=0x749180 <__compact_handle_append>, name_func=0x7497b0 <__compact_uri_analyze>, cfg=0x7f7775739c20, open_flags=0) at ../src/schema/schema_worker.c:97
#8 0x000000000074773c in __wt_session_compact (wt_session=0x7f779156e280, uri=0x611000000040 "table:wt", config=0x0) at ../src/session/session_compact.c:409
#9 0x0000000000518d9f in compact (arg=0x0) at ../../../test/format/compact.c:74
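The two stacks are consistent with a lock-order inversion, although this report does not say which lock the compact thread already holds or which lock alter holds on the table. The C sketch below replays one plausible interleaving with toy locks, under those assumptions: alter takes an outer lock and the table handle, drops only the outer lock while it runs the unlocked call, and compact gets in before alter re-acquires it. The names toy_lock, toy_try_lock, toy_unlock and replay_alter_vs_compact, and the choice of exactly two locks, are hypothetical and are not WiredTiger code.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy exclusive lock: records which session (if any) currently holds it. */
struct toy_lock {
    int owner; /* 0 = free, otherwise the id of the holding session */
};

enum { ALTER = 1, COMPACT = 2 }; /* hypothetical session ids */

/* Try to take a lock; return false where a real thread would block. */
static bool
toy_try_lock(struct toy_lock *l, int session_id)
{
    if (l->owner != 0)
        return (false);
    l->owner = session_id;
    return (true);
}

/* Release a lock; only the current holder may do so. */
static void
toy_unlock(struct toy_lock *l, int session_id)
{
    assert(l->owner == session_id);
    l->owner = 0;
}

/*
 * Replay the assumed interleaving step by step. Returns true if it ends
 * with each session waiting for a lock that the other session holds.
 */
static bool
replay_alter_vs_compact(void)
{
    struct toy_lock outer = {0}; /* assumed: the spin lock alter re-acquires */
    struct toy_lock table = {0}; /* assumed: the "table:wt" handle lock */
    bool alter_blocked, compact_blocked;

    /* Alter starts: outer lock first, then the table handle. */
    if (!toy_try_lock(&outer, ALTER) || !toy_try_lock(&table, ALTER))
        return (false);

    /* Alter drops the outer lock for the unlocked call, keeping the table. */
    toy_unlock(&outer, ALTER);

    /* Compact gets in: takes the outer lock, then wants the table handle. */
    if (!toy_try_lock(&outer, COMPACT))
        return (false);
    compact_blocked = !toy_try_lock(&table, COMPACT);

    /* Alter finishes the unlocked call and tries to re-acquire the outer lock. */
    alter_blocked = !toy_try_lock(&outer, ALTER);

    /* Deadlock: a wait-for cycle between the two sessions. */
    return (alter_blocked && compact_blocked &&
        outer.owner == COMPACT && table.owner == ALTER);
}
```

Calling replay_alter_vs_compact() reports whether the replayed schedule ends in a wait-for cycle. If this reading is right, the general hazard is that releasing and re-acquiring an outer lock while still holding an inner one lets another thread acquire the two in the opposite effective order.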
The job that hung was:
http://build.wiredtiger.com:8080/job/wiredtiger-test-format-stress-sanitizer/19916/