-
Type: Bug
-
Resolution: Fixed
-
Priority: Major - P3
-
Affects Version/s: None
-
Component/s: None
I ran test/format on a small (100MB) file system to force ENOSPC. My configuration had logging enabled (i.e., logging=1 in CONFIG). After running out of space, format hung. First it printed the following WT error messages:
[1598548238:109309][17939:0x7ff8e1aa8700], WT_SESSION.commit_transaction: __posix_std_fallocate, 58: /mnt/test/RUNDIR/./WiredTigerLog.0000000008: fallocate:: No space left on device [1598548238:609985][17939:0x7ff8e1aa8700], WT_SESSION.commit_transaction: __posix_sys_fallocate, 75: /mnt/test/RUNDIR/./WiredTigerLog.0000000008: fallocate:: No space left on device [1598548239:110630][17939:0x7ff8e1aa8700], WT_SESSION.commit_transaction: __posix_posix_fallocate, 92: /mnt/test/RUNDIR/./WiredTigerLog.0000000008: fallocate:: No space left on device [1598548240:99223][17939:0x7ff8e12a7700], WT_SESSION.commit_transaction: __posix_open_file, 804: /mnt/test/RUNDIR/./WiredTigerTmplog.0000000008: handle-open: open: No space left on device
I had run this test 6-10 times before this without seeing this problem. I think (but I'm not certain) this failure was the first time I saw a "No space left on device" error from opening a file.
Attaching with GDB shows that all of the threads from format are in commit_transaction(). Most appear blocked in __wt_log_slot_join():
#0 0x00007ff8e735713f in __GI___select (nfds=nfds@entry=0, readfds=readfds@entry=0x0,
writefds=writefds@entry=0x0, exceptfds=exceptfds@entry=0x0,
timeout=timeout@entry=0x7ff8e2aa9a00) at ../sysdeps/unix/sysv/linux/select.c:41
#1 0x00000000004a2941 in __wt_sleep (seconds=seconds@entry=0,
micro_seconds=micro_seconds@entry=1000) at ../src/os_posix/os_sleep.c:30
#2 0x000000000048883c in __wt_log_slot_join (session=session@entry=0x1b38760,
mysize=mysize@entry=128, flags=flags@entry=0, myslot=myslot@entry=0x7ff8e2aa9b50)
at ../src/log/log_slot.c:593
#3 0x000000000048423b in __log_write_internal (flags=0, lsnp=0x0, record=0x3e85650,
session=0x1b38760) at ../src/log/log.c:2666
#4 __wt_log_write (session=session@entry=0x1b38760, record=<optimized out>, lsnp=lsnp@entry=0x0,
flags=0) at ../src/log/log.c:2569
#5 0x00000000004fa57a in __wt_txn_log_commit (session=session@entry=0x1b38760,
cfg=cfg@entry=0x7ff8e2aa9cb0) at ../src/txn/txn_log.c:325
#6 0x00000000004f30c1 in __wt_txn_commit (session=session@entry=0x1b38760,
cfg=cfg@entry=0x7ff8e2aa9cb0) at ../src/txn/txn.c:1390
#7 0x00000000004cf49a in __session_commit_transaction (wt_session=0x1b38760,
config=<optimized out>) at ../src/session/session_api.c:1652
#8 0x00000000004114fe in commit_transaction (tinfo=tinfo@entry=0x3bb6000,
prepared=prepared@entry=false) at ../../../test/format/ops.c:550
#9 0x0000000000413c71 in ops (arg=0x3bb6000) at ../../../test/format/ops.c:1086
#10 0x00007ff8e81646db in start_thread (arg=0x7ff8e2aaa700) at pthread_create.c:463
#11 0x00007ff8e7361a3f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
One thread, however, is in __log_newfile() where I'm guessing it is running into trouble creating a new log file (or finding the next preallocated one) due to the ENOSPC condition:
#0 0x00007ff8e7343f37 in sched_yield () at ../sysdeps/unix/syscall-template.S:78 #1 0x00000000004a2d08 in __wt_yield () at ../src/os_posix/os_yield.c:25 #2 0x000000000047e9f1 in __log_newfile (session=session@entry=0x1b39330, conn_open=conn_open@entry=false, created=created@entry=0x7ff8e12a69cf) at ../src/log/log.c:1147 #3 0x000000000047ef5f in __wt_log_acquire (session=session@entry=0x1b39330, recsize=<optimized out>, slot=slot@entry=0x1b9d730) at ../src/log/log.c:1348 #4 0x0000000000488fba in __log_slot_new (session=0x1b39330) at ../src/log/log_slot.c:268 #5 __log_slot_switch_internal (session=session@entry=0x1b39330, myslot=myslot@entry=0x7ff8e12a6b50, forced=forced@entry=false, did_work=did_work@entry=0x0) at ../src/log/log_slot.c:364 #6 0x00000000004896f2 in __wt_log_slot_switch (session=session@entry=0x1b39330, myslot=myslot@entry=0x7ff8e12a6b50, retry=retry@entry=true, forced=forced@entry=false, did_work=did_work@entry=0x0) at ../src/log/log_slot.c:405 #7 0x000000000048442c in __log_write_internal (flags=0, lsnp=0x0, record=0x8495020, session=0x1b39330) at ../src/log/log.c:2673 #8 __wt_log_write (session=session@entry=0x1b39330, record=<optimized out>, lsnp=lsnp@entry=0x0, flags=0) at ../src/log/log.c:2569 #9 0x00000000004fa57a in __wt_txn_log_commit (session=session@entry=0x1b39330, cfg=cfg@entry=0x7ff8e12a6cb0) at ../src/txn/txn_log.c:325 #10 0x00000000004f30c1 in __wt_txn_commit (session=session@entry=0x1b39330, cfg=cfg@entry=0x7ff8e12a6cb0) at ../src/txn/txn.c:1390 #11 0x00000000004cf49a in __session_commit_transaction (wt_session=0x1b39330, config=<optimized out>) at ../src/session/session_api.c:1652 #12 0x00000000004114fe in commit_transaction (tinfo=tinfo@entry=0x3bb8100, prepared=prepared@entry=false) at ../../../test/format/ops.c:550 #13 0x0000000000413c71 in ops (arg=0x3bb8100) at ../../../test/format/ops.c:1086 #14 0x00007ff8e81646db in start_thread (arg=0x7ff8e12a7700) at pthread_create.c:463 #15 0x00007ff8e7361a3f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95