Type: Task
Resolution: Done
Fix Version/s: WT1.5.0
Affects Version/s: None
Component/s: None
Labels:
None

Sprint:
None
Story Points:
None

Hi guys,

I got some more info on why we hang when running the "small" config of LevelDB Bench with four or more threads.

To reproduce:

env LD_LIBRARY_PATH=../wt-dev-branch/build_posix/.libs:../wt-dev-branch/build_posix/ext/compressors/snappy/.libs/ TEST_TMPDIR= ./db_bench_wiredtiger --cache_size=6537216 --threads=4 --db=/tmpfs/leveldb --benchmarks=fillrandom,overwrite,readrandom

The benchmark runs fine through the first two phases (fillrandom and overwrite), but then enters an infinite loop in readrandom.

Examining the stack trace gives me the following:

#1 0x00007ffff795fdf9 in __wt_cache_full_check (session=0x6499b0) at ../src/include/cache.i:69
#2 __wt_page_in_func (session=0x6499b0, parent=0x7fffe4308a10, ref=0x7fffe4308ad0) at ../src/btree/bt_page.c:42
#3 0x00007ffff7984e40 in __wt_row_search (session=0x6499b0, cbt=0x7fffe41d5ed0, is_modify=0) at ../src/btree/row_srch.c:174
#4 0x00007ffff795301d in __wt_btcur_search (cbt=0x7fffe41d5ed0) at ../src/btree/bt_cursor.c:146
#5 0x00007ffff798e052 in __curfile_search (cursor=0x7fffe41d5ed0) at ../src/cursor/cur_file.c:133
#6 0x00007ffff7995b81 in __clsm_search (cursor=0x7fffe41caa50) at ../src/lsm/lsm_cursor.c:581
#7 0x0000000000404077 in leveldb::Benchmark::ReadRandom(leveldb::(anonymous namespace)::ThreadState*) ()
#8 0x000000000040897e in leveldb::Benchmark::ThreadBody(void*) ()
#9 0x0000000000432a3a in leveldb::(anonymous namespace)::StartThreadWrapper(void*) ()
#10 0x00007ffff6f07d86 in start_thread () from /lib64/libpthread.so.0
#11 0x00007ffff6c4066d in clone () from /lib64/libc.so.6

I checked every thread, and every one of them is stuck in the same place in an infinite loop. If we look at the code where we loop, we see the following:

for (wake = 0;; wake = (wake + 1) % 100) {
    WT_RET(__wt_eviction_check(session, &lockout, wake == 0));
    if (!lockout ||
        F_ISSET(session, WT_SESSION_NO_CACHE_CHECK | WT_SESSION_SCHEMA_LOCKED))
        return (0);
    if (F_ISSET(btree, WT_BTREE_BULK | WT_BTREE_NO_CACHE | WT_BTREE_NO_EVICTION))
        return (0);
    if ((ret = __wt_evict_lru_page(session, 1)) == EBUSY)
        __wt_yield();
    else
        WT_RET_NOTFOUND_OK(ret);
}

What happens is that we drop all the way down to the final "else" branch but never return: ret is actually WT_NOTFOUND, which WT_RET_NOTFOUND_OK treats as success, so we stay in the loop forever. It appears that we can't find a page to evict.
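
To make the spin easier to see outside WiredTiger, here is a minimal standalone sketch of the control flow described above. It is not the library's code: SKETCH_NOTFOUND, evict_never_finds, evict_hard_error and loop_exits are hypothetical names, the not-found value is arbitrary, and the model assumes the cache stays full so the "lockout" early return is never taken. An iteration cap stands in for "forever".

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for the not-found code; the value is arbitrary. */
#define SKETCH_NOTFOUND (-1000)

/* Illustrative eviction stubs, not WiredTiger functions. */
static int evict_never_finds(void) { return SKETCH_NOTFOUND; }
static int evict_hard_error(void) { return EIO; }

/*
 * Model of the retry loop with the cache permanently full, so the
 * "lockout" early return is never taken. Returns 1 if the loop would
 * return to its caller within max_iters iterations, 0 if it is still
 * spinning when the cap is reached.
 */
static int loop_exits(int (*evict)(void), int max_iters)
{
    int i, ret;

    assert(evict != 0 && max_iters > 0);
    for (i = 0; i < max_iters; i++) {
        ret = evict();
        if (ret == EBUSY)
            continue;       /* the original yields, then retries */
        if (ret != 0 && ret != SKETCH_NOTFOUND)
            return 1;       /* a real error is returned to the caller */
        /* 0 or not-found: nothing returns, so go around again */
    }
    return 0;
}
```

Under these assumptions, loop_exits(evict_never_finds, 1000) reports that the loop is still running at the cap, while loop_exits(evict_hard_error, 1000) reports an exit on the first iteration. That matches what every thread in the trace is doing: eviction keeps reporting "nothing to evict" and nothing in the loop body turns that into a return.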

is related to

WT-441 Allow LSM trees to discard the btree handle from the active chunk.

Closed

related to

WT-1 placeholder WT-1

Closed

WT-2 What does metadata look like?

Closed

WT-3 What file formats are required?

Closed

WT-4 Flexible cursor traversals

Closed

WT-5 How does pget work: is it necessary?

Closed

WT-6 Complex schema example

Closed

WT-7 Do we need the handle->err/errx methods?

Closed

WT-8 Do we need table load, bulk-load and/or dump methods?

Closed

WT-9 Does adding schema need to be transactional?

Closed

WT-10 Basic "getting started" tutorial

Closed

WT-11 placeholder #11

Closed

WT-443 Segfault in __wt_row_search

Closed

(8 related to)

Assignee: Unassigned
Reporter: Alexandra (Sasha) Fedorova
Votes: 0
Watchers: 1

Created: Jan 24 2013 02:36:52 AM UTC
Updated: Apr 16 2015 06:48:42 PM UTC
Resolved: Apr 09 2015 01:06:56 AM UTC
