-
Type: Task
-
Resolution: Done
-
Affects Version/s: None
-
Component/s: None
When running the wtperf
Unable to find source-code formatter for language: bench. Available languages are: actionscript, ada, applescript, bash, c, c#, c++, cpp, css, erlang, go, groovy, haskell, html, java, javascript, js, json, lua, none, nyan, objc, perl, php, python, r, rainbow, ruby, scala, sh, sql, swift, visualbasic, xml, yaml
/wtperf/runners/evict-btree.wtperf
workload, some read latencies are higher than would be ideal. The long run time isn't tied to disk I/O.
fedorova has done some analysis of the slow operations:
I caught a real straggler for you: one that took > 100ms to complete. And it has a really interesting stack trace. Because the trace is very long, it got truncated, but you'll get the idea of what's going on. Lots of memory allocations! Hope this helps...
__curfile_search--> __wt_btcur_search--> __wt_txn_release_snapshot--> __wt_txn_release_snapshot<--__wt_hazard_clear--> __wt_hazard_clear<--__wt_txn_refresh--> __txn_sort_snapshot--> __txn_sort_snapshot<--__wt_row_search--> __wt_page_in_func--> __wt_cache_read--> __wt_bt_read--> __wt_bm_read--> __wt_block_buffer_to_addr--> __block_buffer_to_addr.isra.1--> __block_buffer_to_addr.isra.1<--__wt_block_buffer_to_addr<--__wt_block_read_off--> __wt_buf_grow_worker--> __wt_realloc_aligned--> posix_memalign--> __libc_memalign--> mprotect--> mprotect<--__libc_memalign<--posix_memalign<--__wt_realloc_aligned<--__wt_buf_grow_worker<--__wt_read-->pread--> __pthread_enable_asynccancel--> __pthread_enable_asynccancel<--__pthread_disable_asynccancel--> __pthread_disable_asynccancel<--pread<--__pread_nocancel<--__wt_read<--__wt_cksum-->__wt_cksum_sw--> __wt_cksum_sw<--__wt_block_read_off<--__wt_bm_read<--__wt_bt_read<--__wt_page_inmem-->__wt_page_alloc--> __wt_calloc--> calloc--> calloc<--__wt_calloc<--__wt_calloc--> calloc-->mprotect-->mprotect<--calloc<--__wt_calloc<--__wt_calloc--> calloc-->calloc<--__wt_calloc<--__wt_calloc--> calloc-->calloc<--__wt_calloc<--__wt_calloc--> calloc-->calloc<--__wt_calloc<--__wt_calloc--> calloc-->calloc<--__wt_calloc<--__wt_calloc--> calloc-->calloc<--__wt_calloc<--__wt_calloc--> calloc-->calloc<--__wt_calloc<--__wt_calloc--> calloc-->calloc<--__wt_calloc<--__wt_calloc--> calloc-->calloc<--__wt_calloc<--__wt_calloc--> calloc-->calloc<--__wt_calloc<--__wt_calloc--> calloc-->calloc<--__wt_calloc<--__wt_calloc--> calloc-->calloc<--__wt_calloc<--__wt_calloc--> calloc-->calloc<--__wt_calloc<--__wt_calloc--> calloc-->calloc<--__wt_calloc<--__wt_calloc--> calloc-->calloc<--__wt_calloc<--__wt_calloc--> calloc-->calloc<--__wt_calloc<--__wt_calloc--> ...
This is a surprise - since that test code uses default page sizes (4k internal, 32k leaf), reasonable key/value sizes and no compression. I don't know why reading a page in is causing so much memory allocation.