As seen in this run: http://build.wiredtiger.com:8080/job/wiredtiger-test-unit/6791/console
We end up with all internal threads sleeping and the checkpoint blocked:
9 pthread_cond_timedwait@@GLIBC_2.3.2,__wt_cond_wait_signal
1 __wt_gen,__wt_session_gen_enter,__wt_hazard_check,__page_read,__wt_page_in_func,__wt_page_swap_func,__tree_walk_internal,__wt_tree_walk,__sync_file,__wt_cache_op,__checkpoint_tree,__checkpoint_tree_helper,__checkpoint_apply,__txn_checkpoint,__txn_checkpoint_wrapper,__wt_txn_checkpoint,__session_checkpoint,_wrap_Session_checkpoint,PyEval_EvalFrameEx,...
The checkpoint is trying to read a page in the WT_REF_LIMBO state, but the eviction server thread holds a hazard pointer on that page. Since the eviction server is itself asleep, the hazard pointer is never released and the checkpoint cannot make progress.
The eviction server already takes care not to stop on certain pages. This looks like another case we need to handle there.
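To make the interaction concrete, here is a minimal toy model of it in C. It is only a sketch: every `toy_*` name is hypothetical and none of it is WiredTiger's actual code or API. It assumes, as the stack above suggests, that reading a limbo page has to wait until no other session holds a hazard pointer on it, and it shows one possible shape of the fix, where the eviction walk drops its position before going idle if it is sitting on such a page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical page states; only the LIMBO name mirrors the report. */
typedef enum { TOY_REF_MEM, TOY_REF_LIMBO } toy_ref_state;

typedef struct {
    toy_ref_state state;
} toy_page;

#define TOY_MAX_SESSIONS 4

/* One hazard slot per session: the page that session currently pins. */
typedef struct {
    const toy_page *hazard[TOY_MAX_SESSIONS];
} toy_conn;

static void
toy_hazard_set(toy_conn *conn, size_t session, const toy_page *page)
{
    assert(session < TOY_MAX_SESSIONS);
    conn->hazard[session] = page;
}

static void
toy_hazard_clear(toy_conn *conn, size_t session)
{
    assert(session < TOY_MAX_SESSIONS);
    conn->hazard[session] = NULL;
}

/* Return true if any session other than "self" pins the page. */
static bool
toy_hazard_check(const toy_conn *conn, size_t self, const toy_page *page)
{
    for (size_t i = 0; i < TOY_MAX_SESSIONS; i++)
        if (i != self && conn->hazard[i] == page)
            return true;
    return false;
}

/*
 * Reader (checkpoint) side. Assumption: a limbo page can only be read once
 * no other session holds a hazard pointer on it.
 */
static bool
toy_can_read(const toy_conn *conn, size_t self, const toy_page *page)
{
    if (page->state != TOY_REF_LIMBO)
        return true;
    return !toy_hazard_check(conn, self, page);
}

/*
 * Eviction walk side. Before going idle, give up the saved position if it is
 * a limbo page, so a sleeping thread never blocks a reader.
 */
static void
toy_evict_park(toy_conn *conn, size_t evict_session, const toy_page *position)
{
    if (position != NULL && position->state == TOY_REF_LIMBO)
        toy_hazard_clear(conn, evict_session);
}
```

In this model, if session 0 (standing in for the eviction server) pins a limbo page and goes to sleep, `toy_can_read` stays false for session 1 (the checkpoint) until `toy_evict_park` runs. Pages in other states are unaffected, so the walk keeps its position in the common case.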
Is duplicated by: WT-3843 Unit test job on Windows failed due to time out. (Closed)