WiredTiger / WT-2283

retry in txn_update_oldest results in a hang

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major - P3
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: WT2.8.0
    • Labels:
      None
    • # Replies:
      8
    • Last comment by Customer:
      true

      Description

      The retry I added in txn_update_oldest for WT-2113 causes a hang when there are many threads. I see this in pmp:

            6 __wt_txn_update_oldest,__evict_review,__wt_evict,__evict_page,__wt_cache_eviction_worker,__wt_cache_eviction_check,__cursor_enter,__curfile_enter,__cursor_func_init,__wt_btcur_search_near,__curfile_search_near,read_row,ops,start_thread,clone
            6 pthread_rwlock_wrlock,ops,start_thread,clone
            4 __wt_atomic_casiv32,__wt_txn_update_oldest,__evict_review,__wt_evict,__evict_page,__wt_cache_eviction_worker,__wt_cache_eviction_check,__cursor_enter,__curfile_enter,__cursor_func_init,__wt_btcur_search_near,__curfile_search_near,read_row,ops,start_thread,clone
            2 __wt_atomic_casiv32,__wt_txn_update_oldest,__evict_review,__wt_evict,__evict_page,__wt_cache_eviction_worker,__wt_cache_eviction_check,__wt_txn_idle_cache_check,__cursor_func_init,__wt_btcur_update,__curfile_update,col_update,ops,start_thread,clone
            2 __wt_atomic_casiv32,__wt_txn_update_oldest,__evict_review,__wt_evict,__evict_page,__wt_cache_eviction_worker,__wt_cache_eviction_check,__wt_txn_begin,__session_begin_transaction,ops,start_thread,clone
            2 __wt_atomic_casiv32,__wt_txn_update_oldest,__evict_review,__wt_evict,__evict_page,__wt_cache_eviction_worker,__wt_cache_eviction_check,__cursor_enter,__curfile_enter,__cursor_func_init,__wt_btcur_update,__curfile_update,col_update,ops,start_thread,clone
            2 __wt_atomic_casiv32,__wt_txn_update_oldest,__evict_review,__wt_evict,__evict_page,__wt_cache_eviction_worker,__wt_cache_eviction_check,__cursor_enter,__curfile_enter,__cursor_func_init,__wt_btcur_search,__curfile_search,col_remove,ops,start_thread,clone
            2 
            1 __wt_txn_update_oldest,__evict_review,__wt_evict,__evict_page,__wt_cache_eviction_worker,__wt_cache_eviction_check,__wt_txn_idle_cache_check,__cursor_func_init,__wt_btcur_search,__curfile_search,col_remove,ops,start_thread,clone
            1 __wt_txn_update_oldest,__evict_review,__wt_evict,__evict_page,__wt_cache_eviction_worker,__wt_cache_eviction_check,__cursor_enter,__curfile_enter,__cursor_func_init,__wt_btcur_update,__curfile_update,col_update,ops,start_thread,clone
            1 __wt_txn_update_oldest,__evict_review,__wt_evict,__evict_page,__wt_cache_eviction_worker,__wt_cache_eviction_check,__cursor_enter,__curfile_enter,__cursor_func_init,__wt_btcur_search,__curfile_search,read_row,ops,start_thread,clone
            1 __wt_txn_update_oldest,__evict_review,__wt_evict,__evict_page,__wt_cache_eviction_worker,__wt_cache_eviction_check,__cursor_enter,__curfile_enter,__cursor_func_init,__wt_btcur_search,__curfile_search,col_remove,ops,start_thread,clone
            1 __wt_txn_update_oldest,__evict_review,__wt_evict,__evict_page,__evict_lru_pages,__evict_server_work,__evict_pass,__evict_server,start_thread,clone
            1 __wt_atomic_subiv32,__wt_txn_update_oldest,__evict_review,__wt_evict,__evict_page,__wt_cache_eviction_worker,__wt_cache_eviction_check,__cursor_enter,__curfile_enter,__cursor_func_init,__wt_btcur_search_near,__curfile_search_near,read_row,ops,start_thread,clone
            1 __wt_atomic_casiv32,__wt_txn_update_oldest,__txn_checkpoint,__wt_txn_checkpoint,__session_checkpoint,ops,start_thread,clone
            1 __wt_atomic_casiv32,__wt_txn_update_oldest,__evict_review,__wt_evict,__evict_page,__wt_cache_eviction_worker,__wt_cache_eviction_check,__wt_txn_idle_cache_check,__cursor_func_init,__wt_btcur_search,__curfile_search,col_remove,ops,start_thread,clone
            1 __wt_atomic_casiv32,__wt_txn_update_oldest,__evict_review,__wt_evict,__evict_page,__wt_cache_eviction_worker,__wt_cache_eviction_check,__cursor_enter,__curfile_enter,__cursor_func_init,__wt_btcur_search,__curfile_search,read_row,ops,start_thread,clone
      

        Activity

        Alexander Gorrod (alexander.gorrod) added a comment (edited):

        It appears as though this change made a big difference to some of the Jenkins performance jobs as well. For example, the evict-btree read throughput dropped from ~130 million to 50 million.

        Link: http://build.wiredtiger.com:8080/job/wiredtiger-perf-evict/plot/

        Githook User (xgen-internal-githook) added a comment:

        Author: Susan LoVerso (sueloverso) <sue@wiredtiger.com>

        Message: WT-2283 Add yield to avoid tight looping.
        Branch: develop
        https://github.com/wiredtiger/wiredtiger/commit/9768cf6b90bfb0aeb448cfd3aa003cccbd02d218
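
The commit message above names the fix: yield inside the retry loop instead of spinning. The following is a minimal C sketch of that general pattern only. It is not the WiredTiger code. The names `demo_lock`, `demo_oldest`, `advance_oldest` and `run_demo` are hypothetical, and the sketch assumes C11 atomics, POSIX threads and `sched_yield`.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_THREADS 64

/* Hypothetical shared state for illustration; not WiredTiger's fields. */
static atomic_int demo_lock; /* 0 = free, 1 = held */
static uint64_t demo_oldest; /* only touched while demo_lock is held */

struct demo_arg {
    uint64_t base; /* first candidate ID is base + 1 */
    int iters;     /* number of candidates this thread submits */
};

/* Move demo_oldest forward to candidate if candidate is larger. */
static void
advance_oldest(uint64_t candidate)
{
    int expected;

    for (;;) {
        expected = 0;
        if (atomic_compare_exchange_weak(&demo_lock, &expected, 1))
            break;
        /*
         * The CAS failed (another thread holds the flag, or the weak
         * CAS failed spuriously). Yield the CPU rather than retrying
         * in a tight loop, so the holder gets a chance to run.
         */
        sched_yield();
    }
    if (candidate > demo_oldest)
        demo_oldest = candidate;
    atomic_store(&demo_lock, 0); /* release the flag */
}

static void *
demo_worker(void *p)
{
    struct demo_arg *arg = p;
    int i;

    for (i = 1; i <= arg->iters; i++)
        advance_oldest(arg->base + (uint64_t)i);
    return (NULL);
}

/* Run nthreads workers and return the final value of demo_oldest. */
static uint64_t
run_demo(int nthreads, int iters)
{
    pthread_t tids[DEMO_MAX_THREADS];
    struct demo_arg args[DEMO_MAX_THREADS];
    int ret, t;

    assert(nthreads > 0 && nthreads <= DEMO_MAX_THREADS);
    atomic_store(&demo_lock, 0);
    demo_oldest = 0;
    for (t = 0; t < nthreads; t++) {
        args[t].base = (uint64_t)t * (uint64_t)iters;
        args[t].iters = iters;
        ret = pthread_create(&tids[t], NULL, demo_worker, &args[t]);
        assert(ret == 0);
    }
    for (t = 0; t < nthreads; t++) {
        ret = pthread_join(tids[t], NULL);
        assert(ret == 0);
    }
    (void)ret;
    return (demo_oldest);
}
```

Calling `run_demo(8, 1000)` returns 8000 once all threads have joined, because the largest candidate submitted is `nthreads * iters`. Without the `sched_yield()` call the loop is still correct, but with many more threads than cores the waiters can burn CPU time that the flag holder needs, which is the kind of tight looping the commit message describes. Older toolchains may need `-pthread` when building.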

        Githook User (xgen-internal-githook) added a comment:

        Author: sueloverso <sue@mongodb.com>

        Message: Merge pull request #2381 from wiredtiger/WT-2283-hang

        WT-2283 Add yield to avoid tight looping.
        Branch: develop
        https://github.com/wiredtiger/wiredtiger/commit/6af544c7ad19cb44594b0602831196651ada5026

        Sue LoVerso (sue.loverso) added a comment (edited):

        Quoting Alexander Gorrod: "It appears as though this change made a big difference to some of the Jenkins performance jobs as well. For example, the evict-btree read throughput dropped from ~130 million to 50 million."

        Alexander Gorrod, I am not convinced the change for WT-2113 is the culprit for the Jenkins plots. I ran medium-btree.wtperf on an older pre-WT-2113 changeset, bff6525c, and saw about 880K read ops/sec. In develop, changeset 6bd151a, I see about 190K read ops/sec. I see the same 190K ops/sec performance even if I replace __wt_txn_update_oldest in develop with the earlier, pre-WT-2113 version of the function.
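
To put the two sets of figures quoted in this thread on the same scale, the relative drop can be computed as (before - after) / before. The helper below is a small illustrative sketch. The name `percent_drop` is hypothetical, and the inputs are only the rounded numbers from the comments (880K and 190K read ops/sec, ~130 million and 50 million), not fresh measurements.

```c
#include <assert.h>

/* Percentage by which throughput fell, relative to the earlier value. */
static double
percent_drop(double before, double after)
{
    assert(before > 0.0);
    return ((before - after) / before * 100.0);
}
```

With these inputs, `percent_drop(880.0, 190.0)` is roughly 78 and `percent_drop(130.0, 50.0)` is roughly 62. The two figures come from different benchmarks (medium-btree.wtperf and the evict-btree Jenkins job), so they show the scale of each drop rather than a like-for-like comparison.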

        Sue LoVerso (sue.loverso) added a comment:

        FTR, the performance issue is in WT-2286.

        Githook User (xgen-internal-githook) added a comment:

        Author: Ramon Fernandez <ramon@mongodb.com>

        Message: Import wiredtiger-wiredtiger-2.7.0-269-g44463c5.tar.gz from wiredtiger branch mongodb-3.4

        ref: 3c2ad56..44463c5

        SERVER-21833 Compact does not release space to the system with WiredTiger
        WT-2060 Simplify aggregation of statistics
        WT-2099 Seeing memory underflow messages
        WT-2113 truncate01 sometimes fails
        WT-2177 Add a per-thread seed to random number generator
        WT-2198 bulk load and column store appends
        WT-2231 pinned page cursor searches could check parent keys
        WT-2235 wt printlog option without unicode
        WT-2245 WTPERF Truncate has no ability to catch up when it falls behind
        WT-2246 column-store append searches the leaf page; the maximum record number fails CRUD operations
        WT-2256 WTPERFs throttle option fires in bursts
        WT-2257 wtperf doesn't handle overriding workload config
        WT-2259 __wt_evict_file_exclusive_on() should clear WT_BTREE_NO_EVICTION on error
        WT-2260 Workloads evict internal pages unexpectedly
        WT-2262 Random sampling is skewed by tree shape
        WT-2265 Wiredtiger related change in ppc64le specific code block in gcc.h
        WT-2266 Add wtperf config to set if perf thresholds are fatal
        WT-2269 wtperf should dump its config everytime it runs
        WT-2272 Stress test assertion in the sweep server
        WT-2275 broken DB after application crash
        WT-2276 tool to decode checkpoint addr
        WT-2277 Remove WT check against big-endian systems
        WT-2279 Define WT_PAUSE(), WT_FULL_BARRIER(), etc when s390x is defined
        WT-2281 wtperf smoke.sh fails on ppc64le
        WT-2282 error in wt_txn_update_oldest verbose message test
        WT-2283 retry in txn_update_oldest results in a hang
        WT-2285 configure should set BUFFER_ALIGNMENT_DEFAULT to 4kb on linux
        WT-2289 failure in fast key check
        WT-2290 WT_SESSION.compact could be more effective.
        WT-2291 Random cursor walk inefficient in skip list only trees
        WT-2297 Fix off-by-one error in Huffman config file parsing
        WT-2299 upper-level WiredTiger code is reaching into the block manager
        WT-2301 Add reading a range to wtperf
        WT-2303 Build warning in wtperf
        WT-2304 wtperf crash dumping config
        WT-2307 Internal page splits can corrupt cursor iteration
        WT-2311 Support Sparc
        Branch: master
        https://github.com/mongodb/mongo/commit/d845b75e5f0837f801bdf371babd985308a1ad80

        Githook User (xgen-internal-githook) added a comment:

        Author: Ramon Fernandez <ramon@mongodb.com>

        Message: Import wiredtiger-wiredtiger-2.7.0-559-g07966a4.tar.gz from wiredtiger branch mongodb-3.2

        ref: 3c2ad56..07966a4

        WT-1517 schema format edge cases
        WT-1801 Add a directory sync after rollback of a WT_SESSION::rename operation
        WT-2060 Simplify aggregation of statistics
        WT-2073 metadata cleanups
        WT-2099 Seeing memory underflow messages
        WT-2113 truncate01 sometimes fails
        WT-2142 Connection cleanup in Python tests
        WT-2177 Add an optional per-thread seed to random number generator
        WT-2198 bulk load and column store appends
        WT-2216 simplify row-store search loop slightly
        WT-2225 New split code performance impact
        WT-2231 pinned page cursor searches could check parent keys
        WT-2235 wt printlog option without unicode
        WT-2242 WiredTiger treats dead trees the same as other trees in eviction
        WT-2244 Trigger in-memory splits sooner
        WT-2245 WTPERF Truncate has no ability to catch up when it falls behind
        WT-2246 column-store append searches the leaf page; the maximum record number fails CRUD operations
        WT-2247 variable-length column-store in-memory page splits
        WT-2256 WTPERFs throttle option fires in bursts
        WT-2257 wtperf doesn't handle overriding workload config
        WT-2258 WiredTiger preloads pages even when direct-IO is configured.
        WT-2259 __wt_evict_file_exclusive_on() should clear WT_BTREE_NO_EVICTION on error
        WT-2260 Workloads evict internal pages unexpectedly
        WT-2262 Random sampling is skewed by tree shape
        WT-2265 Wiredtiger related change in ppc64le specific code block in gcc.h
        WT-2266 Add wtperf config to set if perf thresholds are fatal
        WT-2267 Improve wtperf throttling implementation to provide steady load
        WT-2269 wtperf should dump its config everytime it runs
        WT-2272 Stress test assertion in the sweep server
        WT-2275 broken DB after application crash
        WT-2276 tool to decode checkpoint addr
        WT-2277 Remove WT check against big-endian systems
        WT-2279 Define WT_PAUSE(), WT_FULL_BARRIER(), etc when s390x is defined
        WT-2281 wtperf smoke.sh fails on ppc64le
        WT-2282 error in wt_txn_update_oldest verbose message test
        WT-2283 retry in txn_update_oldest results in a hang
        WT-2284 Repeated macro definition
        WT-2285 configure should set BUFFER_ALIGNMENT_DEFAULT to 4kb on linux
        WT-2287 WT_SESSION.rebalance
        WT-2289 failure in fast key check
        WT-2290 WT_SESSION.compact could be more effective.
        WT-2291 Random cursor walk inefficient in skip list only trees
        WT-2295 WT_SESSION.create does a full-scan of the main table
        WT-2296 New log algorithm needs improving for sync/flush settings
        WT-2297 Fix off-by-one error in Huffman config file parsing
        WT-2299 upper-level WiredTiger code is reaching into the block manager
        WT-2301 Add reading a range to wtperf
        WT-2303 Build warning in wtperf
        WT-2304 wtperf crash dumping config
        WT-2305 Fix coverity scan issues on 23/12/2015
        WT-2307 Internal page splits can corrupt cursor iteration
        WT-2308 custom extractor for ref_cursors in join cursor
        WT-2311 Support Sparc
        WT-2312 re-creating a deleted column-store page can corrupt the in-memory tree
        WT-2313 sweep-server: conn_dhandle.c, 610: dhandle != conn->cache->evict_file_next
        WT-2314 page-swap error handling is inconsistent
        WT-2316 stress test failure: WT_CURSOR.prev out-of-order returns
        WT-2320 Only check copyright when cutting releases
        WT-2321 WT-2321: race between eviction and worker threads on the eviction queue
        WT-2326 Change WTPERF to use new memory allocation functions instead of the standard
        WT-2328 schema drop does direct unlink, it should use a block manager interface.
        WT-2331 Checking of search() result for reference cursors before join()
        WT-2332 Bug in logging write-no-sync mode
        WT-2333 Add a flag so drop doesn't block
        WT-2335 NULL pointer crash in config_check_search with invalid configuration string
        WT-2338 Disable using pre-allocated log files when backup cursor is open
        WT-2339 format post-rebalance verify failure (stress run #11586)
        WT-2340 Add logging guarantee assertions, whitespace
        WT-2342 Enhance wtperf to support background create and drop operations
        WT-2344 OS X compiler warning
        WT-2347 Java: schema format edge cases
        WT-2348 xargs -P isn't portable
        WT-2355 Fix minor scratch buffer usage in logging
        SERVER-21833 Compact does not release space to the system with WiredTiger
        SERVER-21887 $sample takes disproportionately long time on newly created collection
        SERVER-22064 Coverity analysis defect 77699: Unchecked return value
        SERVER-21944 WiredTiger changes for 3.2.2
        Branch: v3.2
        https://github.com/mongodb/mongo/commit/5d6532f3d5227ff76f62c4810c98a4ef4d0c8c56


          People

          • Votes: 0
          • Watchers: 4

            Dates

            • Created:
              Updated:
              Resolved:
              Days since reply:
              1 year, 8 weeks, 1 day ago
              Date of 1st Reply: