Uploaded image for project: 'WiredTiger'
  1. WiredTiger
  2. WT-2231

pinned page cursor searches could check parent keys

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major - P3
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: WT2.8.0
    • Labels:
      None
    • # Replies:
      15
    • Last comment by Customer:
      true

      Description

      Michael Cahill, Alexander Gorrod, we do a current-page check when a WT_CURSOR.search or WT_CURSOR.search-near operation is done on a cursor that's holding a page pinned, that is, if the cursor is holding a leaf page pinned, cursor search first checks if the search is directed at that leaf page before doing the more general full-tree search.

      I think it would be possible to check the leaf page's parent keys before doing the full binary search of the leaf page, avoiding the binary search entirely when the cursor is being re-positioned, at the cost of two additional searches when the cursor is not being re-positioned.

      To show the best case, I inserted a set of key/value pairs into 16KB leaf pages and then alternated cursor searches between two leaf pages. With a check of the parent page keys, the run is 31.38 seconds, without the check it's 36.02 seconds, a roughly 12% improvement.

      I used 16KB deliberately, and with smaller page sizes there's less improvement, as the comparisons of the parent page keys becomes a significant percentage of the comparisons done by the binary search. We could skip the test for small pages, of course.

      This change won't slow down searches where the cursor doesn't have a page pinned, but it will add two additional comparisons to searches of pinned pages that are going to succeed, where the key being searched for is on the page.

      Another note: to make this work, the search function has to look at the leaf page's parent key for itself, and for the subsequent key in the parent page's index. The former is safe, but the latter is a little trickier – I think it's safe, but, for example, that next WT_REF/WT_PAGE combination can split while we're looking at the key.

        Issue Links

          Activity

          Hide
          xgen-internal-githook Githook User added a comment -

          Author:

          {u'username': u'keithbostic', u'name': u'Keith Bostic', u'email': u'keith@wiredtiger.com'}

          Message: WT-2231: this is a replay of the changes for the previous WT-2231 branch
          (see pull request #2325) for the individual commit details. There's an
          additional bug fix over and above those commits, fixing a clang sanitizer
          error.

          In summary: check the leaf page's parent keys before doing the full
          binary search of the leaf page, avoiding the binary search entirely
          when the cursor is being re-positioned, at the cost of two additional
          searches when the cursor is not being re-positioned. Additionally,
          do some work to improve WT_REF page hints.
          Branch: develop
          https://github.com/wiredtiger/wiredtiger/commit/b62ee8ae0b4fd27a08bda89fd3ac537abe7f32f0

          Show
          xgen-internal-githook Githook User added a comment - Author: {u'username': u'keithbostic', u'name': u'Keith Bostic', u'email': u'keith@wiredtiger.com'} Message: WT-2231 : this is a replay of the changes for the previous WT-2231 branch (see pull request #2325) for the individual commit details. There's an additional bug fix over and above those commits, fixing a clang sanitizer error. In summary: check the leaf page's parent keys before doing the full binary search of the leaf page, avoiding the binary search entirely when the cursor is being re-positioned, at the cost of two additional searches when the cursor is not being re-positioned. Additionally, do some work to improve WT_REF page hints. Branch: develop https://github.com/wiredtiger/wiredtiger/commit/b62ee8ae0b4fd27a08bda89fd3ac537abe7f32f0
          Hide
          xgen-internal-githook Githook User added a comment -

          Author:

          {u'username': u'keithbostic', u'name': u'Keith Bostic', u'email': u'keith@wiredtiger.com'}

          Message: WT-2231: column search implementation of check the leaf page's parent
          keys before doing the full binary search.
          Branch: develop
          https://github.com/wiredtiger/wiredtiger/commit/d89e7df83e87fa0c2cbd97059cac19f792854d47

          Show
          xgen-internal-githook Githook User added a comment - Author: {u'username': u'keithbostic', u'name': u'Keith Bostic', u'email': u'keith@wiredtiger.com'} Message: WT-2231 : column search implementation of check the leaf page's parent keys before doing the full binary search. Branch: develop https://github.com/wiredtiger/wiredtiger/commit/d89e7df83e87fa0c2cbd97059cac19f792854d47
          Hide
          xgen-internal-githook Githook User added a comment -

          Author:

          {u'username': u'keithbostic', u'name': u'Keith Bostic', u'email': u'keith.bostic@mongodb.com'}

          Message: Merge pull request #2383 from wiredtiger/wt-2231-parent-key-check-number-2

          WT-2231: check the leaf page's parent keys before doing the full binary search
          Branch: develop
          https://github.com/wiredtiger/wiredtiger/commit/d5f487fdd9d40d2c8270cc3044b9d0ce2bc7671b

          Show
          xgen-internal-githook Githook User added a comment - Author: {u'username': u'keithbostic', u'name': u'Keith Bostic', u'email': u'keith.bostic@mongodb.com'} Message: Merge pull request #2383 from wiredtiger/wt-2231-parent-key-check-number-2 WT-2231 : check the leaf page's parent keys before doing the full binary search Branch: develop https://github.com/wiredtiger/wiredtiger/commit/d5f487fdd9d40d2c8270cc3044b9d0ce2bc7671b
          Hide
          xgen-internal-githook Githook User added a comment -

          Author:

          {u'name': u'Ramon Fernandez', u'email': u'ramon@mongodb.com'}

          Message: Import wiredtiger-wiredtiger-2.7.0-269-g44463c5.tar.gz from wiredtiger branch mongodb-3.4

          ref: 3c2ad56..44463c5

          SERVER-21833 Compact does not release space to the system with WiredTiger
          WT-2060 Simplify aggregation of statistics
          WT-2099 Seeing memory underflow messages
          WT-2113 truncate01 sometimes fails
          WT-2177 Add a per-thread seed to random number generator
          WT-2198 bulk load and column store appends
          WT-2231 pinned page cursor searches could check parent keys
          WT-2235 wt printlog option without unicode
          WT-2245 WTPERF Truncate has no ability to catch up when it falls behind
          WT-2246 column-store append searches the leaf page; the maximum record number fails CRUD operations
          WT-2256 WTPERFs throttle option fires in bursts
          WT-2257 wtperf doesn't handle overriding workload config
          WT-2259 __wt_evict_file_exclusive_on() should clear WT_BTREE_NO_EVICTION on error
          WT-2260 Workloads evict internal pages unexpectedly
          WT-2262 Random sampling is skewed by tree shape
          WT-2265 Wiredtiger related change in ppc64le specific code block in gcc.h
          WT-2266 Add wtperf config to set if perf thresholds are fatal
          WT-2269 wtperf should dump its config everytime it runs
          WT-2272 Stress test assertion in the sweep server
          WT-2275 broken DB after application crash
          WT-2276 tool to decode checkpoint addr
          WT-2277 Remove WT check against big-endian systems
          WT-2279 Define WT_PAUSE(), WT_FULL_BARRIER(), etc when s390x is defined
          WT-2281 wtperf smoke.sh fails on ppc64le
          WT-2282 error in wt_txn_update_oldest verbose message test
          WT-2283 retry in txn_update_oldest results in a hang
          WT-2285 configure should set BUFFER_ALIGNMENT_DEFAULT to 4kb on linux
          WT-2289 failure in fast key check
          WT-2290 WT_SESSION.compact could be more effective.
          WT-2291 Random cursor walk inefficient in skip list only trees
          WT-2297 Fix off-by-one error in Huffman config file parsing
          WT-2299 upper-level WiredTiger code is reaching into the block manager
          WT-2301 Add reading a range to wtperf
          WT-2303 Build warning in wtperf
          WT-2304 wtperf crash dumping config
          WT-2307 Internal page splits can corrupt cursor iteration
          WT-2311 Support Sparc
          Branch: master
          https://github.com/mongodb/mongo/commit/d845b75e5f0837f801bdf371babd985308a1ad80

          Show
          xgen-internal-githook Githook User added a comment - Author: {u'name': u'Ramon Fernandez', u'email': u'ramon@mongodb.com'} Message: Import wiredtiger-wiredtiger-2.7.0-269-g44463c5.tar.gz from wiredtiger branch mongodb-3.4 ref: 3c2ad56..44463c5 SERVER-21833 Compact does not release space to the system with WiredTiger WT-2060 Simplify aggregation of statistics WT-2099 Seeing memory underflow messages WT-2113 truncate01 sometimes fails WT-2177 Add a per-thread seed to random number generator WT-2198 bulk load and column store appends WT-2231 pinned page cursor searches could check parent keys WT-2235 wt printlog option without unicode WT-2245 WTPERF Truncate has no ability to catch up when it falls behind WT-2246 column-store append searches the leaf page; the maximum record number fails CRUD operations WT-2256 WTPERFs throttle option fires in bursts WT-2257 wtperf doesn't handle overriding workload config WT-2259 __wt_evict_file_exclusive_on() should clear WT_BTREE_NO_EVICTION on error WT-2260 Workloads evict internal pages unexpectedly WT-2262 Random sampling is skewed by tree shape WT-2265 Wiredtiger related change in ppc64le specific code block in gcc.h WT-2266 Add wtperf config to set if perf thresholds are fatal WT-2269 wtperf should dump its config everytime it runs WT-2272 Stress test assertion in the sweep server WT-2275 broken DB after application crash WT-2276 tool to decode checkpoint addr WT-2277 Remove WT check against big-endian systems WT-2279 Define WT_PAUSE(), WT_FULL_BARRIER(), etc when s390x is defined WT-2281 wtperf smoke.sh fails on ppc64le WT-2282 error in wt_txn_update_oldest verbose message test WT-2283 retry in txn_update_oldest results in a hang WT-2285 configure should set BUFFER_ALIGNMENT_DEFAULT to 4kb on linux WT-2289 failure in fast key check WT-2290 WT_SESSION.compact could be more effective. WT-2291 Random cursor walk inefficient in skip list only trees WT-2297 Fix off-by-one error in Huffman config file parsing WT-2299 upper-level WiredTiger code is reaching into the block manager WT-2301 Add reading a range to wtperf WT-2303 Build warning in wtperf WT-2304 wtperf crash dumping config WT-2307 Internal page splits can corrupt cursor iteration WT-2311 Support Sparc Branch: master https://github.com/mongodb/mongo/commit/d845b75e5f0837f801bdf371babd985308a1ad80
          Hide
          xgen-internal-githook Githook User added a comment -

          Author:

          {u'name': u'Ramon Fernandez', u'email': u'ramon@mongodb.com'}

          Message: Import wiredtiger-wiredtiger-2.7.0-559-g07966a4.tar.gz from wiredtiger branch mongodb-3.2

          ref: 3c2ad56..07966a4

          WT-1517 schema format edge cases
          WT-1801 Add a directory sync after rollback of a WT_SESSION::rename operation
          WT-2060 Simplify aggregation of statistics
          WT-2073 metadata cleanups
          WT-2099 Seeing memory underflow messages
          WT-2113 truncate01 sometimes fails
          WT-2142 Connection cleanup in Python tests
          WT-2177 Add an optional per-thread seed to random number generator
          WT-2198 bulk load and column store appends
          WT-2216 simplify row-store search loop slightly
          WT-2225 New split code performance impact
          WT-2231 pinned page cursor searches could check parent keys
          WT-2235 wt printlog option without unicode
          WT-2242 WiredTiger treats dead trees the same as other trees in eviction
          WT-2244 Trigger in-memory splits sooner
          WT-2245 WTPERF Truncate has no ability to catch up when it falls behind
          WT-2246 column-store append searches the leaf page; the maximum record number fails CRUD operations
          WT-2247 variable-length column-store in-memory page splits
          WT-2256 WTPERFs throttle option fires in bursts
          WT-2257 wtperf doesn't handle overriding workload config
          WT-2258 WiredTiger preloads pages even when direct-IO is configured.
          WT-2259 __wt_evict_file_exclusive_on() should clear WT_BTREE_NO_EVICTION on error
          WT-2260 Workloads evict internal pages unexpectedly
          WT-2262 Random sampling is skewed by tree shape
          WT-2265 Wiredtiger related change in ppc64le specific code block in gcc.h
          WT-2266 Add wtperf config to set if perf thresholds are fatal
          WT-2267 Improve wtperf throttling implementation to provide steady load
          WT-2269 wtperf should dump its config everytime it runs
          WT-2272 Stress test assertion in the sweep server
          WT-2275 broken DB after application crash
          WT-2276 tool to decode checkpoint addr
          WT-2277 Remove WT check against big-endian systems
          WT-2279 Define WT_PAUSE(), WT_FULL_BARRIER(), etc when s390x is defined
          WT-2281 wtperf smoke.sh fails on ppc64le
          WT-2282 error in wt_txn_update_oldest verbose message test
          WT-2283 retry in txn_update_oldest results in a hang
          WT-2284 Repeated macro definition
          WT-2285 configure should set BUFFER_ALIGNMENT_DEFAULT to 4kb on linux
          WT-2287 WT_SESSION.rebalance
          WT-2289 failure in fast key check
          WT-2290 WT_SESSION.compact could be more effective.
          WT-2291 Random cursor walk inefficient in skip list only trees
          WT-2295 WT_SESSION.create does a full-scan of the main table
          WT-2296 New log algorithm needs improving for sync/flush settings
          WT-2297 Fix off-by-one error in Huffman config file parsing
          WT-2299 upper-level WiredTiger code is reaching into the block manager
          WT-2301 Add reading a range to wtperf
          WT-2303 Build warning in wtperf
          WT-2304 wtperf crash dumping config
          WT-2305 Fix coverity scan issues on 23/12/2015
          WT-2307 Internal page splits can corrupt cursor iteration
          WT-2308 custom extractor for ref_cursors in join cursor
          WT-2311 Support Sparc
          WT-2312 re-creating a deleted column-store page can corrupt the in-memory tree
          WT-2313 sweep-server: conn_dhandle.c, 610: dhandle != conn->cache->evict_file_next
          WT-2314 page-swap error handling is inconsistent
          WT-2316 stress test failure: WT_CURSOR.prev out-of-order returns
          WT-2320 Only check copyright when cutting releases
          WT-2321 WT-2321: race between eviction and worker threads on the eviction queue
          WT-2326 Change WTPERF to use new memory allocation functions instead of the standard
          WT-2328 schema drop does direct unlink, it should use a block manager interface.
          WT-2331 Checking of search() result for reference cursors before join()
          WT-2332 Bug in logging write-no-sync mode
          WT-2333 Add a flag so drop doesn't block
          WT-2335 NULL pointer crash in config_check_search with invalid configuration string
          WT-2338 Disable using pre-allocated log files when backup cursor is open
          WT-2339 format post-rebalance verify failure (stress run #11586)
          WT-2340 Add logging guarantee assertions, whitespace
          WT-2342 Enhance wtperf to support background create and drop operations
          WT-2344 OS X compiler warning
          WT-2347 Java: schema format edge cases
          WT-2348 xargs -P isn't portable
          WT-2355 Fix minor scratch buffer usage in logging
          SERVER-21833 Compact does not release space to the system with WiredTiger
          SERVER-21887 $sample takes disproportionately long time on newly created collection
          SERVER-22064 Coverity analysis defect 77699: Unchecked return value
          SERVER-21944 WiredTiger changes for 3.2.2
          Branch: v3.2
          https://github.com/mongodb/mongo/commit/5d6532f3d5227ff76f62c4810c98a4ef4d0c8c56

          Show
          xgen-internal-githook Githook User added a comment - Author: {u'name': u'Ramon Fernandez', u'email': u'ramon@mongodb.com'} Message: Import wiredtiger-wiredtiger-2.7.0-559-g07966a4.tar.gz from wiredtiger branch mongodb-3.2 ref: 3c2ad56..07966a4 WT-1517 schema format edge cases WT-1801 Add a directory sync after rollback of a WT_SESSION::rename operation WT-2060 Simplify aggregation of statistics WT-2073 metadata cleanups WT-2099 Seeing memory underflow messages WT-2113 truncate01 sometimes fails WT-2142 Connection cleanup in Python tests WT-2177 Add an optional per-thread seed to random number generator WT-2198 bulk load and column store appends WT-2216 simplify row-store search loop slightly WT-2225 New split code performance impact WT-2231 pinned page cursor searches could check parent keys WT-2235 wt printlog option without unicode WT-2242 WiredTiger treats dead trees the same as other trees in eviction WT-2244 Trigger in-memory splits sooner WT-2245 WTPERF Truncate has no ability to catch up when it falls behind WT-2246 column-store append searches the leaf page; the maximum record number fails CRUD operations WT-2247 variable-length column-store in-memory page splits WT-2256 WTPERFs throttle option fires in bursts WT-2257 wtperf doesn't handle overriding workload config WT-2258 WiredTiger preloads pages even when direct-IO is configured. WT-2259 __wt_evict_file_exclusive_on() should clear WT_BTREE_NO_EVICTION on error WT-2260 Workloads evict internal pages unexpectedly WT-2262 Random sampling is skewed by tree shape WT-2265 Wiredtiger related change in ppc64le specific code block in gcc.h WT-2266 Add wtperf config to set if perf thresholds are fatal WT-2267 Improve wtperf throttling implementation to provide steady load WT-2269 wtperf should dump its config everytime it runs WT-2272 Stress test assertion in the sweep server WT-2275 broken DB after application crash WT-2276 tool to decode checkpoint addr WT-2277 Remove WT check against big-endian systems WT-2279 Define WT_PAUSE(), WT_FULL_BARRIER(), etc when s390x is defined WT-2281 wtperf smoke.sh fails on ppc64le WT-2282 error in wt_txn_update_oldest verbose message test WT-2283 retry in txn_update_oldest results in a hang WT-2284 Repeated macro definition WT-2285 configure should set BUFFER_ALIGNMENT_DEFAULT to 4kb on linux WT-2287 WT_SESSION.rebalance WT-2289 failure in fast key check WT-2290 WT_SESSION.compact could be more effective. WT-2291 Random cursor walk inefficient in skip list only trees WT-2295 WT_SESSION.create does a full-scan of the main table WT-2296 New log algorithm needs improving for sync/flush settings WT-2297 Fix off-by-one error in Huffman config file parsing WT-2299 upper-level WiredTiger code is reaching into the block manager WT-2301 Add reading a range to wtperf WT-2303 Build warning in wtperf WT-2304 wtperf crash dumping config WT-2305 Fix coverity scan issues on 23/12/2015 WT-2307 Internal page splits can corrupt cursor iteration WT-2308 custom extractor for ref_cursors in join cursor WT-2311 Support Sparc WT-2312 re-creating a deleted column-store page can corrupt the in-memory tree WT-2313 sweep-server: conn_dhandle.c, 610: dhandle != conn->cache->evict_file_next WT-2314 page-swap error handling is inconsistent WT-2316 stress test failure: WT_CURSOR.prev out-of-order returns WT-2320 Only check copyright when cutting releases WT-2321 WT-2321 : race between eviction and worker threads on the eviction queue WT-2326 Change WTPERF to use new memory allocation functions instead of the standard WT-2328 schema drop does direct unlink, it should use a block manager interface. WT-2331 Checking of search() result for reference cursors before join() WT-2332 Bug in logging write-no-sync mode WT-2333 Add a flag so drop doesn't block WT-2335 NULL pointer crash in config_check_search with invalid configuration string WT-2338 Disable using pre-allocated log files when backup cursor is open WT-2339 format post-rebalance verify failure (stress run #11586) WT-2340 Add logging guarantee assertions, whitespace WT-2342 Enhance wtperf to support background create and drop operations WT-2344 OS X compiler warning WT-2347 Java: schema format edge cases WT-2348 xargs -P isn't portable WT-2355 Fix minor scratch buffer usage in logging SERVER-21833 Compact does not release space to the system with WiredTiger SERVER-21887 $sample takes disproportionately long time on newly created collection SERVER-22064 Coverity analysis defect 77699: Unchecked return value SERVER-21944 WiredTiger changes for 3.2.2 Branch: v3.2 https://github.com/mongodb/mongo/commit/5d6532f3d5227ff76f62c4810c98a4ef4d0c8c56

            People

            • Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:
                Days since reply:
                1 year, 12 weeks, 4 days ago
                Date of 1st Reply: