Uploaded image for project: 'WiredTiger'
  1. WiredTiger
  2. WT-7912

Fix prefix search near optimisation to handle scenarios where the key range is split across pages.

    • Type: Icon: Bug Bug
    • Resolution: Fixed
    • Priority: Icon: Major - P3 Major - P3
    • WT10.0.1, 5.2.0, 5.0.5, 4.4.18
    • Affects Version/s: None
    • Component/s: None
    • None
    • 5
    • Storage - Ra 2021-11-01
    • v4.4

      When we search for a key using the prefix search near optimization implemented in WT-7264 we early exit the search if _cursor_row_next or _cursor_row_prev return WT_NOTFOUND. However those functions return WT_NOTFOUND when they step to the end or the start of the page. As such if the prefix key_range extends across multiple pages the search will incorrectly return WT_NOTFOUND when in fact we should have found a value.

      In order to fix this we'd need to differentiate between a prefix not found return and an end of page not found return. This could be implemented using a new WT error code.

      I've written a python reproducer to demonstrate the flaw in the current algorithm:

      import time, wiredtiger, wttest, unittest
      from wiredtiger import statdef timestamp_str(t):
          return '%x' % t# test_search_near01.py
      # Test various prefix search near scenarios.
      class test_search_near01(wttest.WiredTigerTestCase):
          conn_config = 'statistics=(all)'
          session_config = 'isolation=snapshot'    def get_stat(self, stat, local_session = None):
              if (local_session != None):
                  stat_cursor = local_session.open_cursor('statistics:')
              else:
                  stat_cursor = self.session.open_cursor('statistics:')
              val = stat_cursor[stat][2]
              stat_cursor.close()
              return val    def test_base_scenario(self):
              uri = 'table:test_base_scenario'
              self.session.create(uri, 'key_format=u,value_format=u')
              cursor = self.session.open_cursor(uri)
              session2 = self.conn.open_session()
              cursor3 = self.session.open_cursor(uri, None, "debug=(release_evict=true)")        # Basic character array.
              l = "abcdefghijklmnopqrstuvwxyz"        # Start our older reader.
              session2.begin_transaction()        # Insert very large values so we fill the page content with less than 26 keys.
              # The key range that we search for will be a minimum of 26 keys.
              key_value = "abcde" * 100000
              key_count = 26*26*26
              # Insert keys aaa -> zzz.
              for i in range (0, 26):
                  for j in range (0, 26):
                      for k in range (0, 26):
                          cursor[l[i] + l[j] + l[k]] = key_value        # Evict the whole range.
              for i in range (0, 26):
                  for j in range (0, 26):
                      for k in range (0, 26):
                          cursor3.set_key(l[i] + l[j] + l[k])
                          cursor3.search()
                          cursor3.reset()        # Search near for the "aa" part of the range.
              cursor2 = session2.open_cursor(uri)
              cursor2.set_key('aa')
              cursor2.search_near()        skip_count = self.get_stat(stat.conn.cursor_next_skip_lt_100)
              # This should be equal to roughly key_count * 2 as the aa range will be traversed forwards
              # And then backwards, this assertion failing demonstrates the issue.
              self.assertGreater(skip_count, key_count * 2) 

      In addition to fixing the code, the search near test created as part of WT-7924 should be executed with a more stressful configuration so keys are written across multiple pages. This configuration is a good start if not enough.

        1. repro_2.py
          5 kB
          Etienne Petrel

            Assignee:
            monica.ng@mongodb.com Monica Ng
            Reporter:
            luke.pearson@mongodb.com Luke Pearson
            Etienne Petrel
            Votes:
            0 Vote for this issue
            Watchers:
            26 Start watching this issue

              Created:
              Updated:
              Resolved: