Uploaded image for project: 'WiredTiger'
  1. WiredTiger
  2. WT-8035

Handle prefix enabled lexicographical comparisons for string key formats

    • Type: Icon: Bug Bug
    • Resolution: Fixed
    • Priority: Icon: Major - P3 Major - P3
    • WT10.0.1, 5.0.4, 4.4.10, 5.1.0-rc0
    • Affects Version/s: None
    • Component/s: None
    • None
    • 5
    • Storage - Ra 2021-09-20

      A prefix search_near() call is a search_near() call configured with cursor configuration as prefix_key=true. The configuration allows the search_near() to exit early when the search reaches a key that is lexicographically greater than the intended set key.

      Prefix search_near() function calls becomes useful when keys are not all visible to the transaction, For example In a table of keys ranging from "aaa" - "zzz" of timestamp 50, and the search_near() of key "aa" operates on a timestamp 25. A normal search_near() function would traverse all keys in the tree, to return WT_NOTFOUND. When prefix_key=true is configured, the search_near() function will only look at key ranges in the tree from "aaa" - "aaz", reducing the number of traversals needed to return back WT_NOTFOUND.

      This code here shows how WT checks if a key in the tree is lexicographically greater than the intended set key:

               * If the cursor has prefix search configured we can early exit here if the key that we are
               * visiting is after our prefix.
              if (F_ISSET(&cbt->iface, WT_CURSTD_PREFIX_SEARCH) && prefix != NULL &&
                __wt_prefix_match(prefix, &cbt->iface.key) < 0) {
                  /* It is not okay for the user to have a custom collator. */
                  WT_ASSERT(session, CUR2BT(cbt)->collator == NULL);
                  WT_STAT_CONN_DATA_INCR(session, cursor_search_near_prefix_fast_paths);
                  return (WT_NOTFOUND);

      WT uses __wt_prefix_match to check the ordering of a particular key. When working on WT-7924, we discovered a WT_NOTFOUND error happening in our testing, when executing a prefix search_near() call. The symptoms of the original test is described as:
      Schema configuration: key_format=u,value_format=u

      1. Insert keys "aaa" - "aay" with timestamp 200
      2. Insert key "aaz" with timestamp 50
      3. With a transaction of read timestamp 100, search_near() for aa

      In this test, the behaviour of the search_near() successfully returns back "aaz" as the closest visible value to the transaction. The problem arises when the table configuration of key_format is set to "S" string. for some reason the __wt_prefix_match early exits at "aaa", and returns back WT_NOTFOUND. I have attached a reproducer, which reproduces 100% of the time.

            jie.chen@mongodb.com Jie Chen
            jie.chen@mongodb.com Jie Chen
            0 Vote for this issue
            2 Start watching this issue