  1. WiredTiger
  2. WT-3299

slowdown with many pinned updates to a single key

    • Type: Bug
    • Resolution: Done
    • Priority: Major - P3
    • None
    • Affects Version/s: None
    • Component/s: None

      Large numbers of updates to a single key in the presence of a snapshot (or any old reader) can significantly slow WiredTiger. Here's the test:

          def test_las_update(self):
              # Create a small table.
              uri = "table:test_las_update"
              nrows = 100
              ds = SimpleDataSet(self, uri, nrows, key_format="S")
              ds.populate()
      
              # Take a snapshot.
              self.session.snapshot("name=xxx")
      
              # Update a record a large number of times. We'll hang if the
              # lookaside table isn't doing its thing.
              c = self.session.open_cursor(uri)
              bigvalue = "a" * 100
              for i in range(1, 50000):
                  # Report progress every 1000 updates.
                  if i % 1000 == 0:
                      self.tty(str(i))
                  c.set_key(ds.key(nrows + 1))
                  c.set_value(bigvalue)
                  self.assertEqual(c.insert(), 0)
      

      The large number of updates on a single page eventually triggers forced eviction, but because there's only a single "insert" on the page, we don't trigger an in-memory split, and we fall through to normal reconciliation, which returns EBUSY plus a lookaside table recommendation.

      The lookaside table isn't currently used because the cache isn't "stuck". Also retrying with the lookaside table when the page qualifies for forced eviction fixes that problem, and we can evict the page.

      diff --git a/src/btree/bt_read.c b/src/btree/bt_read.c
      index 72a69e859..358f433ad 100644
      --- a/src/btree/bt_read.c
      +++ b/src/btree/bt_read.c
      @@ -292,11 +292,11 @@ err:      WT_TRET(__wt_las_cursor_close(session, &cursor, session_flags));
       }
       
       /*
      - * __evict_force_check --
      + * __wt_evict_force_check --
        *     Check if a page matches the criteria for forced eviction.
        */
      -static bool
      -__evict_force_check(WT_SESSION_IMPL *session, WT_REF *ref)
      +bool
      +__wt_evict_force_check(WT_SESSION_IMPL *session, WT_REF *ref)
       {
              WT_BTREE *btree;
              WT_PAGE *page;
      @@ -595,7 +595,7 @@ __wt_page_in_func(WT_SESSION_IMPL *session, WT_REF *ref, uint32_t flags
                               * Forcibly evict pages that are too big.
                               */
                              if (force_attempts < 10 &&
      -                           __evict_force_check(session, ref)) {
      +                           __wt_evict_force_check(session, ref)) {
                                      ++force_attempts;
                                      ret = __wt_page_release_evict(session, ref);
                                      /* If forced eviction fails, stall. */
      diff --git a/src/evict/evict_page.c b/src/evict/evict_page.c
      index edcd108e7..3a6ba0591 100644
      --- a/src/evict/evict_page.c
      +++ b/src/evict/evict_page.c
      @@ -547,7 +547,9 @@ __evict_review(
               * lookaside table, allowing the eviction of pages we'd otherwise have
               * to retain in cache to support older readers.
               */
      -       if (ret == EBUSY && __wt_cache_stuck(session) && lookaside_retry) {
      +       if (ret == EBUSY && lookaside_retry &&
      +           (__wt_cache_stuck(session) ||
      +           __wt_evict_force_check(session, ref))) {
                      LF_CLR(WT_EVICT_SCRUB | WT_EVICT_UPDATE_RESTORE);
                      LF_SET(WT_EVICT_LOOKASIDE);
                      ret = __wt_reconcile(session, ref, NULL, flags, NULL);
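
The condition change in the `__evict_review` hunk above can be restated as a small truth-table sketch. This is a toy model and not WiredTiger code: the `EvictState` class and both function names are hypothetical, and each field only stands in for the C expression named in its comment.

```python
from dataclasses import dataclass


@dataclass
class EvictState:
    """Hypothetical stand-in for the inputs to the check in __evict_review."""
    reconcile_busy: bool    # models ret == EBUSY
    lookaside_retry: bool   # models the lookaside_retry flag
    cache_stuck: bool       # models __wt_cache_stuck(session)
    force_evictable: bool   # models __wt_evict_force_check(session, ref)


def retry_with_lookaside_before(s: EvictState) -> bool:
    # Condition on the removed ("-") line of the diff.
    return s.reconcile_busy and s.cache_stuck and s.lookaside_retry


def retry_with_lookaside_after(s: EvictState) -> bool:
    # Condition on the added ("+") lines of the diff.
    return s.reconcile_busy and s.lookaside_retry and (
        s.cache_stuck or s.force_evictable)


# The situation described in this ticket: reconciliation returned EBUSY,
# the cache is not stuck, but the page qualifies for forced eviction.
scenario = EvictState(reconcile_busy=True, lookaside_retry=True,
                      cache_stuck=False, force_evictable=True)
print(retry_with_lookaside_before(scenario))
print(retry_with_lookaside_after(scenario))
```

In this model the old condition declines the lookaside retry for the ticket's scenario, while the patched condition allows it.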
      

      However, the next update to the page reads the page back into memory: the lookaside table is read, the too-big update chain is re-instantiated in memory, and we're right back where we started.
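
The evict-then-reload cycle described above can be illustrated with a rough counting model. It is only a sketch under loud assumptions: `simulate`, `force_evict_threshold`, and the counters are hypothetical, the chain is measured in entries instead of bytes, and nothing here reflects WiredTiger's real eviction criteria or lookaside format. It assumes the old reader pins every update, so nothing is ever discarded.

```python
def simulate(n_updates: int, force_evict_threshold: int) -> int:
    """Count update entries copied between memory and a toy lookaside store."""
    in_memory = 0   # length of the in-memory update chain for the one key
    lookaside = 0   # entries currently parked in the toy lookaside store
    moved = 0       # total entries written to or read back from lookaside
    for _ in range(n_updates):
        if lookaside:
            # The next update reads the page back in, and the whole
            # chain is re-instantiated from the lookaside store.
            in_memory += lookaside
            moved += lookaside
            lookaside = 0
        in_memory += 1  # the new update joins the pinned chain
        if in_memory >= force_evict_threshold:
            # Forced eviction writes the entire pinned chain to lookaside.
            lookaside = in_memory
            moved += in_memory
            in_memory = 0
    return moved


print(simulate(10, 5))
print(simulate(20, 5))
```

Once the chain crosses the threshold in this model, every further update moves the whole chain twice, so doubling the number of updates far more than doubles the entries copied.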

            Assignee:
            backlog-server-execution [DO NOT USE] Backlog - Storage Execution Team
            Reporter:
            keith.bostic@mongodb.com Keith Bostic (Inactive)