Investigate if we can ensure we never fail after writing the disk image in reconciliation


    • Type: Task
    • Resolution: Unresolved
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: None
    • Storage Engines, Storage Engines - Transactions

      Reconciliation can still fail after it has written the disk image, in either of the two checks shown here:

          /*
           * If eviction didn't use any updates and didn't split or delete the page, it didn't make
           * progress. Give up rather than silently succeeding in doing no work: this way threads know to
           * back off forced eviction rather than spinning.
           *
           * Do not return an error if we are syncing the file with eviction disabled or as part of a
           * checkpoint.
           */
          if (ret == 0 && !(btree->evict_disabled > 0 || !F_ISSET(btree->dhandle, WT_DHANDLE_OPEN)) &&
            F_ISSET(r, WT_REC_EVICT) && !WT_PAGE_IS_INTERNAL(page) && r->multi_next == 1 &&
            F_ISSET(r, WT_REC_CALL_URGENT) && !r->update_used && r->cache_write_restore_invisible &&
            !r->cache_upd_chain_all_aborted) {
              /*
               * If leaf delta is enabled, we should have built an empty delta if this page has been
               * reconciled before as we don't make any progress.
               */
              WT_ASSERT(session,
                !WT_DELTA_ENABLED_FOR_PAGE(session, page->type) || F_ISSET(r, WT_REC_EMPTY_DELTA) ||
                  page->disagg_info->block_meta.page_id == WT_BLOCK_INVALID_PAGE_ID);
              /*
               * If eviction didn't make any progress, let application threads know they should refresh
               * the transaction's snapshot (and try to evict the latest content).
               */
              if (F_ISSET(session->txn, WT_TXN_HAS_SNAPSHOT))
                  F_SET(session->txn, WT_TXN_REFRESH_SNAPSHOT);
      
              WT_STAT_CONN_DSRC_INCR(session, cache_eviction_blocked_no_progress);
              ret = __wt_set_return(session, EBUSY);
          }
          addr = ref->addr;
      
          /*
           * Fail 1% of the time after we have built the disk image but before we wrap up reconciliation.
           */
          if (F_ISSET(r, WT_REC_EVICT) && !F_ISSET(r, WT_REC_EVICT_CALL_CLOSING) &&
            __wt_failpoint(session, WT_TIMING_STRESS_FAILPOINT_REC_BEFORE_WRAPUP, 100))
              ret = __wt_set_return(session, EBUSY);
      

       
      Investigate whether we can remove all of these failure cases to make reconciliation more robust.
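
      As a rough illustration of the ordering problem, here is a toy sketch in C. It is not WiredTiger code: `struct toy_rec` and both functions are hypothetical stand-ins, and `made_progress` only loosely models the "no progress" condition above. It contrasts failing after the write (the image exists even though the caller sees EBUSY) with checking first and writing only once nothing else can fail. Whether the real conditions can be evaluated that early is the open question of this ticket.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy state; not WiredTiger's WT_RECONCILE. */
struct toy_rec {
    bool made_progress; /* Stand-in for the "eviction made progress" condition. */
    int images_written; /* Number of disk images written so far. */
};

/* Current shape: the write happens first, then the call may still fail. */
int
toy_rec_write_then_check(struct toy_rec *r)
{
    assert(r != NULL);
    r->images_written++; /* Side effect: the disk image is written. */
    if (!r->made_progress)
        return (EBUSY); /* Failure reported, but an image was already written. */
    return (0);
}

/* Alternative shape: evaluate every failure condition, then write. */
int
toy_rec_check_then_write(struct toy_rec *r)
{
    assert(r != NULL);
    if (!r->made_progress)
        return (EBUSY); /* Nothing has been written yet. */
    r->images_written++; /* Past this point the function cannot fail. */
    return (0);
}
```

      In the second shape a failure never leaves a written image behind, so the caller has nothing to clean up or discard on EBUSY.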

            Assignee:
            [DO NOT USE] Backlog - Storage Engines Team
            Reporter:
            Chenhao Qu
            Votes:
            0
            Watchers:
            2
