Review WiredTiger reconciliation statistics for accuracy


    • Type: Task
    • Resolution: Unresolved
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: None
    • Security Level: Public (Available to anyone on the web)
    • Storage Engines - Transactions

      Note: This may not be specific to disaggregated storage, but I noticed it when investigating disagg YCSB benchmarks, so I'm opening an SLS ticket.

      I was taking a look at some FTDC data from a throttled YCSB 100 update run, and the metrics confuse me:

      The gap between peaks on the checkpoint progress state metric is about 9 seconds. The metrics show that checkpoints are completing in about 150 milliseconds, which is confirmed by the shape of the checkpoint progress state graph. There are two things that surprise me:

      1) The "checkpoint number of pages caused to be reconciled" rate stays at about 140/second even when no checkpoint is running.
      2) The "checkpoint number of history store pages caused to be reconciled" rate is very similar to the non-history-store statistic. My reading of that was that one or two non-history-store pages are triggering reconciliation of ~130 history store pages.

      That led me to inspect the code in bt_sync.c, which looks like:

          WT_STAT_CONN_INCR(session, checkpoint_pages_reconciled);
          WT_STATP_DSRC_INCR(session, btree->dhandle->stats, btree_checkpoint_pages_reconciled);
          if (FLD_ISSET(rec_flags, WT_REC_HS))
              WT_STAT_CONN_INCR(session, checkpoint_hs_pages_reconciled);

          WT_ERR(__wt_reconcile(session, walk, NULL, rec_flags));
      

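      WT_REC_HS is a flag that allows reconciliation to write content back to the history store; it does not record whether a given reconciliation actually wrote anything there. As written, the history store statistic counts pages whose reconciliation was permitted to use the history store, which would explain why it tracks the general statistic so closely.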

      Also, it seems surprising that the statistics are incremented before the reconciliation call rather than after it succeeds. That is probably OK, since a failed reconciliation during checkpoint should generally result in a fatal error for the system.
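
To make the two concerns above concrete, here is a minimal, self-contained sketch of one way the counting could be structured. It is a toy model, not WiredTiger code: every name in it (`SKETCH_REC_HS`, `sketch_stats`, `sketch_reconcile`, `sketch_checkpoint_page`) is hypothetical, and it assumes reconciliation could report whether it actually wrote history store content, which the ticket does not establish. The sketch keeps "permitted to use the history store" and "actually wrote to the history store" as separate counters, and it only increments after reconciliation succeeds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a "may write to the history store" permission flag. */
#define SKETCH_REC_HS 0x1u

/* Hypothetical counters; these are not WiredTiger statistics. */
struct sketch_stats {
    uint64_t pages_reconciled; /* Pages successfully reconciled. */
    uint64_t hs_allowed;       /* Reconciliations permitted to use the history store. */
    uint64_t hs_written;       /* Reconciliations that actually wrote history store content. */
};

/* Hypothetical result type: assumes reconciliation can report what it did. */
struct sketch_result {
    int ret;       /* 0 on success, nonzero on failure. */
    bool wrote_hs; /* True only if history store content was written. */
};

/*
 * Toy reconciliation: writes to the history store only when it is both
 * permitted (flag set) and needed (the page has older versions to move).
 */
static struct sketch_result
sketch_reconcile(uint32_t rec_flags, bool page_has_old_versions, bool fail)
{
    struct sketch_result r = {0, false};

    if (fail) {
        r.ret = -1;
        return r;
    }
    r.wrote_hs = (rec_flags & SKETCH_REC_HS) != 0 && page_has_old_versions;
    return r;
}

/* Reconcile one page, updating the counters only after success. */
static int
sketch_checkpoint_page(
  struct sketch_stats *stats, uint32_t rec_flags, bool page_has_old_versions, bool fail)
{
    struct sketch_result r;

    assert(stats != NULL);

    r = sketch_reconcile(rec_flags, page_has_old_versions, fail);
    if (r.ret != 0)
        return r.ret; /* Failed reconciliations are not counted. */

    ++stats->pages_reconciled;
    if ((rec_flags & SKETCH_REC_HS) != 0)
        ++stats->hs_allowed; /* Permission: what the flag check alone measures. */
    if (r.wrote_hs)
        ++stats->hs_written; /* Outcome: what the statistic name suggests. */
    return 0;
}
```

In this model, two successful reconciliations with the flag set, only one of which has older versions to move, leave `hs_allowed` at 2 and `hs_written` at 1. A counter driven purely by the flag would report 2 in both cases, which matches the pattern described above where the history store statistic closely follows the general one.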

            Assignee:
            Unassigned
            Reporter:
            Alexander Gorrod