Type: Task
Resolution: Unresolved
Priority: Major - P3
Affects Version/s: None
Component/s: None
Security Level: Public (Available to anyone on the web)
Storage Engines - Transactions
Note: This may not be specific to disaggregated storage, but I noticed it when investigating disagg YCSB benchmarks, so I'm opening an SLS ticket.
I was taking a look at some FTDC data from a throttled YCSB 100 update run, and the metrics confuse me:

The gap between peaks on the checkpoint progress state metric is about 9 seconds. The metrics show that checkpoints are completing in about 150 milliseconds, which is confirmed by the shape of the checkpoint progress state graph. There are two things that surprise me:
1) The rate of the "checkpoint number of pages caused to be reconciled" statistic remains at about 140/second even when there isn't a checkpoint running.
2) The "checkpoint number of history store pages caused to be reconciled" statistic is very similar to the non-history store statistic. My reading of that was that one or two non-history store pages are triggering reconciliation of ~130 history store pages.
That led me to inspect the code in bt_sync.c, which looks like:
WT_STAT_CONN_INCR(session, checkpoint_pages_reconciled);
WT_STATP_DSRC_INCR(session, btree->dhandle->stats, btree_checkpoint_pages_reconciled);
if (FLD_ISSET(rec_flags, WT_REC_HS))
    WT_STAT_CONN_INCR(session, checkpoint_hs_pages_reconciled);

WT_ERR(__wt_reconcile(session, walk, NULL, rec_flags));
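WT_REC_HS is a flag that allows reconciliation to write content back to the history store; it does not track how often pages are actually written back to the history store. If the flag is set for most reconciliations that checkpoint triggers, that would explain why the two statistics are so similar.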
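To make the distinction concrete, here is a minimal toy model in C. It is a sketch only: every name in it (TOY_REC_HS, toy_stats, toy_reconcile, toy_checkpoint_page) is hypothetical and is not part of WiredTiger, and the rule used to decide when history store content gets written is an assumption made purely for illustration. It compares a counter driven by the permission flag, as in the snippet above, with a counter driven by what reconciliation actually did.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a "may write to the history store" flag. */
#define TOY_REC_HS 0x1u

struct toy_stats {
    uint64_t pages_reconciled; /* Every page the toy checkpoint reconciles. */
    uint64_t hs_permitted;     /* Counted from the flag alone. */
    uint64_t hs_written;       /* Counted only when history store content was written. */
};

/*
 * Toy reconcile: returns true if it wrote history store content. The rule
 * (flag set and the page carries older versions) is an assumption.
 */
static bool
toy_reconcile(uint32_t rec_flags, bool page_has_old_versions)
{
    return ((rec_flags & TOY_REC_HS) != 0 && page_has_old_versions);
}

/* Reconcile one page and update all three counters. */
static void
toy_checkpoint_page(struct toy_stats *stats, uint32_t rec_flags, bool page_has_old_versions)
{
    ++stats->pages_reconciled;

    /* Flag-based counting: measures permission, not outcome. */
    if ((rec_flags & TOY_REC_HS) != 0)
        ++stats->hs_permitted;

    /* Outcome-based counting: measures what reconciliation did. */
    if (toy_reconcile(rec_flags, page_has_old_versions))
        ++stats->hs_written;
}
```

In this model, if every page is reconciled with the flag set but only a couple of pages carry older versions, hs_permitted equals pages_reconciled while hs_written stays small. That is the same shape as the two near-identical statistics described above.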
Also, it seems surprising that the statistics are incremented before the reconciliation call. That is probably OK, since a failed reconciliation in checkpoint should generally result in a fatal error for the system.
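The ordering concern can be sketched the same way. The following is a hypothetical illustration, not WiredTiger code: toy_counters, toy_reconcile_result and toy_sync_page are made-up names, and the failure is injected by a parameter. It shows how a counter incremented before the call diverges from one incremented only after a successful return.

```c
#include <stdint.h>

struct toy_counters {
    uint64_t counted_before; /* Incremented before the reconcile call. */
    uint64_t counted_after;  /* Incremented only after reconcile succeeds. */
};

/* Hypothetical reconcile stand-in: returns 0 on success, -1 on failure. */
static int
toy_reconcile_result(int inject_failure)
{
    return (inject_failure ? -1 : 0);
}

/* Process one page, updating both counters around the reconcile call. */
static int
toy_sync_page(struct toy_counters *c, int inject_failure)
{
    int ret;

    ++c->counted_before; /* Same order as the snippet above. */

    if ((ret = toy_reconcile_result(inject_failure)) != 0)
        return (ret); /* The pre-call counter already includes this page. */

    ++c->counted_after;
    return (0);
}
```

The two counters differ only by the number of failed calls, so if a failed reconciliation in checkpoint is fatal, the difference would be at most one page.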
is depended on by: WT-14427 [ds-09.04][Storage Engines (Core)] 100% hygiene plan execution (Open)