  1. WiredTiger
  2. WT-4921

Add debug mode option that slows checkpoint creation

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: WT3.2.2, 4.3.3, 4.2.3
    • Component/s: None
    • Labels:
    • Story Points: 5
    • Sprint: Storage Engines 2019-12-16, Storage Engines 2019-12-30

      Description

      We should add a new debug_mode=[slow_checkpoint] configuration option to wiredtiger_open that causes checkpoints to be created more slowly. One way to do that is to add a short sleep (10 milliseconds in the patch below) each time the checkpoint visits an internal page:

      --- a/src/btree/bt_sync.c
      +++ b/src/btree/bt_sync.c
      @@ -300,6 +300,8 @@ __wt_sync_file(WT_SESSION_IMPL *session, WT_CACHE_OP syncop)
                              if (WT_PAGE_IS_INTERNAL(page)) {
                                      internal_bytes += page->memory_footprint;
                                      ++internal_pages;
      +                               /* Slow down checkpoints */
      +                               __wt_sleep(0, 10000);
                              } else {
                                      leaf_bytes += page->memory_footprint;
                                      ++leaf_pages;
      

      It would be interesting to enable this while running a workload with a known, stable level of throughput, and to observe how a long-running checkpoint affects that workload. One wtperf workload worth trying is:

      conn_config="cache_size=1GB,session_max=1000,eviction=(threads_min=8,threads_max=8),log=(enabled=false),transaction_sync=(enabled=false),checkpoint_sync=false,checkpoint=(wait=10)"
      table_config="allocation_size=1024,memory_page_max=30MB,prefix_compression=false,split_pct=90,leaf_page_max=32k,internal_page_max=1024,type=file"
      # About 2.5 GB of data - more than fits in cache.
      icount=25000000
      table_count=3
      log_like_table=true
      report_interval=5
      run_time=120
      pareto=10
      populate_threads=1
      threads=((count=2,updates=1,throttle=5000),(count=4,reads=1),(count=1,reads=1,read_range=100000,throttle=1))
      # Add throughput/latency monitoring
      max_latency=2000
      sample_interval=5
      

              People

              Assignee:
              jeremy.tay Jeremy Tay (Inactive)
              Reporter:
              alexander.gorrod Alexander Gorrod