WiredTiger / WT-763

Bulk load slowed by tracking overflow items


Details

    • Type: Task
    • Status: Closed
    • Resolution: Fixed
    • Fix Version: WT1.6.6

    Description

      Hi Keith,

      I was looking at the test/format workload below because it gets stuck with the cache full, but I noticed something else: bulk load gets slower over time. The cause is that __wt_ovfl_reuse_search does more and more work as the load progresses: it seems we are tracking every overflow item ever written?

      Here is the config:

      ############################################
      #  RUN PARAMETERS
      ############################################
      firstfit=1
      # bitcnt not applicable to this run
      cache=3
      compression=none
      data_extend=0
      data_source=file
      delete_pct=25
      dictionary=0
      file_type=row-store
      hot_backups=0
      huffman_key=0
      huffman_value=0
      insert_pct=23
      internal_key_truncation=0
      internal_page_max=15
      key_gap=18
      key_max=112
      key_min=11
      leaf_page_max=10
      ops=100000
      prefix=1
      repeat_data_pct=36
      reverse=0
      rows=100000
      runs=100
      split_pct=61
      statistics=1
      threads=16
      value_max=896
      value_min=4
      # wiredtiger_config not applicable to this run
      write_pct=46
      ############################################
      

      And here is a simple workaround: skip overflow reuse tracking during bulk load, since bulk-loaded pages are written exactly once, so their overflow records can never be reused:

      diff --git a/src/btree/rec_write.c b/src/btree/rec_write.c
      --- a/src/btree/rec_write.c
      +++ b/src/btree/rec_write.c
      @@ -30,6 +30,8 @@ typedef struct {
       	uint32_t orig_write_gen;
       	int	 upd_skipped;		/* Skipped a page's update */
       
      +	int	 bulk;			/* Is this a bulk load? */
      +
       	/*
       	 * Raw compression (don't get me started, as if normal reconciliation
       	 * wasn't bad enough).  If an application wants absolute control over
      @@ -1993,6 +1995,7 @@ int
       
       	WT_RET(__rec_write_init(session, page, 0, &cbulk->reconcile));
       	r = cbulk->reconcile;
      +	r->bulk = 1;
       
       	switch (btree->type) {
       	case BTREE_COL_FIX:
      @@ -4481,8 +4484,9 @@ static int
       		WT_ERR(__wt_bt_write(session, tmp, addr, &size, 0, 0));
       
       		/* Track the overflow record. */
      -		WT_ERR(__wt_ovfl_reuse_add(session, page,
      -		    addr, size, kv->buf.data, kv->buf.size));
      +		if (!r->bulk)
      +			WT_ERR(__wt_ovfl_reuse_add(session, page,
      +			    addr, size, kv->buf.data, kv->buf.size));
       	}
       
       	/* Set the callers K/V to reference the overflow record's address. */
      

      Can you please take a look?


    People

      Assignee: Keith Bostic (keith.bostic@mongodb.com) (Inactive)
      Reporter: Michael Cahill (michael.cahill@mongodb.com)
