WiredTiger / WT-1007

Pathological behavior with raw compression

    • Type: Task
    • Resolution: Done
    • Fix Version/s: WT2.2
    • Affects Version/s: None
    • Component/s: None

      @keithbostic, the test/format config below shows some pathological behavior with zlib raw compression. There are a few interconnected issues:

      • this config has 512 byte pages, with overflow keys and values;
      • the page header takes 64 bytes and zlib's header ~12, so there are only 436 bytes available on the page for a compressed image;
      • the first slot the btree layer tells raw compression about is at 449 bytes;
      • if raw compression doesn't take any bytes, it is called again with more, even if it doesn't return EAGAIN (this changed in e737009);
      • eviction here is being triggered by a page growing past 5MB, and after many failed calls, the whole 5MB page is eventually written. Next time it is read in and modified, forced eviction is triggered and the whole thing starts again.

      I'd suggest two changes. The first is to offer some slots before the first allocation size (see patch below). The second is to repeat the call to raw compression only if EAGAIN is returned, but that apparently causes problems figuring out the next key for row-store splits.

      Here is the CONFIG:

      ############################################
      #  RUN PARAMETERS
      ############################################
      auto_throttle=1
      firstfit=0
      # bitcnt not applicable to this run
      bloom=1
      bloom_bit_count=47
      bloom_hash_count=26
      bloom_oldest=0
      cache=56
      checkpoints=1
      checksum=uncompressed
      chunk_size=4
      compaction=0
      compression=zlib
      data_extend=0
      data_source=file
      delete_pct=13
      dictionary=0
      file_type=row-store
      hot_backups=0
      huffman_key=0
      huffman_value=0
      insert_pct=32
      internal_key_truncation=1
      internal_page_max=9
      key_gap=16
      key_max=256
      key_min=256
      leaf_page_max=9
      logging=0
      merge_max=7
      merge_threads=3
      mmap=1
      ops=100000
      prefix_compression=1
      prefix_compression_min=7
      repeat_data_pct=90
      reverse=0
      rows=100000
      runs=1
      split_pct=72
      statistics=1
      threads=1
      value_max=1699
      value_min=256
      # wiredtiger_config not applicable to this run
      write_pct=43
      ############################################
      

      And here is half of the proposed change:

      diff --git a/src/btree/rec_write.c b/src/btree/rec_write.c
      --- a/src/btree/rec_write.c
      +++ b/src/btree/rec_write.c
      @@ -1945,10 +1945,10 @@ static int
       		 * We can't compress the first 64B of the block (it must be
       		 * written without compression), and a possible split point
       		 * may appear in that 64B; keep it simple, ignore the first
      -		 * allocation size of data, anybody splitting smaller than
      +		 * half allocation size of data, anybody splitting smaller than
       		 * that (as calculated before compression), is doing it wrong.
       		 */
      -		if ((len = WT_PTRDIFF(cell, dsk)) > btree->allocsize)
      +		if ((len = WT_PTRDIFF(cell, dsk)) > btree->allocsize / 2)
       			r->raw_offsets[++slots] =
       			    WT_STORE_SIZE(len - WT_BLOCK_COMPRESS_SKIP);
       
      @@ -1959,10 +1959,11 @@ static int
       	}
       
       	/*
      -	 * If we haven't managed to find at least one split point, we're done,
      -	 * don't bother calling the underlying compression function.
      +	 * If we haven't managed to find at least one split point, or all of
      +	 * the rows fit into a single block, we're done, don't bother calling
      +	 * the underlying compression function.
       	 */
      -	if (slots == 0) {
      +	if (slots == 0 || len <= btree->allocsize) {
       		result_slots = 0;
       		goto no_slots;
       	}
      

            Assignee:
            keith.bostic@mongodb.com Keith Bostic (Inactive)
            Reporter:
            michael.cahill@mongodb.com Michael Cahill (Inactive)