Description
@keithbostic, the test/format config below shows some pathological behavior with zlib raw compression. There are a few interconnected issues:
- this config has 512-byte pages, with overflow keys and values;
- the page header takes 64 bytes and zlib's header roughly 12, so only 436 bytes are available on the page for a compressed image (see the arithmetic sketch after this list);
- the first slot the btree layer tells raw compression about is at 449 bytes, which can never fit in that budget;
- if raw compression doesn't take any bytes, it is called again with more data, even if it didn't return EAGAIN (this behavior changed in e737009);
- eviction here is triggered by a page growing past 5MB, and after many failed calls the whole 5MB page is eventually written. The next time it is read in and modified, forced eviction kicks in and the whole cycle starts again.
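To make the numbers concrete, here is a tiny standalone sketch of that space budget. The 64-byte skip matches the WT_BLOCK_COMPRESS_SKIP usage in the patch below; the ~12-byte zlib header is an estimate on my part, not an exact constant:

#include <stdio.h>

/*
 * Sketch of the space budget described above: a 512-byte page, minus the
 * 64 bytes written without compression, minus an estimated ~12 bytes of
 * zlib stream overhead, leaves ~436 bytes for the compressed image, while
 * the first split slot offered to raw compression is at 449 bytes.
 */
int
main(void)
{
	int page_size = 512;		/* leaf page size in this config */
	int compress_skip = 64;		/* uncompressed first 64B of the block */
	int zlib_overhead = 12;		/* rough zlib header estimate */
	int first_slot = 449;		/* first slot offered to raw compression */
	int budget = page_size - compress_skip - zlib_overhead;

	printf("budget = %d bytes, first slot = %d bytes, fits = %s\n",
	    budget, first_slot, first_slot <= budget ? "yes" : "no");
	return (0);
}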
I'd suggest two changes. One is to include some slots before the first allocation size (see the patch below). The other is to only repeat the call to raw compression if EAGAIN is returned, but that apparently causes problems figuring out the next key for row-store splits.
Here is the CONFIG:
############################################
# RUN PARAMETERS
############################################
auto_throttle=1
firstfit=0
# bitcnt not applicable to this run
bloom=1
bloom_bit_count=47
bloom_hash_count=26
bloom_oldest=0
cache=56
checkpoints=1
checksum=uncompressed
chunk_size=4
compaction=0
compression=zlib
data_extend=0
data_source=file
delete_pct=13
dictionary=0
file_type=row-store
hot_backups=0
huffman_key=0
huffman_value=0
insert_pct=32
internal_key_truncation=1
internal_page_max=9
key_gap=16
key_max=256
key_min=256
leaf_page_max=9
logging=0
merge_max=7
merge_threads=3
mmap=1
ops=100000
prefix_compression=1
prefix_compression_min=7
repeat_data_pct=90
reverse=0
rows=100000
runs=1
split_pct=72
statistics=1
threads=1
value_max=1699
value_min=256
# wiredtiger_config not applicable to this run
write_pct=43
############################################
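For what it's worth, the page-size lines in that config are exponents. Assuming I'm reading test/format's convention correctly (an assumption on my part, consistent with the 512-byte pages mentioned above), the translation looks like this:

#include <stdio.h>

/*
 * Assumption: test/format treats internal_page_max/leaf_page_max as the
 * log2 of the page size in bytes, so the "9" above means 512-byte pages.
 */
int
main(void)
{
	int leaf_page_max = 9, internal_page_max = 9;

	printf("leaf_page_max=%d -> %d-byte leaf pages\n",
	    leaf_page_max, 1 << leaf_page_max);
	printf("internal_page_max=%d -> %d-byte internal pages\n",
	    internal_page_max, 1 << internal_page_max);
	return (0);
}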
|
And here is half of the proposed change:
diff --git a/src/btree/rec_write.c b/src/btree/rec_write.c
--- a/src/btree/rec_write.c
+++ b/src/btree/rec_write.c
@@ -1945,10 +1945,10 @@ static int
 		 * We can't compress the first 64B of the block (it must be
 		 * written without compression), and a possible split point
 		 * may appear in that 64B; keep it simple, ignore the first
-		 * allocation size of data, anybody splitting smaller than
+		 * half allocation size of data, anybody splitting smaller than
 		 * that (as calculated before compression), is doing it wrong.
 		 */
-		if ((len = WT_PTRDIFF(cell, dsk)) > btree->allocsize)
+		if ((len = WT_PTRDIFF(cell, dsk)) > btree->allocsize / 2)
 			r->raw_offsets[++slots] =
 			    WT_STORE_SIZE(len - WT_BLOCK_COMPRESS_SKIP);

@@ -1959,10 +1959,11 @@ static int
 	}

 	/*
-	 * If we haven't managed to find at least one split point, we're done,
-	 * don't bother calling the underlying compression function.
+	 * If we haven't managed to find at least one split point, or all of
+	 * the rows fit into a single block, we're done, don't bother calling
+	 * the underlying compression function.
 	 */
-	if (slots == 0) {
+	if (slots == 0 || len <= btree->allocsize) {
 		result_slots = 0;
 		goto no_slots;
 	}
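And for completeness, a rough standalone sketch of the other half of the suggestion, i.e. only repeating the call when raw compression returns EAGAIN. None of these names are the real rec_write.c code or the WT_COMPRESSOR API; raw_compress_stub just stands in for the compressor callback so the control flow compiles on its own:

#include <errno.h>
#include <stdio.h>

/*
 * Stand-in for the compressor's raw-compression callback: it either takes
 * some of the offered slots (returning 0), or returns EAGAIN to ask the
 * caller for more input.  Purely illustrative.
 */
static int
raw_compress_stub(int input_slots, int *result_slotsp)
{
	if (input_slots < 4)		/* pretend 4 slots is the minimum */
		return (EAGAIN);
	*result_slotsp = input_slots - 1;
	return (0);
}

int
main(void)
{
	int result_slots, ret, slots;

	for (slots = 1;; ++slots) {
		result_slots = 0;
		ret = raw_compress_stub(slots, &result_slots);
		if (ret == EAGAIN)	/* only EAGAIN triggers a retry */
			continue;
		if (ret != 0)		/* any other failure is an error */
			return (ret);
		break;			/* success is not retried, even if
					 * zero slots were taken */
	}
	printf("offered %d slots, compressor took %d\n", slots, result_slots);
	return (0);
}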