[SERVER-21767] Index sizes differ significantly when different block compression is specified Created: 04/Dec/15  Updated: 13/Jun/16  Resolved: 12/Jun/16

Status: Closed
Project: Core Server
Component/s: Storage, WiredTiger
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Minor - P4
Reporter: Asya Kamsky Assignee: Keith Bostic (Inactive)
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
related to WT-2671 dump more information about the file ... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Steps To Reproduce:

git clone https://github.com/asya999/iibench-mongodb.git
cd iibench-mongodb
sed -i "s/export NUM_LOADER_THREADS=.*/export NUM_LOADER_THREADS=20/" run.simple.bash
sed -i "s/export MAX_ROWS=.*/export MAX_ROWS=300000000/" run.simple.bash
./run.simple.bash


 Description   

Running

../mongo-latest/bin/mongod --version
db version v3.2.0-rc4-21-g86e7b69
git version: 86e7b69a6c52c926d28a60d816faefa6db81eb96
OpenSSL version: OpenSSL 1.0.1f 6 Jan 2014
allocator: tcmalloc
modules: enterprise
build environment:
    distmod: ubuntu1404
    distarch: x86_64
    target_arch: x86_64


which I believe is equivalent to what ended up being 3.2.0-rc5. I'm observing the following at the end of a run that inserts 200M documents:

When the server is started with the default (snappy) block compression, the stats at the end of the run are approximately:

	"db" : "iibench",
	"collections" : 1,
	"objects" : 200000000,
	"avgObjSize" : 1119,
	"dataSize" : 208.4299921989441,
	"storageSize" : 51.31330490112305,
	"numExtents" : 0,
	"indexes" : 4,
	"indexSize" : 22.347862243652344,
	"ok" : 1

When the same run is done with zlib block compression, the stats are:

	"db" : "iibench",
	"collections" : 1,
	"objects" : 200000000,
	"avgObjSize" : 1119,
	"dataSize" : 208.4299921989441,
	"storageSize" : 21.949535369873047,
	"numExtents" : 0,
	"indexes" : 4,
	"indexSize" : 2.2369346618652344,
	"ok" : 1

On a longer run the difference was only 2x, but I would not have expected any size difference, since we don't apply block compression to indexes.
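
For reference, a minimal sketch of how the numbers above can be collected from the mongo shell. That they came from db.stats() with a GB scale factor is an assumption, but it is consistent with the output shown (208.43 GB of dataSize over 200M objects works out to roughly 1119 bytes of avgObjSize):

> use iibench
> // dataSize, storageSize and indexSize above appear scaled to GB; avgObjSize stays in bytes
> printjson(db.stats(1024 * 1024 * 1024))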



 Comments   
Comment by Keith Bostic (Inactive) [ 12/Jun/16 ]

I've looked at this pretty carefully now, and I don't think there are any unexpected behaviors here.

The short version is that the actual data size in the index-5 file is roughly 4GB, and the data rates for this index file are such that a significant percentage of the file's blocks can be dirtied between checkpoints, which means a real-data-to-file-size ratio of 2-3x isn't unreasonable (and that is what I'm seeing). I've verified the number of blocks being written between and during checkpoints, and verified that in-use blocks appear at the end of the file (which prevents any compaction of the file). There are probably changes we could make to improve the likelihood of the file being compacted (for example, a longer pause between checkpoints makes things much better, as would adjusting when we switch to/from best-fit and first-fit allocation as blocks are chosen from the freelist), but as it stands, this behavior isn't unexpected.
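
One way to see that gap from the mongo shell is to pull the per-index WiredTiger statistics with db.collection.stats() and indexDetails. A sketch, where the index name is a placeholder (substitute one of the names printed by indexSizes) and the block-manager field names are as WiredTiger reports them, so they may vary slightly across versions:

> use iibench
> var s = db.purchases_index.stats({ indexDetails: true })
> printjson(s.indexSizes)                          // per-index totals, in bytes
> // "PRICE_CUSTID_IX" is a placeholder; use one of the names printed above
> var bm = s.indexDetails["PRICE_CUSTID_IX"]["block-manager"]
> // total file size vs. space already free for reuse inside the file
> print(bm["file size in bytes"], bm["file bytes available for reuse"])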

I'm going to go ahead and close this one.

Comment by Alexander Gorrod [ 04/Dec/15 ]

This is interesting, asya. I was able to reproduce this without an oplog.

I don't think there is a problem with the volume of data being inserted. If I check with the standalone WiredTiger utility, the indexes contain the right information.

I've also noticed that if I re-open the files, the sizes shrink down to be similar. I suspect there are timing differences between the zlib and snappy runs that mean we keep a different volume of checkpoint data pinned in the index files, which causes the file size to vary.

It's worth investigating, but we aren't applying block compression to the indexes or storing invalid data, so I don't think this is a high priority.
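
A simple way to confirm the re-open behaviour is to record the stats, cleanly restart the mongod, and read them again. A sketch only; the binary paths and the GB scale factor are assumptions:

$ ./mongo iibench --eval 'printjson(db.stats(1024*1024*1024))'   # sizes at the end of the load
$ ./mongo admin --eval 'db.shutdownServer()'                     # clean shutdown
$ ./mongod --config=./s21767.conf                                # restart and re-open the files
$ ./mongo iibench --eval 'printjson(db.stats(1024*1024*1024))'   # indexSize shrinks back down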

FWIW my process to reproduce is:

$ cat s21767.conf
processManagement:
    fork: true
systemLog:
    destination: file
    path: "/home/alexg/work/mongo/data/oplog-27017.log"
storage:
    dbPath: "/home/alexg/work/mongo/data/db/"
    engine: "wiredTiger"
    wiredTiger:
        collectionConfig:
           blockCompressor: "zlib"
        engineConfig:
           cacheSizeGB: 12
$ rm -rf ./data/db/* && ./mongod --config=./s21767.conf
$ (cd ../iibench-mongodb && sh ./run.simple.bash)

I then can attach to the mongod in a shell and run:

> use iibench
switched to db iibench
> show collections
purchases_index
> db.purchases_index.stats()
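
To narrow down which of the four indexes accounts for the size, the same stats call takes a scale and exposes per-index totals. The index names are whatever iibench created, so list them first:

> db.purchases_index.getIndexes().map(function(i) { return i.name })
> db.purchases_index.stats(1024 * 1024).indexSizes    // per-index sizes in MB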

Comment by Asya Kamsky [ 04/Dec/15 ]

The sizes of the index files in the data directory correspond to what we output in stats - I'm running another test now, but I had checked that the file sizes matched before.

These tests are all with the oplog on, but I saw similar differences without the oplog.

My mongodb config looks like this (zlib example):

processManagement:
    fork: true
systemLog:
    destination: file
    path: "/home/asya/logs/oplog-27017.log"
storage:
    dbPath: "/disk1/db/oplog-27017"
    engine: "wiredTiger"
    wiredTiger:
        collectionConfig:
           blockCompressor: "zlib"
        engineConfig:
           cacheSizeGB: 24
replication:
    oplogSizeMB: 2000
    replSetName: "oplog"

Journal is on a separate disk from data files. Both disks are ext4 and not xfs.

Comment by Alexander Gorrod [ 04/Dec/15 ]

Thanks asya. Could you include the command lines/configuration files you are using for mongod to generate the different results?

It would also be helpful if you can upload a sample output database directory - it could be from a very small run.

Comment by Michael Cahill (Inactive) [ 04/Dec/15 ]

asya, thanks for opening a ticket. Can you include ls -l output for the database directory? I'm curious about whether we're chasing how the index is written, or how "indexSize" is calculated...
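
For reference, the per-index on-disk files are the index-*.wt files under the dbPath, so a listing along these lines is what would show whether the files themselves are large. The path comes from the zlib config above; the numeric parts of the file names are assigned per deployment:

$ cd /disk1/db/oplog-27017       # dbPath from the zlib config above
$ ls -lh index-*.wt              # one index-<N>-<id>.wt file per index
$ ls -lh collection-*.wt         # collection data files, for comparison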
