[SERVER-7345] Performance regression related to IO in v2.2 Created: 13/Oct/12  Updated: 10/Dec/14  Resolved: 16/Jan/14

Status: Closed
Project: Core Server
Component/s: Performance, Storage
Affects Version/s: 2.2.0
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Ben Becker Assignee: Ben Becker
Resolution: Cannot Reproduce Votes: 1
Labels: performance
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Linux VM, 768MB of memory
Single mongod instance


Attachments: generate_random.js, profile_768mb_insert_281mb.pdf (PDF), profile_inserts_slow_200mb_cache_v20.pdf (PDF)
Operating System: Linux
Participants:

 Description   

The attached test script completes in ~15 minutes on mongod v2.2 and in ~11 minutes on v2.0. The script itself can become the bottleneck, so reproducing the regression may require launching multiple instances concurrently (simply comment out line 93 of the script).

The test machine has 768MB of memory, of which ~300MB is utilized for the fs cache in both runs.

Example Document:

> db.dataset.findOne()
{
	"_id" : ObjectId("5062a296e75c81672a1570b9"),
	"strField" : "9299017299123617958285718174127162485897332625276264340605403374852726304759655770059541029076721100",
	"nestedDoc" : {
		"a" : 1,
		"b" : 2,
		"c" : 3
	},
	"simpleArray" : [
		"3635064237454641432629410894264391352434895832887689305",
		"99313732974949308656883093002705731806589631279783927524541601897163947995768031270534611514086295889389716258260029207957719673376767857540696625853790210607523268303277996566246106065697713472168809865745090998788927444846988466485369444659935485219233590341100928818283847040957859628079097614797558944507836526869287069518221286913741352747181248696434950235270",
		"33366447563008305462240784490011221734403555470483237044237519855728467574384618183775842796507",
		"898777213113437172813791097327200023029247316887967762531928278944502049608534003310647471002417720772649471170889260879",
		"621244669818545290736580742613211007044590138",
		"1265870384476847373209149971428361615530",
		"74257095004955059015219224181712845",
		"28012707681212521389",
		"1863970997",
		"735181210078119",
		"8442989048",
		"89239",
		"88199",
		"37455",
		"28360",
		"31810"
	],
	"counter" : 0
}
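For readers without the attachment, here is a minimal sketch of a generator that produces documents shaped like the example above. This is an illustration only, not the attached generate_random.js: the field names are taken from the example document, but the array size and the digit-string length distribution are assumptions.

```javascript
// Build a random digit string of length n (the example fields are all
// long strings of decimal digits).
function randomDigits(n) {
  let s = "";
  for (let i = 0; i < n; i++) s += Math.floor(Math.random() * 10);
  return s;
}

// Build one document matching the shape shown in db.dataset.findOne():
// strField (100 digits), nestedDoc {a,b,c}, simpleArray of digit strings
// of widely varying length, and a counter field.
function makeDoc(counter) {
  const arr = [];
  for (let i = 0; i < 16; i++) {
    // lengths in the example range from 5 to a few hundred digits (assumed)
    arr.push(randomDigits(5 + Math.floor(Math.random() * 365)));
  }
  return {
    strField: randomDigits(100),
    nestedDoc: { a: 1, b: 2, c: 3 },
    simpleArray: arr,
    counter: counter
  };
}

// In the mongo shell, the test would insert these in a loop, e.g.:
//   for (let i = 0; i < 250000; i++) db.dataset.insert(makeDoc(0));
console.log(JSON.stringify(makeDoc(0)).length);
```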

v2.2

This test starts out at ~900 insert ops per second; on a VM with ~300MB of cache available, the insert rate drops to ~50 inserts per second within 15 minutes. At this point, the total document size is 281MB (full stats posted below). The gperf call graph of this run is attached (profile_768mb_insert_281mb.pdf).

iostat shows a readMB-to-writeMB ratio of ~4:1 by the end of this run. Responsiveness of mongod is very poor once it starts paging in data, with several 20-30 second gaps between serverStatus results in mongostat.

Stats:

> db.dataset.stats()
{
	"ns" : "test.dataset",
	"count" : 236742,
	"size" : 281357924,
	"avgObjSize" : 1188.4580006927372,
	"storageSize" : 335896576,
	"numExtents" : 14,
	"nindexes" : 3,
	"lastExtentSize" : 92581888,
	"paddingFactor" : 1,
	"systemFlags" : 1,
	"userFlags" : 0,
	"totalIndexSize" : 442256192,
	"indexSizes" : {
		"_id_" : 7693616,
		"strField_1" : 40896352,
		"simpleArray_1" : 393666224
	},
	"ok" : 1
}
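As a quick sanity check on the stats above, avgObjSize is simply size / count:

```javascript
// Verify the reported avgObjSize from the v2.2 stats output above.
const size = 281357924;   // bytes ("size" field)
const count = 236742;     // documents ("count" field)
console.log(size / count); // matches "avgObjSize" : 1188.4580006927372
```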

v2.0

Responsiveness is much better with v2.0. Below are the v2.0 stats (everything is nearly identical). I've also attached the gperf call graph for comparison (profile_inserts_slow_200mb_cache_v20.pdf).

Stats:

{
	"ns" : "test.dataset",
	"count" : 236742,
	"size" : 281328304,
	"avgObjSize" : 1188.3328855885309,
	"storageSize" : 301076480,
	"numExtents" : 15,
	"nindexes" : 3,
	"lastExtentSize" : 68579328,
	"paddingFactor" : 1,
	"flags" : 1,
	"totalIndexSize" : 441250544,
	"indexSizes" : {
		"_id_" : 7693616,
		"strField_1" : 41035344,
		"simpleArray_1" : 392521584
	},
	"ok" : 1
}


 Comments   
Comment by Ben Becker [ 16/Jan/14 ]

Retested on a Debian 7 VM running in VMware Fusion 6, but cannot reproduce when comparing v2.0.9 to v2.2.6. v2.2.6 is consistently ~10% faster than v2.0.9 (stddev is ~2%), and v2.4.9 is on par with v2.2.6.

Comment by Ben Becker [ 15/Oct/12 ]

@Tad, it's the same script, but I've modified a few things since the initial run. Specifically, 'sh.test' was used for a sharded version of the same test, and the lower count was simply due to lack of patience. Sorry for the confusion.

Thanks for running on Windows. My focus was on a memory-constrained system, but perhaps IO is unrelated to this regression.

Comment by Tad Marshall [ 14/Oct/12 ]

For comparison, I tried running the generate_random.js script on Windows.

The script doesn't match the findOne() or stats() output above. It writes to namespace sh.test, not test.dataset, and it writes 300,000 documents, not 236,742, all much smaller than the examples above. It creates 4 indexes, not 3.

> db.test.findOne()
{
        "_id" : ObjectId("507aa410b05c8c0648033a2f"),
        "id" : "13414139605203486653734199766846645089044810441488",
        "nestedDoc" : {
                "a" : 1,
                "b" : 2,
                "c" : 3
        },
        "simpleArray" : [
                "90293",
                "31096",
                "86555"
        ],
        "counter" : 0
}
> db.test.stats()
{
        "ns" : "sh.test",
        "count" : 300000,
        "size" : 62147604,
        "avgObjSize" : 207.15868,
        "storageSize" : 86114304,
        "numExtents" : 11,
        "nindexes" : 4,
        "lastExtentSize" : 20881408,
        "paddingFactor" : 1,
        "flags" : 1,
        "totalIndexSize" : 78105328,
        "indexSizes" : {
                "_id_" : 9745792,
                "strField_1" : 5706848,
                "simpleArray_1" : 32434192,
                "id_1" : 30218496
        },
        "ok" : 1
}
>

Ben, can you attach the script you used to generate this output?

Comparing 2.0 with 2.2 on Windows 7, working in an empty directory on a RAID 0 partition (64 KB cluster size, 128 KB stripe size) with a 3.3 GHz six-core CPU and 12 GB RAM, and running this (different) script: 2.0 completed in 58.795 seconds and 2.2 took 69.575 seconds, so a bit over 18% slower.
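The "a bit over 18%" figure follows directly from the two wall-clock times:

```javascript
// Relative slowdown of 2.2 vs 2.0 from the timings reported above.
const t20 = 58.795; // seconds, v2.0
const t22 = 69.575; // seconds, v2.2
console.log(((t22 - t20) / t20 * 100).toFixed(1) + "% slower");
```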

Generated at Thu Feb 08 03:14:16 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.