[DOCS-2961] audit maximum initial sharded collection size Created: 21/Mar/14  Updated: 22/Mar/16  Due: 24/Mar/14  Resolved: 14/Oct/14

Status: Closed
Project: Documentation
Component/s: manual
Affects Version/s: None
Fix Version/s: v1.3.12

Type: Task
Priority: Major - P3
Reporter: Greg Studer
Assignee: Sam Kleinman (Inactive)
Resolution: Done
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: File initial_sharding_limits.js    
Issue Links:
Duplicate
is duplicated by DOCS-3154 Sharding limitation needs a "Contact ... Closed
Related
Participants:
Days since reply: 9 years, 18 weeks ago

 Comments   
Comment by Githook User [ 15/Oct/14 ]

Author: Sam Kleinman (tychoish) <samk@10gen.com>

Message: DOCS-2961: table of max sharded collection size
Branch: master
https://github.com/mongodb/docs/commit/85c8032ca6504b1ca1c646fade818dbff95b1c06

Comment by Randolph Tan [ 25/Mar/14 ]

Update:

I ran the same experiment again, this time without changing the maximum BSON size. Here are the results for the worst case:

Maximum docs before splitVector fails: 16427
This is lower than the earlier figure of approximately 16594. Note that in this run I also changed the key size to 1003, because I found that chunk boundaries are indexed in config.chunks using the format <dbname>.<collname>-<shardkey>_<keyvalue> (for example, a.b-k_0), so the maximum size for the shard key is actually smaller than 1011.
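
To illustrate why the identifier format eats into the space available for the key value, here is a minimal Python sketch that builds a string in the <dbname>.<collname>-<shardkey>_<keyvalue> format described above and measures the fixed prefix. The function names are hypothetical and are not part of MongoDB. The sketch does not try to reproduce the exact 1003 figure, since the full accounting is not given in this comment.

```python
def chunk_id(db: str, coll: str, shard_key: str, key_value: str) -> str:
    """Build an identifier in the <dbname>.<collname>-<shardkey>_<keyvalue> format.

    Hypothetical helper for illustration only; not a MongoDB API.
    """
    return f"{db}.{coll}-{shard_key}_{key_value}"


def prefix_overhead(db: str, coll: str, shard_key: str) -> int:
    """Bytes used by everything in the identifier except the key value."""
    return len(chunk_id(db, coll, shard_key, "").encode("utf-8"))


# The example from the comment: database "a", collection "b", shard key "k".
example = chunk_id("a", "b", "k", "0")
overhead = prefix_overhead("a", "b", "k")

print(example)
print(overhead)
```

Longer database, collection, or shard key names make the prefix longer, which under this format would leave even less room for the key value.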

Here is the result of the shardCollection command when 16428 documents were inserted:

{
	"code" : 13345,
	"ok" : 0,
	"errmsg" : "exception: splitVector command failed: { timeMillis: 36, errmsg: \"exception: BSONObj size: 16793716 (0x1004074) is invalid. Size must be between 0 and 16793600(16MB) First element: 0: { k: \"10000xxxxxxxxxxxxxxxxxxxxx...\", code: 10334, ok: 0.0 }"
}
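
The numbers in that error message can be checked with a little arithmetic. The following Python sketch parses the sizes out of an excerpt of the errmsg above and works out how far the response overshot the limit. The regular expression and variable names are my own. The observation that the limit equals 16 MiB plus 16 KiB is plain arithmetic on the reported value, not a statement about how the server derives it.

```python
import re

# Excerpt copied from the errmsg above.
errmsg = (
    "BSONObj size: 16793716 (0x1004074) is invalid. "
    "Size must be between 0 and 16793600(16MB)"
)

match = re.search(
    r"BSONObj size: (\d+) \((0x[0-9A-Fa-f]+)\).*between 0 and (\d+)", errmsg
)
size = int(match.group(1))          # reported object size in bytes
size_hex = int(match.group(2), 16)  # the same size, as printed in hex
limit = int(match.group(3))         # upper bound quoted in the message

overshoot = size - limit                    # bytes over the limit
extra_over_16mib = limit - 16 * 1024 * 1024  # how much the limit exceeds 16 MiB

print(size == size_hex)
print(overshoot)
print(extra_over_16mib)
```

In other words, the splitVector response in this run was only slightly larger than the allowed size, which fits with 16427 documents succeeding and 16428 failing.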

I also printed the dbStats and collStats output right before issuing the shardCollection command:

dbStats: {
	"raw" : {
		"localhost:30000" : {
			"db" : "a",
			"collections" : 3,
			"objects" : 16433,
			"avgObjSize" : 1048177.176169902,
			"dataSize" : 17224695536,
			"storageSize" : 18562846640,
			"numExtents" : 25,
			"indexes" : 2,
			"indexSize" : 22933680,
			"fileSize" : 21398290432,
			"nsSizeMB" : 16,
			"dataFileVersion" : {
				"major" : 4,
				"minor" : 5
			},
			"extentFreeList" : {
				"num" : 0,
				"totalSize" : 0
			},
			"ok" : 1
		}
	},
	"objects" : 16433,
	"avgObjSize" : 1048177,
	"dataSize" : 17224695536,
	"storageSize" : 18562846640,
	"numExtents" : 25,
	"indexes" : 2,
	"indexSize" : 22933680,
	"fileSize" : 21398290432,
	"extentFreeList" : {
		"num" : 0,
		"totalSize" : 0
	},
	"ok" : 1
}
collStats: {
	"sharded" : false,
	"primary" : "shard0000",
	"ns" : "a.b",
	"count" : 16427,
	"size" : 17224695120,
	"avgObjSize" : 1048560,
	"storageSize" : 18562830256,
	"numExtents" : 23,
	"nindexes" : 2,
	"lastExtentSize" : 2146426864,
	"paddingFactor" : 1,
	"systemFlags" : 1,
	"userFlags" : 1,
	"totalIndexSize" : 22933680,
	"indexSizes" : {
		"_id_" : 539616,
		"k_1" : 22394064
	},
	"ok" : 1
}
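
As a sanity check on the stats above, this short Python sketch confirms that the reported figures are internally consistent. All values are copied from the dbStats and collStats output; the variable names are my own, and treating the difference between the two outputs as "other objects in the database" is an interpretation on my part.

```python
# Values copied from the collStats output for a.b.
coll_count = 16427
coll_size = 17224695120
coll_avg_obj_size = 1048560

# Values copied from the top-level dbStats output for database "a".
db_objects = 16433
db_data_size = 17224695536

# collStats: size should equal count * avgObjSize.
coll_consistent = coll_count * coll_avg_obj_size == coll_size

# Objects and bytes counted by dbStats but not by collStats for a.b.
other_objects = db_objects - coll_count
other_bytes = db_data_size - coll_size

# Collection data size expressed in GiB, for comparison with the limits table.
coll_size_gib = coll_size / 2**30

print(coll_consistent)
print(other_objects, other_bytes)
print(round(coll_size_gib, 2))
```

The average object size of 1048560 bytes is 16 bytes short of 1 MiB, so nearly all of the data size comes from the large test documents themselves.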

I have also attached the script for anyone who wants to explore this further by tweaking the document size and key size.

Comment by Sam Kleinman (Inactive) [ 21/Mar/14 ]

http://docs.mongodb.org/manual/reference/limits/#Sharding-Existing-Collection-Data-Size
