[SERVER-11441] parameter maxSize is either useless or does not work Created: 29/Oct/13  Updated: 12/Jun/14  Resolved: 12/Jun/14

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: 2.4.6
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Kay Agahd Assignee: Unassigned
Resolution: Duplicate Votes: 3
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Linux 64-bit


Issue Links:
Duplicate
duplicates SERVER-2246 Pay (more) attention to sharding maxSize Closed
Operating System: ALL
Steps To Reproduce:

use admin
db.runCommand( { addshard : "bar/localhost:20010"} );
db.runCommand( { addshard : "foo/localhost:20011", maxSize:500 } );
db.runCommand( { addshard : "baz/localhost:20012", maxSize:1000 } );
use config
db.settings.save( { _id:"chunksize", value: 16 } )//set from 64 MB to 16 MB to see early distribution of chunks
use test
 
mongos> db.dropDatabase()
{ "dropped" : "test", "ok" : 1 }
mongos> use admin
switched to db admin
mongos> db.runCommand( { enablesharding : "test" } );
{ "ok" : 1 }
mongos> db.runCommand( { shardcollection : "test.test", key : { _id : 1 } } )
{ "collectionsharded" : "test.test", "ok" : 1 }
mongos> use test
switched to db test
mongos> db.test.ensureIndex({_id:"hashed"});//use hashed shard key to guarantee optimal distribution
mongos> for(i=0;i<1024*3000;i++){//insert at least 3000 MB
...   db.test.insert({count:i, textfield:"round about 1024 Bytes of blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah"});
...   if(i%(1024*10)==0) {
...     var s = db.test.stats().shards;
...     print((s.bar!=null?s.bar.storageSize:0) + ";" + (s.foo!=null?s.foo.storageSize:0) + ";" + (s.baz!=null?s.baz.storageSize:0));
...   }
... }
8192;0;0
696320;11182080;11182080
...some lines cut out...
1580060672;2140508160;857399296
//as one can see, shard foo has over 2 GB storageSize even though its maxSize was set to 500 MB!
//collection stats says the same, even after 1 day, so the balancer did nothing:
mongos> db.test.stats()
{
	"sharded" : true,
	"ns" : "test.test",
	"count" : 3072000,
	"numExtents" : 56,
	"size" : 3489791712,
	"storageSize" : 4577968128,
	"totalIndexSize" : 256366656,
	"indexSizes" : {
		"_id_" : 109754624,
		"_id_hashed" : 146612032
	},
	"avgObjSize" : 1135.99990625,
	"nindexes" : 2,
	"nchunks" : 198,
	"shards" : {
		"bar" : {
			"ns" : "test.test",
			"count" : 1174328,
			"size" : 1334036464,
			"avgObjSize" : 1135.9998773766783,
			"storageSize" : 1580060672,
			"numExtents" : 19,
			"nindexes" : 2,
			"lastExtentSize" : 415145984,
			"paddingFactor" : 1,
			"systemFlags" : 1,
			"userFlags" : 0,
			"totalIndexSize" : 103311936,
			"indexSizes" : {
				"_id_" : 47281808,
				"_id_hashed" : 56030128
			},
			"ok" : 1
		},
		"baz" : {
			"ns" : "test.test",
			"count" : 521492,
			"size" : 592414912,
			"avgObjSize" : 1136,
			"storageSize" : 857399296,
			"numExtents" : 17,
			"nindexes" : 2,
			"lastExtentSize" : 227786752,
			"paddingFactor" : 1,
			"systemFlags" : 1,
			"userFlags" : 0,
			"totalIndexSize" : 43079344,
			"indexSizes" : {
				"_id_" : 16940672,
				"_id_hashed" : 26138672
			},
			"ok" : 1
		},
		"foo" : {
			"ns" : "test.test",
			"count" : 1376180,
			"size" : 1563340336,
			"avgObjSize" : 1135.9998953625252,
			"storageSize" : 2140508160,
			"numExtents" : 20,
			"nindexes" : 2,
			"lastExtentSize" : 560447488,
			"paddingFactor" : 1,
			"systemFlags" : 1,
			"userFlags" : 0,
			"totalIndexSize" : 109975376,
			"indexSizes" : {
				"_id_" : 45532144,
				"_id_hashed" : 64443232
			},
			"ok" : 1
		}
	},
	"ok" : 1
}
mongos> sh.status()
--- Sharding Status --- 
  sharding version: {
	"_id" : 1,
	"version" : 3,
	"minCompatibleVersion" : 3,
	"currentVersion" : 4,
	"clusterId" : ObjectId("526e28e343e8fadadc7f5450")
}
  shards:
	{  "_id" : "bar",  "host" : "bar/localhost:20010" }
	{  "_id" : "baz",  "host" : "baz/localhost:20012",  "maxSize" : NumberLong(1000) }
	{  "_id" : "foo",  "host" : "foo/localhost:20011",  "maxSize" : NumberLong(500) }
  databases:
	{  "_id" : "admin",  "partitioned" : false,  "primary" : "config" }
	{  "_id" : "test",  "partitioned" : true,  "primary" : "bar" }
		test.test
			shard key: { "_id" : 1 }
			chunks:
				bar	87
				baz	34
				foo	77
			too many chunks to print, use verbose if you want to force print
 

Participants:

 Description   

We added a new shard which has less disk space than the other shard. To avoid running out of disk space, we added the new shard with the maxSize parameter set. However, the shard got overloaded anyway.

The MongoDB docs (http://docs.mongodb.org/manual/tutorial/configure-sharded-cluster-balancer/#change-the-maximum-storage-size-for-a-given-shard) state:

"The maxSize field [ . . . ] sets the maximum size for a shard, allowing you to control whether the balancer will migrate chunks to a shard. If mapped size is above a shard’s maxSize, the balancer will not move chunks to the shard. Also, the balancer will not move chunks off an overloaded shard. This must happen manually. The maxSize value only affects the balancer’s selection of destination shards."

So, what is maxSize good for, if it only affects the balancer's selection of destination shards?
Does it really mean that new documents will still be inserted even when the shard has already reached its maxSize? And once they are inserted, one can only hope that the balancer moves them to another shard before the shard gets overloaded, because once it is overloaded the balancer will not even move chunks off it?

Wouldn't it be better to take maxSize into account before inserting new data, so the balancer wouldn't even need to move it back? Moreover, the shard would really never reach its maxSize (as long as there are still other shards with remaining hard disk space).
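
Since the balancer will not move chunks off the overloaded shard on its own, one possible manual workaround (a sketch only, assuming a chunk owned by the overloaded shard is picked from config.chunks) is:

use config
// pick any chunk currently owned by the overloaded shard "foo"
var chunk = db.chunks.findOne({ ns: "test.test", shard: "foo" })
use admin
// ask mongos to move that chunk to a shard that still has room
db.runCommand({ moveChunk: "test.test", find: chunk.min, to: "baz" })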



 Comments   
Comment by Greg Studer [ 12/Jun/14 ]

The parameter is poorly named - it is really meant for situations, like adding new shards, where you want to disable balancing to certain shards. The balancer is not yet adaptive - https://jira.mongodb.org/browse/SERVER-9477?jql=labels%20%3D%20balancingStrategy.
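
A sketch of that intended pattern, assuming maxSize can later be cleared directly in config.shards and that a missing maxSize is treated as unlimited (as for shards added without one):

use admin
// cap the new shard so the balancer does not immediately fill it
db.runCommand({ addshard: "foo/localhost:20011", maxSize: 500 })
use config
// later, once the shard should take its full share of chunks, remove the cap
db.shards.update({ _id: "foo" }, { $unset: { maxSize: 1 } })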

> Moreover, the shard would really never reach its maxSize (as long as there are still other shards with remaining hard disk space).
It's not possible to store a document on just any shard - only on the particular shard on which the correct chunk exists.
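
The chunk ownership that determines where a document can live is recorded in config.chunks and can be inspected from a mongos shell, e.g.:

use config
// each chunk document records its key range and the shard that currently owns it
db.chunks.find({ ns: "test.test" }, { min: 1, max: 1, shard: 1 }).sort({ min: 1 })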
