[SERVER-9945] Hard coded MaxObjectPerChunk limit Created: 17/Jun/13  Updated: 10/Dec/14  Resolved: 26/Aug/13

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Thomas Adam Assignee: Unassigned
Resolution: Duplicate Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
Operating System: ALL
Participants:

 Description   

Hi,

Why is MaxObjectPerChunk hard-coded at https://github.com/mongodb/mongo/blob/master/src/mongo/s/chunk.cpp#L54?
We want to move a chunk to another shard, but the moveChunk command is aborted with "chunk too big to move".

I found that at https://github.com/mongodb/mongo/blob/master/src/mongo/s/d_migrate.cpp#L408 the calculated maxRecsWhenFull is reset to the hard-coded MaxObjectPerChunk limit if it is greater.

Our configured chunk size is 64 MB and the chunk is ~32 MB, so it should move without problems.
Why does this limit exist?

I added some values to the failure response to validate this; here is the output:

{
	"cause" : {
		"chunkTooBig" : true,
		"estimatedChunkSize" : 34355880,
		"recCount" : 477165,
		"avgRecSize" : 72,
		"maxRecsWhenFull" : 250001,
		"maxChunkSize" : 67108864,
		"totalRecs" : 47328811,
		"Chunk::MaxObjectPerChunk" : 250000,
		"ok" : 0,
		"errmsg" : "chunk too big to move"
	},
	"ok" : 0,
	"errmsg" : "move failed"
}

So a chunk with more than 250000 entries can't be moved? IMHO that is not expected, is it? We think this is why our cluster is not well balanced. We have many abort entries in config.changelog.
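
The numbers in the response above can be reproduced with a short calculation. This is a minimal sketch in plain JavaScript, not the server's code: the helper name `estimateMaxRecsWhenFull` is hypothetical, the `+ 1` on the cap is inferred from the reported `maxRecsWhenFull` of 250001, the final comparison is assumed, and any slack factor the server may apply before capping is ignored.

```javascript
// Hypothetical helper modelling the cap described in this ticket.
// Assumption: the record limit is maxChunkSize / avgRecSize, capped at
// MaxObjectPerChunk + 1 (inferred from the reported value 250001).
const MAX_OBJECT_PER_CHUNK = 250000;

function estimateMaxRecsWhenFull(maxChunkSize, avgRecSize) {
  const uncapped = Math.floor(maxChunkSize / avgRecSize);
  const capped = Math.min(MAX_OBJECT_PER_CHUNK + 1, uncapped);
  return { uncapped, capped };
}

// Values taken from the failed moveChunk response.
const limits = estimateMaxRecsWhenFull(67108864, 72);
const recCount = 477165;

console.log(limits.uncapped);          // record limit from the byte size alone
console.log(limits.capped);            // record limit after the hard-coded cap
console.log(recCount > limits.capped); // assumed check behind "chunk too big to move"
```

With 72-byte records, the byte-based limit alone would allow roughly 932,000 records, so under these assumptions the chunk of 477,165 records is rejected only because of the count cap.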

Any thoughts?

Thanks & Regards
Thomas



 Comments   
Comment by David Hows [ 26/Aug/13 ]

Hi Thomas,

I'm marking this ticket as resolved as this is now being dealt with on your private ticket.

The solution to this issue looks to be the changes being made as part of SERVER-8598 and SERVER-8869.

Regards,
David

Comment by Johan Hedin [ 18/Jun/13 ]

This sounds quite related to SERVER-9365.

A chunk that is "too big to move" will be "force split", but the bug reported in SERVER-9365 prevents mongo from splitting the chunk correctly.

Same root cause as in this one: small document size relative to the max chunk size.
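
To make "small document size" concrete, the sketch below estimates the average document size under which an object-count cap is reached before the byte-size limit. The helper name `avgSizeWhereCountCapBinds` is hypothetical; the figures (64 MB chunk size, cap of 250000 objects, 72-byte average record) are the ones reported in this ticket, and any slack factor the server may apply is ignored.

```javascript
// Hypothetical helper: average record size (in bytes) below which a count
// cap is reached before the byte-size limit. Ignores any slack factor.
function avgSizeWhereCountCapBinds(maxChunkSizeBytes, maxObjectsPerChunk) {
  return maxChunkSizeBytes / maxObjectsPerChunk;
}

const thresholdBytes = avgSizeWhereCountCapBinds(67108864, 250000);
const reportedAvgRecSize = 72; // avgRecSize from the moveChunk response

console.log(thresholdBytes.toFixed(1));           // crossover size in bytes
console.log(reportedAvgRecSize < thresholdBytes); // true: the count cap binds first
```

For these figures the crossover is about 268 bytes per document, well above the 72 bytes reported here.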

I posted some comments about this in the discussion on the mailing list https://groups.google.com/forum/?fromgroups#!topic/mongodb-user/leyazB-1zec

Comment by Thomas Adam [ 17/Jun/13 ]

Ok, I have created the ticket SUPPORT-608. Thanks

Comment by Scott Hernandez (Inactive) [ 17/Jun/13 ]

Yes, please create a Community Private issue (private to the submitter and support), upload the logs there and then we can link them together without exposing your private information here.

Comment by Thomas Adam [ 17/Jun/13 ]

Short question: can we make the ticket private because of our internal logs? Thanks

Comment by Thomas Adam [ 17/Jun/13 ]

We use 2.4.4.

We use a hashed shard key on a MongoID. I ran the following command:

db.runCommand({moveChunk:"lovoo.visits.in", bounds:[{ "u" : NumberLong("-8154696824717988795") }, { "u" : NumberLong("-8070450532247928776") }], to: "lovoo3" });

I can split the given chunk, but this is not what I want. I have split the chunk before because of the same message, and one of the new chunks (the one above) has the same problem!

The logs print only the same information that the response gives me; the primary that should move the chunk does not log anything about it.

Log of mongos:

Mon Jun 17 14:57:14.564 [conn50] CMD: movechunk: { moveChunk: "lovoo.visits.in", bounds: [ { u: -8154696824717988795 }, { u: -8070450532247928776 } ], to: "lovoo3" }
Mon Jun 17 14:57:14.564 [conn50] moving chunk ns: lovoo.visits.in moving ( ns:lovoo.visits.inshard: lovoo:lovoo/10.2.0.204:27017,10.2.0.205:27017,10.2.0.206:27017lastmod: 66|1||000000000000000000000000min: { u: -8154696824717988795 }max: { u: -8070450532247928776 }) lovoo:lovoo/10.2.0.204:27017,10.2.0.205:27017,10.2.0.206:27017 -> lovoo3:lovoo3/10.2.0.201:27017,10.2.0.202:27017,10.2.0.203:27017
Mon Jun 17 14:57:16.855 [LockPinger] cluster 10.2.0.215:27019,10.2.0.216:27019,10.2.0.217:27019 pinged successfully at Mon Jun 17 14:57:16 2013 by distributed lock pinger '10.2.0.215:27019,10.2.0.216:27019,10.2.0.217:27019/dev.lovoo.net:27018:1371466593:1804289383', sleeping for 30000ms
Mon Jun 17 14:57:18.048 [conn50] moveChunk result: { chunkTooBig: true, estimatedChunkSize: 34451208, recCount: 478489, avgRecSize: 72, maxRecsWhenFull: 250001, maxChunkSize: 67108864, totalRecs: 47474227, Chunk::MaxObjectPerChunk: 250000, ok: 0.0, errmsg: "chunk too big to move" }

Comment by Scott Hernandez (Inactive) [ 17/Jun/13 ]

What version are you using? What does the definition of the chunk that cannot be moved look like? If the chunk can be split, the server will try to do so before moving it; did that produce an error?

If you upload the server logs and mongos logs for that time it would be helpful.

Generated at Thu Feb 08 03:21:52 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.