[SERVER-14121] Store size for chunk in config.chunks. Created: 31/May/14 Updated: 13/Jul/21 Resolved: 22/Aug/14 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Sharding |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | David Murphy | Assignee: | Ramon Fernandez Marina |
| Resolution: | Won't Fix | Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: |
|
| Participants: | |
| Case: | (copied to CRM) |
| Description |
|
Mongo currently will sometimes fail to split chunks, and we have seen 200M, 2000M, and 20000M chunks at times. To avoid this, any time a migration or split occurs mongo should record the chunk's size in its config.chunks document. This would make oversized chunks easier to correct, since the jumbo flag is not ALWAYS set correctly. The result is that we could have tools to correct chunks until chunk splitting is moved to the mongod nodes, and possibly even then if similar detection issues still occur. |
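For illustration only, a minimal sketch of what a config.chunks entry could look like with the proposed field. The existing fields mirror the 2.x chunk metadata; the "size" field and its value are assumptions made for this ticket, not an existing schema:

```javascript
// Hypothetical config.chunks document with the proposed per-chunk size.
// "size" is an assumed new field, refreshed on every split or migration;
// the remaining fields follow the existing 2.x chunk metadata.
{
    "_id"     : "test.users-uid_MinKey",
    "ns"      : "test.users",
    "min"     : { "uid" : { "$minKey" : 1 } },
    "max"     : { "uid" : 1000 },
    "shard"   : "shard0000",
    "lastmod" : Timestamp(2, 0),
    "jumbo"   : true,                       // not always set correctly, per the description
    "size"    : NumberLong("21474836480")   // assumed: estimated chunk size in bytes (~20 GB)
}
```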
| Comments |
| Comment by David Murphy [ 22/Aug/14 ] |
|
Ramon, I think you might want to look at our CS account, where you will find a very large number of issues around this topic of large chunks. These are actually very much expected given the way splits work, which is why that is a major focus in the 2.7/2.8 build tree. Secondly, adding fields to the metadata is not actually something that will break it (this has already been tested via a patch). Something is needed for 2.2/2.4, as people still use these versions and it's a real issue that limits and breaks mongo's ability to scale. David |
| Comment by Ramon Fernandez Marina [ 22/Aug/14 ] |
|
dmurphy, the changes you suggest in this ticket would require us to modify the format of the metadata stored in the config servers. This is not a trivial change: it has a large impact on the sharding system and cannot be introduced in a minor release version. That being said, the large chunks you report are not expected, and it would be good to understand why they occur. If you could open a SERVER ticket and provide the logs at logLevel 1 of a mongod where at least one of these large chunks appears, we could investigate why they are not being split. Regards, |
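For reference, one way to raise a mongod's log verbosity to logLevel 1 from the mongo shell; the setParameter call shown here is the standard interface, and starting mongod with -v achieves the same thing:

```javascript
// Raise the running mongod's log verbosity so split and migration
// decisions are logged, then revert once the logs have been captured.
db.adminCommand({ setParameter: 1, logLevel: 1 })
// ... reproduce the oversized-chunk behaviour and collect the logs ...
db.adminCommand({ setParameter: 1, logLevel: 0 })
```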
| Comment by David Murphy [ 20/Jun/14 ] |
|
We have seen large chunks cause elections and massive locking. The point here is that when a split or move occurs we could record a size estimate at that moment (maybe from dataSize results). This would prevent the need to scan 6000 chunks to find chunk sizes when shards are imbalanced by size but not by chunk count. We realize this is going to get better, but this is a real issue in the short term and the next version is a good ways out. |
| Comment by Thomas Rueckstiess [ 05/Jun/14 ] |
|
Hi David, We are working on improving splitting behavior and keeping track of chunk sizes more accurately, and you're right that this has to be done on the shards themselves, where the data resides. This is work in progress and will be addressed in an upcoming version. Keeping track of the chunk sizes in the config servers introduces overhead and is probably not the right way forward. In the meantime, perhaps you can use the dataSize command, which can return the size of data contained in a range (chunk). Regards, |
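As a concrete illustration of that workaround, a minimal sketch run from the mongo shell against one chunk. The namespace, shard key pattern, and bounds are placeholders; in practice they would be copied from the chunk's entry in config.chunks:

```javascript
// Estimate the amount of data in a single chunk's key range.
// min/max are the chunk bounds from config.chunks; keyPattern is the shard key.
db.runCommand({
    dataSize: "test.users",      // namespace (placeholder)
    keyPattern: { uid: 1 },      // shard key (placeholder)
    min: { uid: MinKey },        // chunk lower bound
    max: { uid: 1000 },          // chunk upper bound
    estimate: true               // use average object size instead of scanning every document
})
// Returns a document such as { "size" : <bytes>, "numObjects" : <n>, "millis" : <ms>, "ok" : 1 }
```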