[SERVER-9498] Possible bug in SplitVector: MongoDB keeps on splitting the same chunk over and over again for hours Created: 29/Apr/13  Updated: 10/Dec/14  Resolved: 29/May/13

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: 2.2.4
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: David Weibull Assignee: Unassigned
Resolution: Duplicate Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: File balancer.prob.log    
Issue Links:
Duplicate
duplicates SERVER-9365 mongod always split at 250000 position Closed
Operating System: ALL
Participants:

 Description   

This happens when the balancer is trying to move a chunk that is too big to move. It issues a "forced split" to reduce the size of the chunk. The split succeeds, but it splits the chunk at the 250000th document in the chunk. By the time MongoDB tries to move the same chunk again, the chunk is too big to move once more, so it issues another "forced split", which again splits at the 250000th document. This pattern sometimes goes on for several hours and results in several hundred, or sometimes thousands, of "small" chunks containing just a handful of documents.
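
A minimal, self-contained sketch of the pattern (not MongoDB source; the chunk size and insert rate are made-up assumptions, and only the 250000 split offset matches what we observe):

#include <cstdio>
#include <vector>

int main() {
    const long maxChunkObjects = 250000;    // forced-split offset, per this report
    long chunkDocs = maxChunkObjects + 700; // assumed: chunk just over the move limit
    const long insertsPerAttempt = 700;     // assumed constant insert rate
    std::vector<long> smallChunks;          // remainders left behind by forced splits

    for (int attempt = 0; attempt < 8; ++attempt) {
        if (chunkDocs <= maxChunkObjects)
            break; // small enough: the move would finally succeed
        // Forced split at the 250000th document: the remainder becomes a
        // separate tiny chunk and the original chunk keeps exactly 250000 docs.
        smallChunks.push_back(chunkDocs - maxChunkObjects);
        chunkDocs = maxChunkObjects;
        // New inserts land before the next move attempt, pushing the chunk
        // back over the limit, so the cycle repeats for as long as we insert.
        chunkDocs += insertsPerAttempt;
    }
    std::printf("tiny chunks created: %zu\n", smallChunks.size());
    return 0;
}

With a steady insert rate the loop never terminates on its own, which matches the hundreds or thousands of small chunks we see.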



 Comments   
Comment by David Weibull [ 02/May/13 ]

I should add that the problem only occurs when we have a constantly high insertion rate. Stop inserting documents and the problem resolves itself.

Comment by David Weibull [ 02/May/13 ]

When splitVector (s/d_split.cpp) is run with force set to true, keyCount is set to 250000 (maxChunkObjects). This results in a split point at the 250000th document whenever the chunk contains more than 250000 documents. In my case the chunk appears to contain more than 250000 docs.

If the chunk is split at the 250000th document and we add even one document to it before we try to move it again, the move will fail once more and trigger another "forced split". keyCount is again set to 250000 and the chunk is again split at the 250000th document. From that point on, a single document inserted between move attempts is enough to make the next move fail and issue yet another "forced split".
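
To make the feedback loop concrete, here is a hedged illustration of that invariant (only the 250000 constant comes from the code; everything else is assumed):

#include <cassert>

int main() {
    const long maxChunkObjects = 250000;
    long docs = maxChunkObjects;        // chunk size right after a forced split
    for (int attempt = 0; attempt < 5; ++attempt) {
        docs += 1;                      // a single insert before the next move
        assert(docs > maxChunkObjects); // move rejected: the chunk is too big again
        long remainder = docs - maxChunkObjects;
        assert(remainder == 1);         // each cycle leaves behind a 1-document chunk
        docs = maxChunkObjects;         // and the main chunk is back at the limit
    }
    return 0;
}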

I think the developers of splitVector assumed that a chunk can never contain more than 250000 documents, because if the chunk contains 250000 docs or fewer, splitVector finds the "middle" of the chunk and splits on that doc. (It resets keyCount to "number of documents in chunk"/2 and iterates another round over the documents in the chunk until it reaches the keyCount-th doc, which it then uses as the split point.) In that case everything works out fine and the chunk is moved the next time we try to move it.
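
For clarity, a simplified sketch of the forced split-point selection as I understand it; the real logic lives in s/d_split.cpp and differs in detail, and the function and variable names here are mine:

#include <cstdio>

const long maxChunkObjects = 250000;

// Split point chosen by a forced split, as described above (1-based doc index).
long forcedSplitPoint(long docsInChunk) {
    const long keyCount = maxChunkObjects; // force == true: keyCount starts at 250000
    if (docsInChunk > keyCount) {
        // The first pass reaches the keyCount-th document before the chunk
        // ends, so the split lands at doc 250000 no matter how big the chunk is.
        return keyCount;
    }
    // 250000 docs or fewer: keyCount is reset to docsInChunk / 2 and a second
    // pass walks to the middle document, which becomes the split point.
    return docsInChunk / 2;
}

int main() {
    std::printf("600000 docs -> split at doc %ld\n", forcedSplitPoint(600000)); // 250000, not the middle
    std::printf("200000 docs -> split at doc %ld\n", forcedSplitPoint(200000)); // 100000, the middle
    return 0;
}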

Comment by Johan Hedin [ 29/Apr/13 ]

This seems to be related as well: https://groups.google.com/d/topic/mongodb-user/nJG8WlCdta0/discussion

Comment by Johan Hedin [ 29/Apr/13 ]

This seems to be the same problem as https://jira.mongodb.org/browse/SERVER-9365
