[SERVER-36059] Can't move jumbo chunks even after changing chunk size Created: 10/Jul/18  Updated: 04/Sep/18  Resolved: 27/Jul/18

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: None
Fix Version/s: None

Type: Question Priority: Major - P3
Reporter: Eric Herbrandson Assignee: Kaloian Manassiev
Resolution: Duplicate Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
duplicates SERVER-13024 Clear the 'jumbo' flag from the chunk... Blocked
Participants:

 Description   

I'm in the process of migrating from a single replica set to a sharded topology. During the initial process, I selected a chunk size that was too small, which resulted in ~50% of the chunks being flagged as "jumbo". To resolve this issue, I reconfigured the chunk size and followed the instructions here to clear the jumbo flag. However, after multiple days no further migrations are happening, although none of the chunks have been re-flagged as "jumbo" either.

I've tried manually stopping and restarting the balancer, but that didn't seem to have any effect. I tried manually editing data in some of the jumbo chunks but that didn't have any effect either.

Running sh.status() shows the following...

balancer:
    Currently enabled:  yes
    Currently running:  no
    Failed balancer rounds in last 5 attempts:  0
    Migration Results for the last 24 hours: 
            No recent migrations

This is especially problematic because the unmoved chunks are all on a shard that I'm attempting to remove from the cluster. As such, leaving them where they are isn't an option.

It looks like there may be a mechanism to manually move chunks, but this isn't really a valid option for us either as there are ~45k chunks that need to be moved.

How do I get these chunks moved?



 Comments   
Comment by Eric Herbrandson [ 27/Jul/18 ]

@Kaloian Manassiev I should also say that I'm not sure this is the same as SERVER-13024 because in my case I manually cleared out the jumbo flags per the instructions at https://docs.mongodb.com/manual/tutorial/clear-jumbo-flag/ and that didn't resolve the issue. Instead, after a few days all of the chunks were re-flagged as jumbo.

Comment by Eric Herbrandson [ 27/Jul/18 ]

@Kaloian Manassiev Thanks for your response. However, please see my last comment. Even when I set the chunksize before starting the sharding process, I still get jumbo chunks that are smaller than chunksize. Will mongo not move any chunks over 256 MB?

Comment by Kaloian Manassiev [ 27/Jul/18 ]

Hi herbrandson,

I am really sorry for the delay in getting back to you. I was just able to reproduce the problem that you described - thank you for your patience!

After some digging I realized that this is something we are already aware of and is tracked under SERVER-13024. Essentially, once a chunk is marked as jumbo, the only way to unset that flag is to actually split the chunk (which would rewrite the chunk document on the config server).

I am going to close your ticket as duplicate of SERVER-13024.

In the meantime, if you cannot split the jumbo chunks, a workaround for your problem would be to set the chunksize setting before sharding the collection. That way the chunks will be balanced right after sharding it and won't get marked as jumbo.

Best regards,
-Kal.

Comment by Eric Herbrandson [ 25/Jul/18 ]

More info. I decided to just retry the process from scratch, making sure that the chunksize was 1024 from the start. Unfortunately, this had the same result. Any chunk over 256 MB was flagged as jumbo and not moved. Will mongo not move any chunks over 256 MB?

Comment by Eric Herbrandson [ 17/Jul/18 ]

ping

Comment by Eric Herbrandson [ 12/Jul/18 ]

Thank you so much for your reply!

We are running v3.6.3

I went back to grab the logs and now I see this

2018-07-12T04:44:26.573+0000 W SHARDING [Balancer] Unable to find any chunk to move from draining shard rs0. numJumboChunks: 45283
2018-07-12T04:44:26.577+0000 W SHARDING [Balancer] Shard: rs0, collection: liveearth.entityEvents has only jumbo chunks for zone '' and cannot be balanced. Jumbo chunks count: 45283

I then checked the `chunks` collection and sure enough, they're all marked as jumbo again!

Here is an example

{ 
    "_id" : "liveearth.entityEvents-feedId_UUID(\"02e1971a-8fdd-4f01-aa9a-bc8ab9bbe270\")timekey_new Date(1522875600000)", 
    "ns" : "liveearth.entityEvents", 
    "min" : {
        "feedId" : BinData(4, "AuGXGo/dTwGqmryKubvicA=="), 
        "timekey" : ISODate("2018-04-04T21:00:00.000+0000")
    }, 
    "max" : {
        "feedId" : BinData(4, "AuGXGo/dTwGqmryKubvicA=="), 
        "timekey" : ISODate("2018-04-04T22:00:00.000+0000")
    }, 
    "shard" : "rs0", 
    "lastmod" : Timestamp(1, 29), 
    "lastmodEpoch" : ObjectId("5b2cbeab879751415114dbac"), 
    "jumbo" : true
}

The `timekey` values are always on hour boundaries, so this effectively means there is only one key in this chunk.
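To illustrate why this leaves nothing to split on, here is a minimal sketch in plain JavaScript (runnable in Node.js, no MongoDB connection needed). It assumes chunk ranges are min-inclusive and max-exclusive, that `feedId` is identical in `min` and `max` as in the document above, and that `timekey` only ever takes hour-aligned values. `hourKeysInRange` is a hypothetical helper written for this example, not a MongoDB function.

```javascript
const HOUR_MS = 60 * 60 * 1000;

// Hypothetical helper: list the hour-aligned timekey values that fall in
// the half-open range [minIso, maxIso) for a fixed feedId.
function hourKeysInRange(minIso, maxIso) {
  const keys = [];
  const end = Date.parse(maxIso);
  // Round the lower bound up to the next hour boundary (no-op if aligned).
  const start = Math.ceil(Date.parse(minIso) / HOUR_MS) * HOUR_MS;
  for (let t = start; t < end; t += HOUR_MS) {
    keys.push(new Date(t).toISOString());
  }
  return keys;
}

// Bounds copied from the example chunk document.
const chunkKeys = hourKeysInRange(
  "2018-04-04T21:00:00.000Z",
  "2018-04-04T22:00:00.000Z"
);
console.log(chunkKeys); // a single value: the chunk's own min bound
```

With only one distinct shard key value in the range, there is no key strictly between `min` and `max` that could serve as a split point.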

To verify the size of that chunk I ran the following query

// Sum the BSON size of every document that falls in the chunk's key range.
var cursor = db.getCollection("entityEvents").find({
    "feedId" : BinData(4, "AuGXGo/dTwGqmryKubvicA=="),
    "timekey" : ISODate("2018-04-04T21:00:00.000+0000")
});

var size = 0;
cursor.forEach(function(doc) {
    size += Object.bsonsize(doc);
});

// Total size in bytes
print(size);

This output a size of 458978276 bytes (437.7 MB) for this chunk.

From the `settings` collection, I verified that the configured `chunksize` is larger than the above chunk

{ 
    "_id" : "chunksize", 
    "value" : 1024.0
}
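The comparison can be checked with a few lines of plain JavaScript (runnable in Node.js). This is only a sketch of the arithmetic: it assumes the `chunksize` value is expressed in megabytes of 1024 * 1024 bytes, and `bytesToMB` and `exceedsChunkSize` are hypothetical helper names. The 256 figure is the threshold observed elsewhere in this ticket, not a documented limit.

```javascript
const BYTES_PER_MB = 1024 * 1024;

// Values taken from the comment above.
const measuredChunkBytes = 458978276; // summed Object.bsonsize() output
const configuredChunksizeMB = 1024;   // "value" from the settings collection

// Hypothetical helper: convert a byte count to megabytes.
function bytesToMB(bytes) {
  return bytes / BYTES_PER_MB;
}

// Hypothetical helper: true if the byte count is larger than limitMB.
function exceedsChunkSize(bytes, limitMB) {
  return bytesToMB(bytes) > limitMB;
}

console.log(bytesToMB(measuredChunkBytes).toFixed(1));                    // 437.7
console.log(exceedsChunkSize(measuredChunkBytes, configuredChunksizeMB)); // false
console.log(exceedsChunkSize(measuredChunkBytes, 256));                   // true
```

The chunk is well under the configured 1024 MB, but above the 256 MB mark at which chunks were observed being flagged.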

 

Am I doing something wrong?

Comment by Kaloian Manassiev [ 12/Jul/18 ]

Hi herbrandson,

Which version of MongoDB are you running? From the output of sh.status(), it doesn't look like the balancer is failing due to the presence of jumbo chunks and, like you said, the chunks have not been re-tagged as jumbo, so there must be some other problem.

Would it be possible to attach the most-recent logs from the config server primary which is where the balancer is running (assuming you are using version 3.4 or later)?

Best regards,
-Kal.
