[SERVER-22290] endless "moveChunk failed, because there are still n deletes from previous migration" Created: 25/Jan/16  Updated: 13/Apr/16  Resolved: 28/Mar/16

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: 2.6.10
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Kay Agahd Assignee: Kelsey Schubert
Resolution: Incomplete Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
related to SERVER-14047 endless "moveChunk failed, because t... Closed
Operating System: ALL
Steps To Reproduce:

Create 1,000 new empty chunks and try to move them among all shards.

Participants:

 Description   

We have several sharded clusters running MongoDB v2.6.10. We regularly pre-split so that all new documents are inserted evenly across all shards. The balancer is always off because we prefer to distribute our documents manually depending on the hardware (RAM), since not all shards have the same amount of RAM.

In order to pre-split, we create new chunks with sh.splitAt. Once hundreds or thousands of new empty chunks have been created, we move them from the origin shard and distribute them evenly among the other shards using sh.moveChunk. This should be a very quick operation because the chunks to move are empty.
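
For illustration, the workflow looks roughly like the following sketch (run against a mongos with the balancer off); the namespace "test.coll", the split points and the shard names are placeholders, not values from our clusters:

// Sketch of the pre-split/move workflow (placeholder namespace, split points
// and shard names; run against a mongos with the balancer switched off).
var ns = "test.coll";
var origin = "shard0001";                      // shard that owns the new empty chunks
var targets = ["shard0002", "shard0003"];      // shards to distribute them to

// 1. Pre-split: create empty chunks at the desired boundaries.
for (var p = 1; p <= 1000; p++) {
    sh.splitAt(ns, { _id: p * 1000 });         // split-point layout is an assumption
}

// 2. Move the empty chunks off the origin shard, round-robin across the targets.
db.getSisterDB("config").chunks.find({ ns: ns, shard: origin }).toArray().forEach(function (chunk, idx) {
    sh.moveChunk(ns, chunk.min, targets[idx % targets.length]);
});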

From time to time we encounter the following error. The bigger the cluster, the more often the error seems to happen.

{
        "cause" : {
                "cause" : {
                        "ok" : 0,
                        "errmsg" : "can't accept new chunks because  there are still 8 deletes from previous migration"
                },
                "ok" : 0,
                "errmsg" : "moveChunk failed to engage TO-shard in the data transfer: can't accept new chunks because  there are still 8 deletes from previous migration"
        },
        "ok" : 0,
        "errmsg" : "move failed"
}

The error may also occur after all shards have already received empty chunks, i.e. they have already accepted new chunks. However, a few seconds later they refuse new chunks, claiming that "there are still n deletes from previous migration", even though all previously received chunks were empty! This seems very illogical to us. Can you explain or fix it?

The only workaround we have found so far is to step down the primary of the TO-shard. However, if all three replica-set members of the TO-shard throw the same error, we need to restart a secondary, elect it primary, and then we are able to continue distributing new empty chunks - until the next "can't accept new chunks" error arrives.
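
For reference, the step-down part of this workaround is simply the following, run while connected directly to the primary of the TO-shard (the 60-second value is an arbitrary choice, not something prescribed by our setup):

// On the primary of the TO-shard: force an election so a secondary takes over.
// 60 is the number of seconds this node will refuse to become primary again;
// the shell connection may be reset when the command completes.
rs.stepDown(60);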

Please also see SERVER-14047, which I created for the same problem. However, this time it does not seem to be related to noTimeout cursors, because those were killed by restarting the server(s). Also, the shards had already accepted new chunks; they then stop accepting new chunks out of the blue with an illogical error message.
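
For completeness, this is a quick check we can run on each shard primary to see whether any noTimeout cursors are still open; it is only a sketch relying on the serverStatus cursor metrics available in 2.6:

// Run on each shard primary: count open cursors, in particular noTimeout
// cursors, which SERVER-14047 suspects of blocking the post-migration deletes.
var open = db.serverStatus().metrics.cursor.open;
print("total: " + open.total + ", noTimeout: " + open.noTimeout + ", pinned: " + open.pinned);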



 Comments   
Comment by Kay Agahd [ 13/Apr/16 ]

Hi ramon.fernandez, as I already said, we only need to pre-split every few months, so I could not give you feedback earlier. Today we pre-split again without any problem.
The only difference from the last pre-split is that we have not balanced manually for the last 4 weeks. Before that, we needed to balance manually once per night to account for the different amounts of RAM on the different shards. Unfortunately, MongoDB's built-in balancer is neither smart enough to take this into account itself, nor can it be configured to do so.
Since all our shards are now equipped with the same amount of RAM, we no longer need to balance manually. This may be why pre-splitting and distributing all newly created empty chunks equally across all shards worked without any problem today.
I will report back if the issue occurs again.

Comment by Ramon Fernandez Marina [ 28/Mar/16 ]

kay.agahd@idealo.de, if no further information can be provided for another 2.5 months, we're going to close this ticket for the time being. When you have additional information, please post it here and we'll reopen the ticket for further investigation.

Note that we believe the root cause of this behavior is the same as for SERVER-14047, namely that the shards have open cursors which prevent the deletes from previous migrations from completing. The shard logs should be able to confirm this.

Regards,
Ramón.

Comment by Kay Agahd [ 11/Mar/16 ]

Yes, we will do so as soon as we pre-split again. We only need to pre-split roughly once every 3 months.

Comment by Kelsey Schubert [ 10/Mar/16 ]

Hi kay.agahd@idealo.de,

Sorry for the delay in getting back to you. If this is still an issue for you, can you please upload the logs from the primary of the donor shard and the primary of the target shard from a period when you are experiencing this issue during chunk migration? This information will allow us to verify our explanation of the root cause of this behavior.

Thank you,
Thomas

Comment by Kay Agahd [ 25/Jan/16 ]

Btw., the _waitForDelete option did not help either:

db.getSisterDB("config").getCollection("settings").update({ "_id" : "balancer" },{ $set : { "_waitForDelete" : true } }, { upsert : true });
