[SERVER-40061] Chunk move fails due to DuplicateKey error on the `config.chunks` collection at migration commit Created: 10/Mar/19 Updated: 09/Aug/19 Resolved: 09/Aug/19 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Sharding |
| Affects Version/s: | 3.6.10 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Aristarkh Zagorodnikov | Assignee: | Kaloian Manassiev |
| Resolution: | Duplicate | Votes: | 1 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||
| Operating System: | ALL | ||||||||
| Sprint: | Sharding 2019-05-20, Sharding 2019-06-03, Sharding 2019-06-17, Sharding 2019-07-01, Sharding 2019-07-15, Sharding 2019-07-29, Sharding 2019-08-12 | ||||||||
| Participants: | |||||||||
| Description |
|
The ChunkType::genID method uses the BSONElement::toString method, which was changed to provide better formatting for UUID BinData. Unfortunately, ChunkType::genID is used throughout the sharding-related code to produce the value of the "_id" field in the "config.chunks" collection. When the chunk's minimum key contains a UUID value, the "_id" generated by v3.6 differs from the one generated by v3.4 (and earlier versions). We hit this when trying to move chunks manually in a cluster we recently upgraded from v3.4 to v3.6:
Of course, the "config.chunks" collection contains this:
Since I do not know what other operations use the "_id" field, I cannot estimate the full impact of this problem, but a cursory inspection of the codebase shows at least some places where the update is performed without checking the number of matched/modified documents, so the metadata (i.e. the chunk structure) could be silently lost or damaged. |
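To make the mismatch concrete, here is a rough sketch of the failure mode. The namespace, field name, UUID value, and the exact string renderings are invented for illustration and are not taken from the affected cluster:

```js
// Chunk min key for a collection sharded on a UUID field (all values invented).
var min = { fileId: UUID("913edf27-4c59-4361-90f5-fd12dea0010a") };

// A 3.4-era config server stored the chunk with an _id built from the old
// BinData rendering of that value, roughly:
//   "db.files-fileId_BinData(4, <base64 of the same bytes>)"
//
// A 3.6 binary calling ChunkType::genID renders the same value with the newer
// UUID(...) formatting and so produces a different string, roughly:
//   "db.files-fileId_UUID(\"913edf27-4c59-4361-90f5-fd12dea0010a\")"
//
// Metadata writes keyed on the regenerated _id therefore either match nothing
// (and can fail silently if the write result is not checked) or, on insert,
// collide with the existing chunk document on the unique { ns: 1, min: 1 }
// index, which is the DuplicateKey error seen at migration commit.
```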
| Comments |
| Comment by Kaloian Manassiev [ 09/Aug/19 ] | ||||||||||||||||||||||
|
chenjian@tmxmall.com, let's continue the conversation in the new ticket you created. | ||||||||||||||||||||||
| Comment by Kaloian Manassiev [ 09/Aug/19 ] | ||||||||||||||||||||||
onyxmaster, these are the correct workaround steps for the situation that your cluster ended up in. This is a side effect of the linked issue. We recently fixed it in the current master branch (4.3 as of now) and will be exploring whether it is possible to backport it to earlier releases. Apologies again for the inconvenience and for the time it took. | ||||||||||||||||||||||
| Comment by Chen Jian [ 09/Aug/19 ] | ||||||||||||||||||||||
|
Thank you, here is the new ticket: | ||||||||||||||||||||||
| Comment by Chen Jian [ 09/Aug/19 ] | ||||||||||||||||||||||
|
Thank you for your reply. At first our problem looked like SERVER-11421. Now one of the primary shards is down and cannot be started again; could you tell me how to bring the stopped shard back up? | ||||||||||||||||||||||
| Comment by Aristarkh Zagorodnikov [ 09/Aug/19 ] | ||||||||||||||||||||||
|
We "fixed" the issue by stopping the balancer and rewriting the chunks collection manually, fixing the IDs. | ||||||||||||||||||||||
| Comment by Kaloian Manassiev [ 09/Aug/19 ] | ||||||||||||||||||||||
|
My investigation so far has concluded that the issue originally reported here is related to the linked ticket. The case that you reported seems to be happening on the routing metadata cache collections and during replication startup recovery, so it is likely a different issue. Would it be possible to create a separate ticket and include the complete latest mongod log of the node you are trying to start? Best regards, | ||||||||||||||||||||||
| Comment by Chen Jian [ 09/Aug/19 ] | ||||||||||||||||||||||
|
Thank you for your reply. At the beginning our problem looked like this issue: https://jira.mongodb.org/browse/SERVER-11421. We kept getting the error in the logs; it didn't stop and eventually filled up our disk. | ||||||||||||||||||||||
| Comment by Chen Jian [ 09/Aug/19 ] | ||||||||||||||||||||||
|
I also hit this problem in version 4.0.9. I can't start my shard, and I get the following error information:
| ||||||||||||||||||||||
| Comment by Aristarkh Zagorodnikov [ 05/Jul/19 ] | ||||||||||||||||||||||
|
It appears that automatic migration of chunks that have UUIDs as keys doesn't work either. The chunk migrates, but the commit fails with this:
Is there any progress on this issue? Not being able to move chunks either manually or automatically will lead to shards running out of disk space. Is there any workaround available? | ||||||||||||||||||||||
| Comment by Aristarkh Zagorodnikov [ 14/May/19 ] | ||||||||||||||||||||||
|
Kaloian, thank you for returning to this issue. I've uploaded a fresh zipped dump of the cluster's config database to the location you provided. | ||||||||||||||||||||||
| Comment by Kaloian Manassiev [ 14/May/19 ] | ||||||||||||||||||||||
|
Hi onyxmaster, I apologise for the delay in looking into this issue. We started investigating it yesterday and found out that due to the time that has passed, the storage on which I saved the dump of the config server that you provided me has expired and the archive was deleted. I am going to have to ask you to please upload it again. To make it easier, I’ve created a secure upload portal for you to use. Files uploaded to this portal are only visible to MongoDB employees investigating this issue and are routinely deleted after some time. Apologies again and many thanks for helping us investigate this issue! Best regards, | ||||||||||||||||||||||
| Comment by Danny Hatcher (Inactive) [ 16/Apr/19 ] | ||||||||||||||||||||||
|
amardeepsg@gmail.com, we do not currently have a fix prepared but will update this ticket when we do. Regardless, we do not update patch versions like 3.6.7. If we do decide to backport a fix to 3.6, it would be to the latest 3.6.x version.
For advice on shard balancing, I encourage you to ask our community by posting on the mongodb-user group or on Stack Overflow with the mongodb tag. Thanks, Danny | ||||||||||||||||||||||
| Comment by amardeep singh [ 15/Apr/19 ] | ||||||||||||||||||||||
|
Any chance of this getting fixed in 3.6.7? Also, what are the alternative ways for us to ensure shard data balancing if empty chunks continue to exist and we can't merge them? | ||||||||||||||||||||||
| Comment by Aristarkh Zagorodnikov [ 16/Mar/19 ] | ||||||||||||||||||||||
|
Kaloian, thank you! I'm feeling relieved knowing that there is no data loss involved. Thank you for investigating this, looking forward to the fix. I hope this gets backported to 3.6 too, since we're not yet ready to move to 4.x (at least until this summer). | ||||||||||||||||||||||
| Comment by Kaloian Manassiev [ 15/Mar/19 ] | ||||||||||||||||||||||
|
Hi onyxmaster, Thank you very much for uploading your config server's metadata! Thanks to it I was able to reproduce the problem locally and now we have everything we need in order to continue investigating this issue, so feel free to remove the upload. For now I don't have anything to update, but we will post the results of the investigation here. Thanks again for your help! Best regards, PS: I also edited the ticket title to indicate that there is no actual data loss | ||||||||||||||||||||||
| Comment by Aristarkh Zagorodnikov [ 11/Mar/19 ] | ||||||||||||||||||||||
|
There is nothing specifically sensitive in this data, so I uploaded it at -https://drive-public-eu.s3.amazonaws.com/mongodb/server-40061-config.zip- (10MB). Still, I would rather not keep it up longer than needed, so please tell me when you no longer need it to be online. | ||||||||||||||||||||||
| Comment by Kaloian Manassiev [ 11/Mar/19 ] | ||||||||||||||||||||||
|
Thank you very much for the prompt response. From the message, this looks like a duplicate key that was generated in the ns, min index, not the _id. Would it be possible to attach a dump of your config database so I can recreate the cluster (without data) and try the move command that you are issuing? If attaching it to this ticket poses any concern on your side, we can provide a secure upload portal to which only MongoDB employees have access. | ||||||||||||||||||||||
| Comment by Aristarkh Zagorodnikov [ 11/Mar/19 ] | ||||||||||||||||||||||
|
I'd also like to note that this problem reproduces in 100% of cases for our cluster that was upgraded from v3.4 – all chunks that have a UUID field in the "min" key fail at migration commit every time. I believe splits are failing too, because I don't see any splits in the logs after the upgrade (although it might be that I'm looking in the wrong place, since IIRC v3.6 moved a lot of chunk management onto the shards themselves). | ||||||||||||||||||||||
| Comment by Aristarkh Zagorodnikov [ 11/Mar/19 ] | ||||||||||||||||||||||
|
As for my data loss concerns, Balancer::_splitOrMarkJumbo will probably always silently fail to mark chunks as jumbo (no data loss, yet suboptimal), and ShardingCatalogManager::commitChunkSplit will probably never be able to split the chunk (I'm not sure how the failure will be handled, since it's an upsert). | ||||||||||||||||||||||
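A shell-level sketch of that silent-failure mode, mimicking what an unchecked metadata update would do (the _id string and field values are placeholders; this is not the server code itself):

```js
// Update keyed on the _id that a 3.6 binary regenerates for the chunk. If the
// stored document still carries the 3.4-style _id, nothing matches, and the
// "mark as jumbo" intent is lost unless the write result is inspected.
var res = db.getSiblingDB("config").chunks.update(
    { _id: 'db.files-fileId_UUID("913edf27-4c59-4361-90f5-fd12dea0010a")' },
    { $set: { jumbo: true } }
);

if (res.nMatched === 0) {
    print("no chunk matched the regenerated _id - the metadata update was a silent no-op");
}
```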
| Comment by Aristarkh Zagorodnikov [ 11/Mar/19 ] | ||||||||||||||||||||||
|
No, I'm not using the internal command; I'm issuing sh.moveChunk from the shell (or the "moveChunk" command from our C# application). Consider the following log from one of the shard primaries:
| ||||||||||||||||||||||
| Comment by Kaloian Manassiev [ 11/Mar/19 ] | ||||||||||||||||||||||
|
Hi onyxmaster, Thank you very much for your report and concern. First I would like to alleviate your concerns about data loss: the _id field in the config.chunks collection is not used for routing of queries and/or updates, but only for generating chunk modifications internally. For the DuplicateKey error: I notice that the application name for the _configsvrMoveChunk command which encountered it is "MongoDB Shell". Are you issuing this command directly against the config server? This is an internal command which should not be used directly; moveChunk against a MongoS should be used instead. Best regards, | ||||||||||||||||||||||
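For reference, a minimal example of issuing the move through a mongos rather than against the config server (namespace, key value, and target shard are placeholders):

```js
// Connected to a mongos, not to the config server. All values are placeholders.
sh.moveChunk(
    "db.files",
    { fileId: UUID("913edf27-4c59-4361-90f5-fd12dea0010a") },
    "shard0002"
);

// Equivalent explicit command form:
db.adminCommand({
    moveChunk: "db.files",
    find: { fileId: UUID("913edf27-4c59-4361-90f5-fd12dea0010a") },
    to: "shard0002"
});
```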
| Comment by Aristarkh Zagorodnikov [ 11/Mar/19 ] | ||||||||||||||||||||||
|
The "changed" link in the description is wrong, it should be this one. |