[SERVER-28632] couldn't move chunk when doing shardCollection with hashed sharding key Created: 05/Apr/17 Updated: 27/Oct/23 Resolved: 17/Apr/17 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Sharding |
| Affects Version/s: | 3.0.9 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor - P4 |
| Reporter: | Martin Wu | Assignee: | Mark Agarunov |
| Resolution: | Works as Designed | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Operating System: | ALL |
| Participants: |
| Description |
|
When I run sh.shardCollection on the same collection from many processes at the same time, the error occurs almost every time. How to reproduce:
Then just execute test_shardColl.sh:
When you call setLogLevel(5, "sharding"), you can see logs like this:
I suppose that something is wrong with ShardCollectionCmd::run() in mongo/s/commands_admin.cpp:
|
| Comments |
| Comment by Martin Wu [ 24/Apr/17 ] | |
|
Hello mark.agarunov, OK, thank you for your response. That's clear. Martin | |
| Comment by Mark Agarunov [ 14/Apr/17 ] | |
|
Hello coolxwu, The driver is returning OK because the collection is successfully sharded. The error message is related to the balancing of chunks across the shards, but the collection itself is properly sharded, so the shardCollection command is successful even if the balancing/chunk migration is not. Additionally, note that chunk migrations do not take a global lock. Thanks, | |
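The distinction above (shardCollection succeeding while chunk migration is still pending or has failed) suggests that a client wanting to know whether chunks have actually been spread out would need to inspect chunk ownership itself. A minimal sketch of such a check in Python follows. It assumes chunk metadata documents shaped like those in config.chunks on 3.0-era clusters, with `ns` and `shard` fields; the helper names `chunks_per_shard` and `is_spread`, the namespace `test.coll`, and the sample documents are hypothetical and not taken from this ticket.

```python
from collections import Counter


def chunks_per_shard(chunk_docs, ns):
    """Count how many chunks of namespace `ns` each shard owns.

    `chunk_docs` is any iterable of dicts with "ns" and "shard" keys
    (assumed shape of config.chunks documents; verify on your version).
    """
    return Counter(doc["shard"] for doc in chunk_docs if doc.get("ns") == ns)


def is_spread(counts, expected_shards):
    """Return True when every expected shard owns at least one chunk."""
    return all(counts.get(shard, 0) > 0 for shard in expected_shards)


# Hypothetical sample data standing in for a query result such as
# client.config.chunks.find({"ns": "test.coll"}) issued through pymongo.
sample = [
    {"ns": "test.coll", "shard": "shard0000"},
    {"ns": "test.coll", "shard": "shard0000"},
    {"ns": "test.coll", "shard": "shard0000"},
    {"ns": "test.coll", "shard": "shard0001"},
    {"ns": "other.coll", "shard": "shard0002"},
]

counts = chunks_per_shard(sample, "test.coll")
print(dict(counts))
print(is_spread(counts, ["shard0000", "shard0001"]))
print(is_spread(counts, ["shard0000", "shard0001", "shard0002"]))
```

A check like this only reports where chunks currently live. It does not tell whether a migration is in progress, so a caller would likely need to poll it rather than treat a single result as final.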
| Comment by Martin Wu [ 14/Apr/17 ] | |
|
Hello mark.agarunov, You got that right. My point is that shardCollection SHOULD NOT be reported as successful before the previous shardCollection command has completed with a global lock. When the Python driver returns "OK" after "shardCollection", I assume that everything is OK. In fact, it is not. Thanks. | |
| Comment by Mark Agarunov [ 06/Apr/17 ] | |
|
Hello coolxwu, I may be misunderstanding the behavior, my apologies. From what I can see looking at your script and output, it essentially causes multiple shardCollection commands to be executed on the same collection in parallel. As the initial chunk migration is not instantaneous, it appears that the error you're seeing is due to a shardCollection command being called on a collection before the previous shardCollection command has completed. If I am missing something in my understanding, please let me know. Thanks, | |
| Comment by Martin Wu [ 06/Apr/17 ] | |
|
Hi mark.agarunov, Thanks. | |
| Comment by Mark Agarunov [ 05/Apr/17 ] | |
|
Hello coolxwu, Thank you for the report. Looking at the output you've provided, it appears that you are seeing this error because the chunk is still being moved due to the previously issued command. According to the logs:
Thanks, |