[SERVER-70162] moveChunk and shardCollection commands are not properly synchronized Created: 03/Oct/22 Updated: 05/Dec/22 Resolved: 04/Oct/22 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Antonio Fuschetto | Assignee: | [DO NOT USE] Backlog - Sharding EMEA |
| Resolution: | Won't Do | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||
| Assigned Teams: | Sharding EMEA |
||||
| Operating System: | ALL | ||||
| Participants: | |||||
| Linked BF Score: | 127 | ||||
| Description |
|
The current implementation of the shardCollection command enters the critical section, creates the collection's configuration entry on the config server, and then leaves the critical section. Between the creation of the collection entry and the release of the critical section, the balancer could already take the newly created collection into account and issue a moveChunk command, which would not be able to enter the critical section. A possible solution is to create the collection entry with the allowMigrations flag set, then unset it with the _configsvrSetAllowMigrations command once the critical section has been left. Note that this flag must be unset in a phase of shardCollection that is resilient to stepdown/shutdown. |
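The ordering proposed above can be sketched with a toy Python model. This is not MongoDB code: the dictionary standing in for the config server's collection entries, the helper names, and the reading of "flag set" as "the field is present with value false, so migrations are disabled" are all assumptions made for illustration.

```python
# Toy model only: names and data layout are hypothetical, not MongoDB internals.
config_collections = {}  # stands in for the collection entries on the config server


def migrations_allowed(ns):
    """Balancer-side check. Assumption: a missing flag means migrations are allowed."""
    entry = config_collections.get(ns)
    return entry is not None and entry.get("allowMigrations", True)


def shard_collection(ns):
    """Return what the balancer would observe at each step of the proposed ordering."""
    observed = []
    # Inside the critical section: create the entry with migrations disabled.
    config_collections[ns] = {"_id": ns, "allowMigrations": False}
    observed.append(("entry created", migrations_allowed(ns)))
    # The critical section is released here; the flag still keeps the balancer away.
    observed.append(("critical section released", migrations_allowed(ns)))
    # Later, in a phase that can be retried after a stepdown: clear the flag.
    config_collections[ns].pop("allowMigrations", None)
    observed.append(("flag unset", migrations_allowed(ns)))
    return observed
```

In this model the balancer only sees the collection as eligible after the last step, which is the property the proposal is after. Removing the key with `pop(..., None)` is idempotent, so repeating that step after a restart is harmless in the sketch.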
| Comments |
| Comment by Antonio Fuschetto [ 04/Oct/22 ] |
|
The moveChunk command is serialized with the critical section, so there is no race with shardCollection. |
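The serialization argument can be illustrated with a minimal Python sketch, assuming the critical section behaves like a mutual-exclusion lock that moveChunk waits on. The lock, the function names, and the event strings are hypothetical stand-ins, not the server's implementation.

```python
import threading


def run_demo():
    """Run a toy shardCollection and moveChunk concurrently and return the event order."""
    events = []
    critical_section = threading.Lock()  # hypothetical stand-in for the critical section
    entered = threading.Event()

    def shard_collection():
        with critical_section:
            events.append("shardCollection: enter critical section")
            entered.set()  # moveChunk may start contending from here on
            events.append("shardCollection: create config entry")
            events.append("shardCollection: leave critical section")

    def move_chunk():
        entered.wait()  # start only once shardCollection holds the critical section
        with critical_section:  # blocks until shardCollection has released it
            events.append("moveChunk: run")

    threads = [threading.Thread(target=shard_collection),
               threading.Thread(target=move_chunk)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return events
```

Because the toy moveChunk has to acquire the same lock, its event is always recorded after shardCollection has left the critical section, regardless of thread scheduling.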