[SERVER-50215] _configsvrCommitChunkMigration might return before changes are majority committed Created: 10/Aug/20 Updated: 29/Oct/23 Resolved: 20/Aug/20 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | 4.7.0 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Marcos José Grillo Ramirez | Assignee: | Tommaso Tocci |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | PM-1645-Milestone-1, sharding-csrs-stepdown-only |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: |
| Backwards Compatibility: | Fully Compatible |
| Operating System: | ALL |
| Steps To Reproduce: |
| Sprint: | Sharding 2020-08-10, Sharding 2020-08-24 |
| Participants: |
| Linked BF Score: | 40 |
| Description |
|
SERVER-49147 made _configsvrCommitChunkMigration idempotent so that a retry would not generate errors. However, it also made it possible for the command to return after two (1 2) reads with local read concern, even though the command is issued with majority write concern. As a result, if a successful commit returns an error (for example, if the config server steps down right after successfully writing the changes) and the command is retried, the retry will return before the commit is replicated, which might cause a race between the subsequent refresh and the replication. This might lead to an invalid state in which the shard does not see its own change. That prevents the range deletion task from executing, leaving garbage data on the shard and letting the shard serve data that it no longer owns. |
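The race described above can be illustrated with a small toy model. This is only a sketch, not the server code: `ToyConfigServer`, `commit_migration`, `refresh`, and the `wait_on_noop` flag are hypothetical names, and the model assumes that the shard's refresh only sees majority-committed data. It compares a retry that returns straight after a local read with one that first waits for the earlier write to become majority committed.

```python
from dataclasses import dataclass, field


@dataclass
class ToyConfigServer:
    """Toy model: `local` is the primary's view, `majority` may lag behind it."""
    local: dict = field(default_factory=dict)
    majority: dict = field(default_factory=dict)

    def replicate(self):
        # Simulate the local writes becoming majority committed.
        self.majority = dict(self.local)

    def commit_migration(self, chunk, to_shard, wait_on_noop, fail_after_write=False):
        # Idempotency check performed with a local read (simplified assumption).
        if self.local.get(chunk) == to_shard:
            if wait_on_noop:
                # Stands in for waiting until the earlier write is majority committed.
                self.replicate()
            return "ok (already committed)"
        self.local[chunk] = to_shard
        if fail_after_write:
            # The write is applied, but the caller gets an error (e.g. a stepdown).
            raise ConnectionError("stepdown before reply")
        self.replicate()
        return "ok"

    def refresh(self, chunk):
        # Assumption: the shard's refresh reads majority-committed data only.
        return self.majority.get(chunk)


def run(wait_on_noop):
    cfg = ToyConfigServer(local={"chunkA": "shard0"}, majority={"chunkA": "shard0"})
    try:
        cfg.commit_migration("chunkA", "shard1", wait_on_noop, fail_after_write=True)
    except ConnectionError:
        pass  # the caller retries the commit
    reply = cfg.commit_migration("chunkA", "shard1", wait_on_noop)
    return reply, cfg.refresh("chunkA")
```

In this model, `run(False)` gets a successful reply while the refresh still reports the old owner, which corresponds to the shard not seeing its own change. `run(True)` returns only after the change is visible to the refresh.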
| Comments |
| Comment by Githook User [ 20/Aug/20 ] |
|
Author: {'name': 'Tommaso Tocci', 'email': 'tommaso.tocci@mongodb.com', 'username': 'toto-dev'} Message: |