[DOCS-15012] SERVER-62065: Upgrade can leave chunk entries without history on the shards Created: 05/Jan/22 Updated: 22/Jan/24 |
|
| Status: | Backlog |
| Project: | Documentation |
| Component/s: | manual, Server |
| Affects Version/s: | None |
| Fix Version/s: | 5.0.6, 5.3.0, 4.2.19, 5.2.1, 4.4.13, 4.0.29 |
| Type: | Task | Priority: | Major - P3 |
| Reporter: | Backlog - Core Eng Program Management Team | Assignee: | Unassigned |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | backlog, feature, sharding |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: | |
| Participants: | |
| Days since reply: | 2 years, 5 weeks ago |
| Epic Link: | DOCSP-19447 |
| Description |
|
Downstream Change Summary

This change has introduced a new command called `repairShardedCollectionChunksHistory` to counteract the effects of the bug described in this ticket. More about the operation of the command is available in its help option: https://github.com/mongodb/mongo/blob/3b56acfe78e91b607eafc737ebf88d237db1460a/src/mongo/s/commands/cluster_repair_sharded_collection_chunks_history_cmd.cpp#L65

The command is under the `splitChunk` privilege, so there shouldn't be any need for changes to Atlas.

Description of Linked Ticket

In 4.0 we introduced support for a multi-version routing table (PM-1013). This later (in 4.2) became the basis for distributed transactions and snapshot reads. As part of this project, we introduced a `history` field to the persisted chunk type, but we never actually used it in 4.0.

Since we never used it, the backup/restore procedures ever since 4.0 have stated that this field is safe to delete on restore. However, it is actually not safe to do so, because it breaks snapshot reads (routing and filtering at a point in time). Furthermore, due to the optimisation done under

As a result of the above, we can have 4.0, 4.2, 4.4, 5.0, 5.1, and 5.2 clusters which are missing the `history` field for some chunks. This in turn breaks snapshot reads and distributed transactions, which will fail with the error `Chunk has no history entries`. These clusters will function without any issue until customers start using distributed transactions and snapshot reads.

This ticket is to provide a manual procedure for restoring the `history` fields and to implement a command which will restore the `history` fields automatically. |
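As a rough illustration of the condition described above, the following Python sketch flags chunk documents whose `history` field is missing or empty and builds a command document for `repairShardedCollectionChunksHistory`. Everything except the command name and the `history` field is an assumption made for illustration: the helper names, the sample documents, the simplified integer `validAfter` values, and the form of the command document (collection namespace as the value). The command's help text linked above remains the authority on its actual arguments.

```python
from typing import Any, Dict, Iterable, List


def chunks_missing_history(chunks: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Return chunk documents whose `history` field is absent or empty.

    Hypothetical helper: `chunks` stands in for documents read from the
    cluster's chunk metadata.
    """
    return [chunk for chunk in chunks if not chunk.get("history")]


def build_repair_command(namespace: str) -> Dict[str, str]:
    """Build a command document for the repair command.

    Assumption: the command takes the "<database>.<collection>" namespace
    as its value; check the command's help text before relying on this.
    """
    return {"repairShardedCollectionChunksHistory": namespace}


# Hypothetical sample data; real chunk documents carry more fields, and
# `validAfter` is simplified to an integer here.
sample_chunks = [
    {"_id": "c1", "shard": "shard0", "history": [{"validAfter": 1, "shard": "shard0"}]},
    {"_id": "c2", "shard": "shard1"},                 # history field deleted
    {"_id": "c3", "shard": "shard1", "history": []},  # history present but empty
]

affected = chunks_missing_history(sample_chunks)
print([chunk["_id"] for chunk in affected])
print(build_repair_command("test.foo"))
```

With a driver such as PyMongo, the resulting document could presumably be passed to `client.admin.command(...)` on a connection to a mongos by a user holding the `splitChunk` privilege; that invocation path is likewise an assumption and is not confirmed by this ticket.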
| Comments |
| Comment by Ian Fogelman [ 05/Jan/22 ] |
|
This new command will need to be added to master and backported to all branches. |
| Comment by PM Bot [ 05/Jan/22 ] |
|
Downstream changes updated for upstream More about the operation of the command is available in its help option: https://github.com/mongodb/mongo/blob/3b56acfe78e91b607eafc737ebf88d237db1460a/src/mongo/s/commands/cluster_repair_sharded_collection_chunks_history_cmd.cpp#L65 The command is under the `splitChunk` privilege so there shouldn't be any need for changes to Atlas. |