[SERVER-63363] Do not assume that the ChunkVersion always has a valid timestamp Created: 07/Feb/22 Updated: 18/Jan/24 Resolved: 31/Mar/22 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Sharding |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Sergi Mateo Bellido | Assignee: | Kaloian Manassiev |
| Resolution: | Won't Fix | Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: | |
| Operating System: | ALL |
| Sprint: | Sharding EMEA 2022-02-21, Sharding EMEA 2022-03-07, Sharding EMEA 2022-03-21, Sharding EMEA 2022-04-04 |
| Participants: | |
| Case: | (copied to CRM) |
| Description |
|
In 5.0 we extended the ChunkVersion with a new field, a timestamp, which was intended to replace another field, the epoch. As part of the setFCV(5.0) command, we patched up the information on the CSRS and asked all shards to refresh, which patched up their routing and filtering information. However, we did not consider mongos: a mongos under FCV 5.0 could still send commands with shardVersions that had no timestamp. That was not a problem at the time, because the ChunkVersion class did not treat the timestamp as mandatory; when it was absent, the class relied on the epoch field.

The problem was introduced in 5.1, when we changed the implementation of ChunkVersion to assume that the timestamp is always present. The problematic scenario arises while the binaries are being replaced: the shards are upgraded first and the mongoses afterwards. During that window, a stale mongos can send a command whose version has no timestamp to a shard that expects one. The goal of this task is to make the timestamp optional again.

A symptom of this bug is an error with the following message: "Invalid type missing for version timestamp part". Flushing the routers or restarting the binary should fix this problem. |
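The "optional timestamp" behaviour described above can be illustrated with a minimal Python sketch. It is not the server's C++ ChunkVersion class: the names `ChunkVersionSketch`, `parse_version`, and `same_collection_generation`, and the dictionary layout of the version document, are all hypothetical and chosen only for illustration. The sketch assumes the 5.0-style rule from the description: use the timestamp when both sides have one, otherwise rely on the epoch.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class ChunkVersionSketch:
    """Hypothetical, simplified stand-in for a chunk version (not MongoDB's class)."""
    major: int
    minor: int
    epoch: str                                   # identifies the collection generation
    timestamp: Optional[Tuple[int, int]] = None  # (seconds, increment); may be absent

    def same_collection_generation(self, other: "ChunkVersionSketch") -> bool:
        # Compare by timestamp only when both versions carry one;
        # otherwise fall back to comparing epochs.
        if self.timestamp is not None and other.timestamp is not None:
            return self.timestamp == other.timestamp
        return self.epoch == other.epoch


def parse_version(doc: dict) -> ChunkVersionSketch:
    """Parse a version document, tolerating a missing timestamp field."""
    ts = doc.get("timestamp")
    return ChunkVersionSketch(
        major=doc["major"],
        minor=doc["minor"],
        epoch=doc["epoch"],
        timestamp=tuple(ts) if ts is not None else None,
    )


# A stale router sends a version without a timestamp; parsing still succeeds.
stale = parse_version({"major": 1, "minor": 0, "epoch": "e1"})
fresh = parse_version({"major": 1, "minor": 0, "epoch": "e1", "timestamp": [10, 1]})
```

In this sketch a parser that required the `timestamp` key would reject `stale` outright, which is the shape of failure the ticket describes; making the field optional lets the comparison degrade to the epoch instead.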
| Comments |
| Comment by Githook User [ 31/Mar/22 ] |
|
Author: {'name': 'Sergi Mateo Bellido', 'email': 'sergi.mateo-bellido@mongodb.com', 'username': 'smateo'}Message: Revert " This reverts commit e98dc4a89c55c0c391b1a6d1ef6a4be92328cfe9. Looking for a different way of fixing this problem! |
| Comment by Githook User [ 31/Mar/22 ] |
|
Author: {'name': 'Sergi Mateo Bellido', 'email': 'sergi.mateo-bellido@mongodb.com', 'username': 'smateo'}Message: Revert " This reverts commit e8942de372206059a8acf80a1a17b2d4b02551a2. (looking for another way to fix this problem) |
| Comment by Githook User [ 29/Mar/22 ] |
|
Author: {'name': 'Sergi Mateo Bellido', 'email': 'sergi.mateo-bellido@mongodb.com', 'username': 'smateo'}Message: Backport to 5.2 |
| Comment by Githook User [ 23/Mar/22 ] |
|
Author: {'name': 'Sergi Mateo Bellido', 'email': 'sergi.mateo-bellido@mongodb.com', 'username': 'smateo'}Message: |