[SERVER-58730] Implement a barrier for disallowing direct writes to shards Created: 21/Jul/21 Updated: 25/Apr/23 Resolved: 25/Apr/23 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Sharding |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Task | Priority: | Major - P3 |
| Reporter: | Kaloian Manassiev | Assignee: | [DO NOT USE] Backlog - Sharding EMEA |
| Resolution: | Duplicate | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: | |
| Assigned Teams: | Sharding EMEA |
| Sprint: | Sharding EMEA 2023-05-01 |
| Participants: | |
| Description |
|
Due to the current architecture of sharding, all write operations in a sharded cluster rely on the shard versioning protocol to ensure that each write goes to the shard that is supposed to receive it. This has the side benefit that we don't incur a performance penalty for checking whether the shard owns the write. Direct writes to a shard do not observe the shard versioning protocol and can therefore corrupt data, so we should disallow them. However, in order to allow customers to transition from a replica set to a sharded cluster without incurring downtime for their applications, there must be a window of time when direct writes to the shard are permitted. This ticket is to implement some form of barrier after which we disallow direct writes to any shard in a sharded cluster. The barrier doesn't need to be very granular, and it should be acceptable for such a barrier to be "the first time someone calls shardCollection or addShard". |
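A minimal sketch of the barrier idea described above, written in Python purely for illustration. It is not the server's implementation: `ShardNode`, `on_sharding_command`, `check_write`, and `DirectWriteDisallowed` are hypothetical names, and the shard version is reduced to an optional integer. It assumes a one-way flag that flips the first time shardCollection or addShard is seen, after which writes that carry no shard version are rejected.

```python
from dataclasses import dataclass
from typing import Optional


class DirectWriteDisallowed(Exception):
    """Raised when an unversioned (direct) write arrives after the barrier."""


@dataclass
class ShardNode:
    # Hypothetical one-way flag: once set, it is never cleared.
    barrier_crossed: bool = False

    def on_sharding_command(self, command_name: str) -> None:
        # The coarse barrier proposed in the ticket: the first
        # shardCollection or addShard call ends the grace window.
        if command_name in ("shardCollection", "addShard"):
            self.barrier_crossed = True

    def check_write(self, shard_version: Optional[int]) -> None:
        # A write that carries a shard version came through the routing
        # layer; the ownership check itself is out of scope for this sketch.
        if shard_version is not None:
            return
        # An unversioned write is a direct write. It is only tolerated
        # while the barrier has not been crossed.
        if self.barrier_crossed:
            raise DirectWriteDisallowed("direct writes to this shard are disallowed")


node = ShardNode()
node.check_write(None)                      # allowed: still in the transition window
node.on_sharding_command("shardCollection")
node.check_write(shard_version=7)           # allowed: versioned write
```

After the barrier is crossed, `node.check_write(None)` raises `DirectWriteDisallowed` in this sketch, while versioned writes continue to pass.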
| Comments |
| Comment by Bruce Lucas (Inactive) [ 21/Jul/21 ] |
|
Seems plausible. |
| Comment by Kaloian Manassiev [ 21/Jul/21 ] |
|
For emergency maintenance, backup/restore, etc., the restriction would be lifted by the fact that such nodes will not have --shardsvr when started. For testing, that's a good point, because we have tests which write directly, so probably --testComandsEnabled should be taken into account? |
| Comment by Bruce Lucas (Inactive) [ 21/Jul/21 ] |
|
I imagine there would need to be some way to override the restriction for testing, debugging, and emergency maintenance. |