[SERVER-30797] Shard primaries must commit a majority write before using updated chunk routing tables Created: 23/Aug/17 Updated: 30/Oct/23 Resolved: 03/Oct/17 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Sharding |
| Affects Version/s: | None |
| Fix Version/s: | 3.6.0-rc0 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Andy Schwerin | Assignee: | Dianna Hohensee (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: | |
| Backwards Compatibility: | Fully Compatible |
| Operating System: | ALL |
| Sprint: | Sharding 2017-10-02, Sharding 2017-10-23 |
| Participants: | |
| Description |
|
After receiving a chunk routing table change and before putting it to use, a shard primary needs to confirm that it was the primary at the time it received the change. Otherwise, it may use a version of the routing table inconsistent with the data that it stores, potentially leading it to return orphans or fail to return results at read concerns "local" and stronger.

Details: Further, if the old primary persists the routing table updates, any secondaries on the same side of the network partition can also exhibit this incorrect behavior. |
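The gating rule described above can be sketched as a toy model. This is only an illustration under stated assumptions, not the server's implementation: `Node`, `ShardPrimary`, `majority_write_commits`, and `receive_routing_update` are hypothetical names, and "reachable" stands in for a member acknowledging the write. The idea is that a node which can no longer commit a majority write must not adopt a routing table change it has received.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Node:
    """A replica set member, reduced to what this sketch needs (hypothetical)."""
    name: str
    reachable: bool = True  # can the would-be primary get an ack from this member?


@dataclass
class ShardPrimary:
    """Hypothetical model of a node that believes it is the shard primary."""
    members: List[Node]       # all voting members, including this node
    routing_version: int = 1  # routing table version currently in use

    def majority_write_commits(self) -> bool:
        # A write is majority-committed only if more than half of the
        # voting members (counting this node) acknowledge it.
        acks = sum(1 for m in self.members if m.reachable)
        return acks > len(self.members) // 2

    def receive_routing_update(self, new_version: int) -> bool:
        # Gate: only adopt the new table if a majority write commits,
        # which a primary cut off from the majority cannot achieve.
        if not self.majority_write_commits():
            return False  # keep using the old table
        self.routing_version = new_version
        return True


# A primary that can reach a majority adopts the update.
healthy = ShardPrimary(members=[Node("a"), Node("b"), Node("c")])
healthy_ok = healthy.receive_routing_update(2)

# A primary isolated on the minority side of a partition does not.
isolated = ShardPrimary(
    members=[Node("a"), Node("b", reachable=False), Node("c", reachable=False)]
)
isolated_ok = isolated.receive_routing_update(2)
```

In this sketch the isolated node keeps `routing_version == 1`, so it never filters its data with a table it could not confirm it was entitled to receive.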
| Comments |
| Comment by Githook User [ 03/Oct/17 ] |
|
Author: Dianna Hohensee <dianna.hohensee@10gen.com> (username: DiannaHohensee) Message: |
| Comment by Andy Schwerin [ 25/Sep/17 ] |
|
Only the new primary will be able to complete majority writes, and only nodes in the same partition as the new primary will be able to provide causally consistent majority reads following those writes. However, this fix is about ensuring that local and majority read concern only return documents as they at some point existed (majority) or were proposed to exist (local). Without it, those read concerns may erroneously return orphans, which can contain changes never proposed by a client application, albeit only in scenarios like the one described. |
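To make the orphan problem concrete, here is a minimal sketch of ownership filtering, assuming a shard filters the documents it stores against the chunk ranges its routing table says it owns. The function `owned_documents`, the half-open integer key ranges, and the sample data are all hypothetical and chosen only for illustration.

```python
def owned_documents(docs, owned_ranges):
    """Return the documents whose shard key falls in a range this shard owns.

    docs: list of (shard_key, payload) pairs stored on the shard.
    owned_ranges: list of half-open (low, high) key ranges from the routing table.
    """
    return [
        (key, payload)
        for key, payload in docs
        if any(low <= key < high for low, high in owned_ranges)
    ]


# Documents physically present on the shard. In this made-up example the
# document with key 15 is an orphan: it sits on the shard but belongs to a
# chunk the shard does not own.
stored = [(1, "a"), (5, "b"), (15, "leftover")]

consistent_table = [(0, 10)]    # matches the data the shard actually owns
inconsistent_table = [(0, 20)]  # a table that does not match the stored data

filtered_ok = owned_documents(stored, consistent_table)
filtered_bad = owned_documents(stored, inconsistent_table)
```

With the consistent table the orphan is filtered out; with the inconsistent one it is returned to the reader, which is the kind of erroneous result the fix is meant to prevent.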
| Comment by Dianna Hohensee (Inactive) [ 25/Sep/17 ] |
|
To be clear, it seems like this scenario will still be pretty broken, because writes (local or majority) could go to one side of a split, and reads (local or majority) on the other side then won't see the changes. Is that alright, schwerin? |