[SERVER-30957] Causal consistency can be broken by migration Created: 05/Sep/17 Updated: 27/Oct/23 Resolved: 27/Sep/17 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Sharding |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Misha Tyulenev | Assignee: | Misha Tyulenev |
| Resolution: | Gone away | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||
| Backwards Compatibility: | Fully Compatible | ||||||||
| Operating System: | ALL | ||||||||
| Participants: | |||||||||
| Description |
|
The configuration:
The scenario:
Then the balancer initiates migration of data from S1 to S2 that includes the values written by the write.
The client issues a read with (afterClusterTime = T1, readPreference=secondary).
In this case the read will return a result that is not causally consistent with the write, because it will not contain the written values.
One possible solution is to modify the afterClusterTime on the router if the requested time is less than the time of the routing information change for the requested data.
Implementation details:
2. Inspect all incoming messages that have afterClusterTime and, if there is a chance that the requested data has been moved, update afterClusterTime to the operationTime of the routing metadata refresh. |
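The router-side adjustment proposed in the description can be sketched roughly as below. This is a minimal illustration only, not server code: `ClusterTime`, `effective_after_cluster_time`, and the `routing_refresh_time` parameter are hypothetical names, and the sketch assumes the router records the operationTime of its last routing metadata refresh for the targeted data.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True, order=True)
class ClusterTime:
    """Toy logical timestamp (hypothetical): compared as (secs, inc)."""
    secs: int
    inc: int = 0


def effective_after_cluster_time(
    requested: ClusterTime,
    routing_refresh_time: Optional[ClusterTime],
) -> ClusterTime:
    """Return the afterClusterTime the router would forward to the shard.

    routing_refresh_time is assumed to be the operationTime of the last
    routing metadata refresh for the requested data, or None if the router
    has not observed one.
    """
    if routing_refresh_time is not None and requested < routing_refresh_time:
        # The data may have moved after the requested time, so the read
        # should wait for the later refresh time instead.
        return routing_refresh_time
    # The requested time is already at or past the routing change.
    return requested


# Example: a read at T1 issued after a later routing refresh is bumped forward.
t1 = ClusterTime(100, 1)
refresh = ClusterTime(105, 0)
forwarded = effective_after_cluster_time(t1, refresh)
```

Taking the later of the two times means a request whose afterClusterTime is already newer than the routing change is forwarded unchanged.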
| Comments |
| Comment by Dianna Hohensee (Inactive) [ 27/Sep/17 ] |
|
I created |
| Comment by Dianna Hohensee (Inactive) [ 26/Sep/17 ] |
|
Reading the ticket summary, the scenario described is indeed safe because the shard version protocol extends to secondaries. I had a different scenario in mind for causal consistency being broken than the one you're describing, however, so I'm going to have to revisit my notes/scribbles. |
| Comment by Esha Maharishi (Inactive) [ 26/Sep/17 ] |
|
I believe misha.tyulenev's comment is correct: if the read is versioned, the secondary will wait until it has received the routing table updates that correspond to the fresh version, at which point it must have already received the data that corresponds to the fresh version. I think there is only one known unversioned read, which is geoNear. The safe_secondary_reads_drop_recreate.js test specifies what kind of behavior (versioned, unversioned, or unsharded only) each read command has. |
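The argument in the comment above can be illustrated with a toy model. It is a sketch under stated assumptions, not the server's implementation: `ToySecondary`, `StaleSecondary`, and the oplog entry shapes are hypothetical, and the model assumes the secondary applies entries in order with the migrated data preceding the routing table update. Where the comment says the secondary waits, the sketch raises instead so that it stays single-threaded.

```python
class StaleSecondary(Exception):
    """Raised (in this toy model) where a real versioned read would wait."""


class ToySecondary:
    def __init__(self):
        self.docs = {}          # replicated documents
        self.shard_version = 0  # latest routing table version applied

    def apply(self, entry):
        """Apply one replicated entry; entries must be applied in order."""
        kind, payload = entry
        if kind == "insert":
            key, value = payload
            self.docs[key] = value
        elif kind == "routing":
            self.shard_version = payload

    def versioned_read(self, key, sent_version):
        """Serve the read only once the routing version has been applied."""
        if self.shard_version < sent_version:
            raise StaleSecondary(
                f"have version {self.shard_version}, need {sent_version}"
            )
        return self.docs.get(key)


# Assumed replication order: migrated data first, then the routing update.
oplog = [("insert", ("x", 1)), ("routing", 2)]
secondary = ToySecondary()
for entry in oplog:
    secondary.apply(entry)
value = secondary.versioned_read("x", sent_version=2)
```

Because the routing update is applied after the data in this model, any read that passes the version check also sees the migrated document. An unversioned read skips the check and has no such guarantee.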
| Comment by Misha Tyulenev [ 26/Sep/17 ] |
|
An update based on the offline discussion with redbeard0531, dianna.hohensee, and esha.maharishi