[SERVER-12151] Improve slaveDelay behavior during transition from initial sync to steady state Created: 17/Dec/13 Updated: 06/Dec/22 Resolved: 03/May/18 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Replication |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Matt Dannenberg | Assignee: | Backlog - Replication Team |
| Resolution: | Done | Votes: | 0 |
| Labels: | PM248 |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Assigned Teams: | Replication |
| Description |
|
After an initial sync completes, a node transitions to a readable state (secondary, primary, etc.). That transition does not observe slaveDelay, which may confuse users who query the node expecting its data to be no newer than the configured delay. Because initial sync copies live data, which may be newer than the slaveDelay cutoff, the node cannot reach a consistent and valid state without including current data. |
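To make the gap concrete, here is a minimal sketch of the arithmetic, not server code. The function name `delay_shortfall` and the sample timestamps are assumptions made for illustration. It measures how much newer a node's last applied operation is than slaveDelay would normally permit.

```python
from datetime import datetime, timedelta, timezone


def delay_shortfall(last_applied, now, slave_delay):
    """Return how much newer the node's data is than slaveDelay permits.

    A zero timedelta means the node is at least slave_delay behind.
    (Hypothetical helper for illustration; not part of any MongoDB API.)
    """
    newest_allowed = now - slave_delay
    return max(last_applied - newest_allowed, timedelta(0))


# Illustrative values: a one-hour slaveDelay and a node that has just
# finished initial sync, so its data is only a few seconds old.
now = datetime(2013, 12, 17, 12, 0, tzinfo=timezone.utc)
slave_delay = timedelta(hours=1)
last_applied = now - timedelta(seconds=5)

shortfall = delay_shortfall(last_applied, now, slave_delay)
print(shortfall)  # 0:59:55
```

Under these assumed values the node becomes readable while its data is almost an hour newer than a user relying on slaveDelay would expect.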
| Comments |
| Comment by Spencer Brody (Inactive) [ 03/May/18 ] |
|
The goal of initial sync is to get you joined into the cluster at a consistent state. I think if the moment you come out of initial sync you are more caught up than your slaveDelay, that's fine. The point of slaveDelay is to have a node to query against in the case of user error corrupting the primary, but the slaveDelayed node is useless for that if it's not consistent. I think becoming secondary and only then delaying oplog application is the desired behavior. |
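The behavior of becoming secondary first and only then delaying oplog application could be sketched as below. This is a simplified model under stated assumptions, not the server's implementation: the function `entries_ready_to_apply`, the dictionary layout, and the use of `datetime` values in place of real oplog timestamps are all hypothetical.

```python
from datetime import datetime, timedelta, timezone


def entries_ready_to_apply(oplog_entries, now, slave_delay):
    """Return the entries old enough to apply on a delayed node.

    An entry qualifies once its timestamp is at least slave_delay in the
    past. (Hypothetical helper; entries are plain dicts with a "ts" key.)
    """
    cutoff = now - slave_delay
    return [entry for entry in oplog_entries if entry["ts"] <= cutoff]


now = datetime(2018, 5, 3, 12, 0, tzinfo=timezone.utc)
slave_delay = timedelta(hours=1)

# Entries fetched after the node finished initial sync and became secondary.
pending = [
    {"name": "old", "ts": now - timedelta(hours=2)},
    {"name": "recent", "ts": now - timedelta(minutes=30)},
    {"name": "current", "ts": now},
]

ready = entries_ready_to_apply(pending, now, slave_delay)
print([entry["name"] for entry in ready])  # ['old']
```

In this model a node that leaves initial sync nearly caught up applies nothing further until the cutoff advances past its next pending entry, so it falls back to the configured delay over time while remaining consistent throughout.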