[SERVER-23892] Do periodic replicated writes every 10 seconds while idle (for maxStalenessMS) Created: 24/Apr/16 Updated: 03/Mar/21 Resolved: 27/Aug/16 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Replication |
| Affects Version/s: | None |
| Fix Version/s: | 3.3.12 |
| Type: | New Feature | Priority: | Major - P3 |
| Reporter: | A. Jesse Jiryu Davis | Assignee: | Misha Tyulenev |
| Resolution: | Done | Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: |
| Backwards Compatibility: | Fully Compatible |
| Sprint: | Sharding 18 (08/05/16), Sharding 2016-08-29, Sharding 2016-09-19 |
| Participants: |
| Case: | (copied to CRM) |
| Linked BF Score: | 0 |
| Description |
|
The upcoming maxStalenessMS read preference requires a periodic replicated write, so that there is always a recent oplog entry to move forward the OpTime reported to clients. These no-op 'n' writes should be done only on the primary, and only when no other replicated writes have been done in the timeout period. As a side effect, an idle replica set will no longer be idle with respect to writes, since it will internally force a write periodically. This new behavior (which also exists in master-slave replication) will show up in things like the calculation of "replication lag", which will never exceed a second or two on caught-up nodes. Previously, after a few minutes without writes, the first new write would show a lag of that whole period until it was replicated. |
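To make the lag effect concrete, here is a minimal sketch in Python. It is an illustration only, not server code: the function names, the simplified model (lag as the difference between the newest oplog entry on each node) and the example times are all assumptions. In this model the momentary gap seen when a write arrives after an idle period is bounded by the no-op interval instead of by the length of the idle period.

```python
# Hypothetical illustration; times are seconds on an arbitrary clock.
NOOP_INTERVAL_S = 10  # idle write period proposed in this ticket


def apparent_lag(primary_last_optime, secondary_last_optime):
    """Lag modeled as the gap between the newest oplog entry on each node."""
    return primary_last_optime - secondary_last_optime


def newest_entry_before(write_time, last_real_write, noops):
    """Newest oplog entry a caught-up secondary holds just before the write
    at `write_time` is replicated, given the time of the previous real write."""
    if not noops:
        # Nothing was logged while idle, so the secondary still sits at the
        # previous real write.
        return last_real_write
    # With periodic no-ops, an entry is appended every NOOP_INTERVAL_S seconds
    # of idleness, counted from the previous real write.
    idle = write_time - last_real_write
    return last_real_write + (idle // NOOP_INTERVAL_S) * NOOP_INTERVAL_S


# Primary idle since t=0, then a real write arrives at t=305.
lag_without_noops = apparent_lag(305, newest_entry_before(305, 0, noops=False))
lag_with_noops = apparent_lag(305, newest_entry_before(305, 0, noops=True))
print(lag_without_noops, lag_with_noops)
```

Real lag reporting also depends on how quickly secondaries fetch and apply entries, which this sketch ignores.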
| Comments |
| Comment by Githook User [ 27/Aug/16 ] |
|
Author: Misha Tyulenev (username: mikety, email: misha@mongodb.com) Message: |
| Comment by A. Jesse Jiryu Davis [ 22/Jul/16 ] |
|
The original justification for this feature is here: |
| Comment by Misha Tyulenev [ 22/Jul/16 ] |
|
Here are my thoughts on why we should not implement periodic writes: if there are no writes to the primary, the staleness estimation will still be as accurate as with periodic writes. On the other hand, it's possible to imagine a scenario in which a secondary is stale but the staleness is not detectable because there are no writes. I think this is a corner case that will not happen in production. Also, the first primary write would help detect this issue, and therefore all subsequent requests will satisfy maxStaleness. |
| Comment by A. Jesse Jiryu Davis [ 02/May/16 ] |
|
Sounds good to me. The default client heartbeat is 10 seconds, so writing a no-op every 10 seconds won't harm the precision of staleness calculations. |
| Comment by Eric Milkie [ 02/May/16 ] |
|
I think doing one write every 10 seconds during idle periods is a good compromise between resource consumption and idle time "spikes". |
| Comment by Eric Milkie [ 25/Apr/16 ] |
|
Also, it should not log a no-op if anything was logged in the previous period. That is, only primaries that are idle with respect to writes should be writing no-ops.
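
The rule discussed in this ticket (write a no-op only on the primary, and only if nothing was logged in the previous period) can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the server implementation: `IdleNoopWriter`, `record_write` and `tick` are hypothetical names, and the 10-second period is the value proposed in the comments.

```python
from dataclasses import dataclass

NOOP_PERIOD_S = 10.0  # period suggested in the comments on this ticket


@dataclass
class IdleNoopWriter:
    """Hypothetical helper that decides when an idle primary logs a no-op."""

    is_primary: bool
    last_write_time: float  # time of the newest replicated write (real or no-op)
    period_s: float = NOOP_PERIOD_S

    def record_write(self, now: float) -> None:
        """Note that a replicated write was logged at time `now`."""
        self.last_write_time = now

    def tick(self, now: float) -> bool:
        """Periodic check; returns True if a no-op was logged at `now`."""
        if not self.is_primary:
            return False  # secondaries never originate no-ops
        if now - self.last_write_time < self.period_s:
            return False  # something was logged within the period
        self.record_write(now)  # stands in for appending an 'n' oplog entry
        return True
```

For example, a primary whose last write was at t=0 skips the check at t=5, logs a no-op at t=10, and, after a real write at t=12, stays quiet until at least t=22.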