[SERVER-67434] Improve Sync Source Selection with Chained Replication and Flow Control Created: 22/Jun/22 Updated: 01/Nov/23 |
|
| Status: | Open |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | 5.2.0, 5.3.1, 5.2.1, 5.0.9, 4.4.15, 4.2.21 |
| Fix Version/s: | None |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Diego Rodriguez (Inactive) | Assignee: | Backlog - Replication Team |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | former-quick-wins, replication | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Assigned Teams: | Replication |
| Participants: | |
| Case: | (copied to CRM) |
| Description |
|
Hi Team, Starting in MongoDB v4.2, the Flow Control mechanism was introduced to limit the rate at which the primary applies its writes, with the goal of keeping the majority-committed lag under a configurable maximum, flowControlTargetLagSeconds. Separately, whenever replication chaining is enabled, a secondary changes its sync source if the sync source's most recent OpTime is more than maxSyncSourceLagSecs seconds behind another member's latest oplog entry. This ensures that the sync source is not too far behind other nodes in the set. maxSyncSourceLagSecs is a server parameter with a default value of 30 seconds. The problem is that this default is three times larger than the 10-second default of flowControlTargetLagSeconds. As a result, the primary can be throttled by Flow Control just because a single secondary lags behind, when enough other secondaries to make up a majority replicate from that lagging secondary. Imagine the following scenario:
If MongoDB were to consider the interplay between maxSyncSourceLagSecs and flowControlTargetLagSeconds in environments with chained replication enabled, and re-evaluate a secondary's sync source before the lag reaches flowControlTargetLagSeconds (or maybe shortly after?), then situations like the above would be avoided. Some options I thought of:
Regards |
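To make the gap between the two defaults concrete, here is a minimal Python sketch. It is not server code: the function name `classify_sync_source_lag` and the three labels are hypothetical, and it assumes a simplified model in which flowControlTargetLagSeconds acts as a hard threshold. Only the default values (30 and 10 seconds) come from the description above.

```python
# Defaults quoted in the ticket description.
MAX_SYNC_SOURCE_LAG_SECS = 30
FLOW_CONTROL_TARGET_LAG_SECONDS = 10


def classify_sync_source_lag(
    lag_secs,
    max_sync_source_lag_secs=MAX_SYNC_SOURCE_LAG_SECS,
    flow_control_target_lag_secs=FLOW_CONTROL_TARGET_LAG_SECONDS,
):
    """Hypothetical helper: classify how far a sync source is behind.

    Simplified model (an assumption): a chained secondary only drops its
    sync source once the lag exceeds max_sync_source_lag_secs, while lag
    above flow_control_target_lag_secs is already enough to make Flow
    Control matter if a majority of nodes sit behind that sync source.
    """
    if lag_secs > max_sync_source_lag_secs:
        return "resync"  # sync source would be re-evaluated
    if lag_secs > flow_control_target_lag_secs:
        return "gap"     # past the flow control target, sync source kept
    return "ok"          # under both thresholds


if __name__ == "__main__":
    for lag in (5, 20, 35):
        print(lag, classify_sync_source_lag(lag))
```

With the defaults, any lag between 10 and 30 seconds falls into the "gap" case, which is the window this ticket is about.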
| Comments |
| Comment by Diego Rodriguez (Inactive) [ 22/Aug/22 ] |
|
Hi daniel.gottlieb@mongodb.com, The disadvantage I see with that approach is that we only act once the problem is already there: a majority of the nodes are lagging, and flow control is already engaged and throttling writes. By propagating the flow control configuration, you can avoid engaging flow control at all in scenarios like the one above, by telling the secondaries to re-evaluate their sync source when the lag against the source is getting close to flowControlTargetLagSeconds. |
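The rule proposed in this comment could look roughly like the following Python sketch. Everything here is an assumption for illustration: the function names, the `safety_margin` factor, and the idea that a secondary receives the primary's flowControlTargetLagSeconds value are hypothetical, not existing server behaviour.

```python
def effective_resync_threshold(
    max_sync_source_lag_secs,
    flow_control_target_lag_secs=None,
    safety_margin=0.8,
):
    """Hypothetical: lag (in seconds) beyond which a chained secondary
    would re-evaluate its sync source.

    If the primary's flow control target has been propagated, use a
    fraction of it (safety_margin is an assumed tuning knob) so the
    secondary acts before the target is reached. Otherwise fall back to
    maxSyncSourceLagSecs alone.
    """
    if flow_control_target_lag_secs is None:
        return float(max_sync_source_lag_secs)
    return min(
        float(max_sync_source_lag_secs),
        flow_control_target_lag_secs * safety_margin,
    )


def should_reevaluate(
    source_lag_secs,
    max_sync_source_lag_secs=30,
    flow_control_target_lag_secs=None,
):
    """True if the sync source lags more than the effective threshold."""
    threshold = effective_resync_threshold(
        max_sync_source_lag_secs, flow_control_target_lag_secs
    )
    return source_lag_secs > threshold


if __name__ == "__main__":
    # Without the propagated target, a 9-second lag is tolerated.
    print(should_reevaluate(9))
    # With the default target of 10 seconds propagated, it is not.
    print(should_reevaluate(9, flow_control_target_lag_secs=10))
```

Taking the minimum of the two values keeps the existing maxSyncSourceLagSecs behaviour as an upper bound, so the propagated value can only make a secondary react earlier, never later.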
| Comment by Daniel Gottlieb (Inactive) [ 27/Jun/22 ] |
|
Maybe a simpler alternative to having the primary propagate its flow control configuration is to have it propagate its state instead, i.e., "I am currently throttling due to flow control". That information could then be used to hint to chained secondaries that they should change their sync source. |
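A minimal Python sketch of this state-based hint, under stated assumptions: `PrimaryHint`, its `flow_control_throttling` field, and `should_change_sync_source` are hypothetical names, and how the hint would actually reach secondaries is left open.

```python
from dataclasses import dataclass


@dataclass
class PrimaryHint:
    """Hypothetical state the primary would propagate to secondaries."""
    flow_control_throttling: bool


def should_change_sync_source(
    hint,
    syncing_from_primary,
    source_lag_secs,
    max_sync_source_lag_secs=30,
):
    """Hypothetical decision rule for a secondary.

    The existing maxSyncSourceLagSecs check still applies. In addition,
    a chained secondary (one not syncing from the primary) whose sync
    source is behind at all treats the throttling hint as a reason to
    pick a new sync source.
    """
    if source_lag_secs > max_sync_source_lag_secs:
        return True
    if not hint.flow_control_throttling:
        return False
    return (not syncing_from_primary) and source_lag_secs > 0


if __name__ == "__main__":
    throttling = PrimaryHint(flow_control_throttling=True)
    # Chained secondary, source 15 s behind, primary is throttling.
    print(should_change_sync_source(throttling, False, 15))
```

As the follow-up comment points out, a rule like this only reacts after flow control has already engaged.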