[SERVER-19816] Oplog replication throughput is bounded by network latency Created: 07/Aug/15 Updated: 06/Dec/22 Resolved: 03/Jan/20 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Replication |
| Affects Version/s: | 2.6.11, 3.0.5 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Andrew Ryder (Inactive) | Assignee: | Backlog - Replication Team |
| Resolution: | Done | Votes: | 3 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
| Assigned Teams: |
Replication
|
| Operating System: | ALL | ||||||||||||||||
| Steps To Reproduce: | This probably depends on getting exhaust support for both find/getmore commands and making it available to replication |
| Participants: | |||||||||||||||||
| Case: | (copied to CRM) | ||||||||||||||||
| Description |
|
Assuming a constant stream of operations totalling 50MB/s of data (e.g. insertions) to a replica set, any Secondary that has more than ~40ms one-way network latency (~80ms ping) to its next nearest member will be unable to keep up, regardless of the network bandwidth between them. If the ping time between any two members of a replica set is known, the upper limit for replication throughput in MB/s between those two members can be calculated with the following equation:
max throughput (MB/s) = 4MB / ping time (s)
The limit arises because the tailable cursor retrieves at most 4MB per roundtrip, and the request/reply exchange is serial: each request waits for the previous reply. Thus, the longer a roundtrip (ping) takes, the lower the throughput. For example, an 80ms ping gives 4MB / 0.08s = 50MB/s. |
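The bound described above can be sketched in a few lines of Python, assuming exactly one batch of at most 4MB is fetched per serial roundtrip and ignoring transfer and processing time. The function name and the `batch_mb` parameter are hypothetical, chosen for illustration; they are not part of any MongoDB API.

```python
def max_oplog_throughput_mb_per_s(ping_ms: float, batch_mb: float = 4.0) -> float:
    """Upper bound on throughput when one batch is fetched per serial roundtrip.

    ping_ms  -- roundtrip (ping) time between the two members, in milliseconds
    batch_mb -- assumed maximum batch size per roundtrip, in MB (4MB per the description)
    """
    if ping_ms <= 0:
        raise ValueError("ping_ms must be positive")
    # One batch per roundtrip: throughput = batch size / roundtrip time in seconds.
    return batch_mb / (ping_ms / 1000.0)


if __name__ == "__main__":
    # Show how the bound shrinks as the roundtrip time grows.
    for ping_ms in (1, 10, 40, 80, 160):
        bound = max_oplog_throughput_mb_per_s(ping_ms)
        print(f"ping {ping_ms:>3} ms -> at most {bound:.1f} MB/s")
```

Under these assumptions the bound scales linearly with the batch size, so passing a larger `batch_mb` raises it proportionally at the same ping time.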
| Comments |
| Comment by Judah Schvimer [ 03/Jan/20 ] |
|
This is being fixed by (PM-1232) Use Exhaust Cursors for Oplog Fetching. |
| Comment by Eric Milkie [ 21/Dec/15 ] |
|
We're planning on raising the limit to 16MB soon, which will greatly improve things – |