[SERVER-6672] slaveDelay Setting Causes Replica Ops to be Applied in Batches at approximately the slaveDelay Interval Created: 31/Jul/12 Updated: 11/Jul/16 Resolved: 27/Aug/12 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Replication |
| Affects Version/s: | 2.2.0-rc0 |
| Fix Version/s: | 2.2.1, 2.3.0 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Adam Comerford | Assignee: | Randolph Tan |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Environment: |
Linux, 2.2.0-rc0 - Sharded, 11 clusters |
||
| Operating System: | ALL | ||||
| Participants: | |||||
| Description |
|
When a slave delay is specified ("slaveDelay": 7200 in the original case), the replication ops are applied in batches at intervals of approximately 7200 seconds. As a result, there are massive write spikes (inserts/updates), lock percentage spikes, disk I/O spikes, and DR102 errors. |
| Comments |
| Comment by auto [ 12/Sep/12 ] |
|
Author: Randolph Tan <randolph@10gen.com>
Date: 2012-08-23T13:45:30-07:00
Message: Fix for Added logic in the oplog application batching algorithm to end the batch early if we see an op that is too new to be applied with respect to the slaveDelay. |
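The batching change described in this commit message can be illustrated with a minimal sketch. This is not the actual server code: `OplogEntry`, `next_batch`, the `ts` field, and `max_batch` are hypothetical names chosen for illustration, and the sketch assumes ops arrive ordered by timestamp.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class OplogEntry:
    ts: float  # time the op was written on the primary, in seconds (hypothetical field)
    op: str    # label standing in for the operation itself


def next_batch(pending: List[OplogEntry], now: float, slave_delay: float,
               max_batch: int = 1000) -> List[OplogEntry]:
    """Return the leading run of ops that are old enough to apply at `now`."""
    batch: List[OplogEntry] = []
    for entry in pending:
        if len(batch) >= max_batch:
            break
        # End the batch early: this op is too new with respect to
        # slave_delay, and so is every later op in a time-ordered oplog.
        if entry.ts > now - slave_delay:
            break
        batch.append(entry)
    return batch


pending = [OplogEntry(ts=t, op=f"insert-{t}") for t in (0, 10, 20, 30)]
ready = next_batch(pending, now=7215, slave_delay=7200)
print([e.op for e in ready])  # ['insert-0', 'insert-10']
```

With a cutoff of `now - slave_delay`, only the ops that have aged past the delay are applied, and the remaining ops wait for a later pass instead of holding up the whole batch.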
| Comment by auto [ 27/Aug/12 ] |
|
Author: Randolph Tan <randolph@10gen.com>
Date: 2012-08-23T13:45:30-07:00
Message: Fix for Added logic in the oplog application batching algorithm to end the batch early if we see an op that is too new to be applied with respect to the slaveDelay. |
| Comment by Randolph Tan [ 23/Aug/12 ] |
|
Results from running the test scripts on my local machine before the fix: delay hovers around 26~47 sec.
Results from running the test scripts on my local machine after the fix: delay hovers around 12~19 sec.