[SERVER-23978] do not send updatePosition twice for durable ops Created: 28/Apr/16  Updated: 29/Apr/16  Resolved: 29/Apr/16

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Major - P3
Reporter: Eric Milkie Assignee: Eric Milkie
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Participants:

 Description   

We currently trigger two updatePositions when an op is journaled; the journalListener callback does one, and the applyBatchFinalizer does the other one once waitForDurable() returns. We should only do one of these.



 Comments   
Comment by Eric Milkie [ 29/Apr/16 ]

You're correct; I was seeing the keepalive timer firing.

Comment by Scott Hernandez (Inactive) [ 28/Apr/16 ]

There should be no harm or double notification, because they both call setMyLastDurableOpTimeForward, which updates and reports progress only if the optime is newer.

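void ReplicationCoordinatorImpl::setMyLastDurableOpTimeForward(const OpTime& opTime) {
    stdx::unique_lock<stdx::mutex> lock(_mutex);
    // Only move forward: an equal or older optime is ignored, so a second
    // call with the same optime neither updates nor reports.
    if (opTime > _getMyLastDurableOpTime_inlock()) {
        _setMyLastDurableOpTime_inlock(opTime, false);
        // Ownership of the lock is handed to the reporting call.
        _reportUpstream_inlock(std::move(lock));
    }
}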
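
To illustrate why a second call with the same optime is harmless, here is a minimal standalone sketch of the forward-only guard. It is not the server code: `OpTime` is reduced to an integer, and `DurableTracker`, `setLastDurableForward`, and `reportsAfterDoubleCall` are hypothetical names invented for this example; appending to a vector stands in for reporting upstream.

```cpp
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <vector>

// Hypothetical stand-in for the server's OpTime: a plain comparable counter.
using OpTime = std::uint64_t;

// Hypothetical, simplified tracker; not the real ReplicationCoordinatorImpl.
class DurableTracker {
public:
    // Advances the durable optime only if opTime is strictly newer.
    // Returns true when a progress report was recorded.
    bool setLastDurableForward(OpTime opTime) {
        std::lock_guard<std::mutex> lock(_mutex);
        if (opTime > _lastDurable) {
            _lastDurable = opTime;
            _reports.push_back(opTime);  // stands in for reporting upstream
            return true;
        }
        return false;  // equal or older optime: no update, no report
    }

    // Returns a copy of every optime that triggered a report.
    std::vector<OpTime> reports() const {
        std::lock_guard<std::mutex> lock(_mutex);
        return _reports;
    }

private:
    mutable std::mutex _mutex;
    OpTime _lastDurable = 0;
    std::vector<OpTime> _reports;
};

// Simulates both code paths firing for the same journaled op and
// returns how many reports were recorded.
inline std::size_t reportsAfterDoubleCall(OpTime opTime) {
    DurableTracker tracker;
    tracker.setLastDurableForward(opTime);  // journal listener path
    tracker.setLastDurableForward(opTime);  // batch finalizer path
    return tracker.reports().size();
}
```

Under these assumptions, calling `reportsAfterDoubleCall` with any optime greater than zero records one report, because the second call fails the strictly-newer check.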

If we did want to call only once, we could remove the call from the applyBatchFinalizer, since it applies only to secondaries, whereas the journalListener callback happens at the storage engine level for all nodes.

Maybe we are seeing something else going on? For example, the applied and durable optimes being updated separately (which is expected) but in quick succession, or some other issue with sending progress, such as the timer.
