[SERVER-30217] applyOps doesn't wait for replication on the last op if it's a noop Created: 18/Jul/17  Updated: 06/Dec/22

Status: Backlog
Project: Core Server
Component/s: Replication
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Judah Schvimer Assignee: Backlog - Replication Team
Resolution: Unresolved Votes: 0
Labels: former-quick-wins, gm-ack, neweng, writeconcern
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
related to SERVER-39364 Audit uses of setLastOpToSystemLastOp... Closed
Assigned Teams:
Replication
Operating System: ALL
Participants:

 Description   

When a write concern is provided to the applyOps command, we normally wait on the OpTime of whichever operation successfully completed last. This is incorrect, however, if the last operation in the array is a no-op write and thus isn’t assigned an OpTime. Let write A be the second-to-last operation in the applyOps array and write B be the last. Suppose B is a no-op write, and let C be the operation that caused B to be a no-op. If C has an OpTime ahead of A’s, then we won’t wait for C to be replicated, and C could be rolled back even though B was acknowledged. To fix this, we should wait for replication of the node’s last applied OpTime if the last write operation was a no-op write.
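
The choice of wait target described above can be sketched as follows. This is only an illustration under stated assumptions, not the server's code: OpTime, AppliedOp, currentWaitTarget, and proposedWaitTarget are hypothetical stand-ins invented for this example, and treating an empty ops array like the no-op case is also an assumption.

```cpp
#include <cstdint>
#include <optional>
#include <tuple>
#include <vector>

// Hypothetical stand-in for an oplog position, ordered by (term, timestamp).
struct OpTime {
    std::int64_t term;
    std::uint64_t timestamp;
};

inline bool operator==(const OpTime& a, const OpTime& b) {
    return std::tie(a.term, a.timestamp) == std::tie(b.term, b.timestamp);
}

inline bool operator<(const OpTime& a, const OpTime& b) {
    return std::tie(a.term, a.timestamp) < std::tie(b.term, b.timestamp);
}

// Hypothetical result of applying one applyOps entry. A no-op write is
// modeled as an entry that was not assigned an OpTime.
struct AppliedOp {
    std::optional<OpTime> opTime;
};

// Behavior described in the ticket: wait on the OpTime of the last operation
// that was actually assigned one (write A when the final write B is a no-op).
std::optional<OpTime> currentWaitTarget(const std::vector<AppliedOp>& ops) {
    std::optional<OpTime> target;
    for (const auto& op : ops) {
        if (op.opTime) {
            target = op.opTime;
        }
    }
    return target;
}

// Proposed behavior: if the final write was a no-op, wait on the node's last
// applied OpTime instead, which is at least as late as write C.
// Assumption: an empty ops array is handled the same way as the no-op case.
OpTime proposedWaitTarget(const std::vector<AppliedOp>& ops, const OpTime& lastApplied) {
    if (!ops.empty() && ops.back().opTime) {
        return *ops.back().opTime;
    }
    return lastApplied;
}
```

With A at OpTime (1, 10), B a no-op, and C at (1, 12) as the node's last applied OpTime, currentWaitTarget yields A's OpTime, so C is never waited on, while proposedWaitTarget yields C's OpTime.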



 Comments   
Comment by Gregory McKeon (Inactive) [ 19/Jun/18 ]

If we fix any applyOps correctness bugs, we want to fix this one.

Comment by Chibuikem Amaechi [ 01/Jan/18 ]

Still wrapping my head around this, but if this issue is only related to the non-atomic form of applyOps, which I suspect is _applyOps() in src/mongo/db/repl/apply_ops.cpp, then I suppose the first step in resolving this issue would be to prevent _applyOps() from ignoring no-op write operations by removing the following fragment of code:

const char* opType = opObj["op"].valuestrsafe();
if (*opType == 'n')
    continue;

I would then proceed cautiously by adding the following block to the lambda expression passed to writeConflictRetry():

{
    repl::UnreplicatedWritesBlock uwb(opCtx);
    uassertStatusOK(_applyOps(opCtx,
                              dbName,
                              applyOpCmd,
                              oplogApplicationMode,
                              &result,
                              &numApplied,
                              opsBuilder.get()));
}

I believe the first line of code in the above block would suppress replication for non-atomic operations until the last successfully completed operation in the array. In other words, it would wait for replication of the last op, even if it's a no-op write.

Not sure if any of this even makes sense, but this is as far as I've gotten.

Please share your thoughts!

Comment by Spencer Brody (Inactive) [ 14/Dec/17 ]

This only applies to the non-atomic form of applyOps.
