Currently the operationTime computed in Command::run() is a pessimization: it returns the current clusterTime as the operationTime, which may make the client's subsequent requests that set readAfterClusterTime to this value wait when they should not have to.
The correct implementation should return an operationTime that is specific to the operation type and its read/write concern.
for the write operation, operationTime should be:
repl::ReplClientInfo::forClient(opCtx->getClient()).getLastOp().getTimestamp();
read with readConcern level majority: operationTime is the committed LogicalTime_LOG.
auto replCoord = ReplicationCoordinator::get(opCtx); auto lastCommittedOpTime = replCoord->getLastCommittedOpTime();
read with readConcern level local: operationTime is the local LogicalTime_LOG.
auto replCoord = ReplicationCoordinator::get(opCtx); auto lastAppliedOpTime = replCoord->getMyLastAppliedOpTime();
The last question is how to distinguish a "read" from a "write" operation:
if ReplClientInfo's getLastOp() returns the same value before and after the command is run, the operation is considered a "read"; otherwise it is a "write".
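
A minimal sketch of this check, assuming simplified stand-in types: OpTime and ClientInfo below are hypothetical placeholders for repl::OpTime and repl::ReplClientInfo, and isWrite is an illustrative helper name, not an existing server function.

```cpp
#include <cstdint>

// Hypothetical stand-in for repl::OpTime; only the fields needed here.
struct OpTime {
    std::uint64_t timestamp;
    std::int64_t term;

    OpTime(std::uint64_t ts = 0, std::int64_t t = 0) : timestamp(ts), term(t) {}

    bool operator==(const OpTime& other) const {
        return timestamp == other.timestamp && term == other.term;
    }
    bool operator!=(const OpTime& other) const {
        return !(*this == other);
    }
};

// Hypothetical stand-in for repl::ReplClientInfo: tracks the last op
// written on behalf of this client.
class ClientInfo {
public:
    OpTime getLastOp() const { return _lastOp; }
    void setLastOp(const OpTime& op) { _lastOp = op; }

private:
    OpTime _lastOp;
};

// Illustrative helper: the command is treated as a write if the client's
// last op changed while the command ran, and as a read otherwise.
inline bool isWrite(const OpTime& lastOpBefore, const OpTime& lastOpAfter) {
    return lastOpBefore != lastOpAfter;
}
```

The caller would capture getLastOp() before running the command and compare it with the value read afterwards.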
The change in Command::run() should:
a) get the startOperationTime (this is already done)
b) extract the level from the read or write concern:
majority = readConcern:level majority or writeConcern:w majority
everything else is considered local. I think it's OK to count w:0 as local as well.
c) once the operation has finished, pass the computed level and the startOperationTime to the logic specified above.
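
Steps a) to c) could be sketched as follows. This is an illustration under stated assumptions, not the server implementation: Timestamp, ConcernLevel, ReplState, extractLevel and computeOperationTime are all hypothetical names, and startOperationTime is assumed to be the client's last op timestamp captured before the command ran.

```cpp
#include <cstdint>

// Hypothetical stand-in for a logical time value.
using Timestamp = std::uint64_t;

// Step b): only two levels matter for choosing the operationTime.
enum class ConcernLevel { kLocal, kMajority };

// Hypothetical snapshot of the values read after the command finished.
struct ReplState {
    Timestamp clientLastOp;   // ReplClientInfo getLastOp() timestamp
    Timestamp lastApplied;    // getMyLastAppliedOpTime() timestamp
    Timestamp lastCommitted;  // getLastCommittedOpTime() timestamp
};

// Step b): majority if either concern asks for majority; everything else
// (including w:0) is treated as local.
inline ConcernLevel extractLevel(bool readConcernIsMajority,
                                 bool writeConcernIsMajority) {
    return (readConcernIsMajority || writeConcernIsMajority)
        ? ConcernLevel::kMajority
        : ConcernLevel::kLocal;
}

// Step c): pick the operationTime once the command has finished.
inline Timestamp computeOperationTime(ConcernLevel level,
                                      Timestamp startOperationTime,
                                      const ReplState& state) {
    // The client's last op moved, so the command wrote: return that op's time.
    if (state.clientLastOp != startOperationTime) {
        return state.clientLastOp;
    }
    // Otherwise it was a read: use the committed or the applied time.
    return level == ConcernLevel::kMajority ? state.lastCommitted
                                            : state.lastApplied;
}
```

In this sketch, a read never returns a time later than what the node has applied (local) or committed (majority), so a follow-up request that passes the value as readAfterClusterTime should not have to wait on the same node.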