[SERVER-47695] Write commands run by threads that can survive rollback can fail operationTime invariant in ServiceEntryPoint Created: 22/Apr/20  Updated: 29/Oct/23  Resolved: 04/May/20

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: None
Fix Version/s: 4.0.19, 4.2.7, 3.6.19, 4.4.0-rc4, 4.7.0

Type: Bug Priority: Major - P3
Reporter: Cheahuychou Mao Assignee: Lingzhi Deng
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Depends
Related
related to SERVER-30842 Don't try to set last optime for clie... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested: v4.4, v4.2, v4.0, v3.6
Sprint: Repl 2020-05-18
Participants:
Linked BF Score: 8

 Description   

If a thread performs a write that gets rolled back during stepdown, its client can be left with a _lastOp whose timestamp is higher than the timestamp of the system's last opTime (this happens if the wall clock on the primary is behind the wall clock on the node the thread runs on). If, after the stepdown, the thread then sends a write command to its own node, the command fails the ReplicationCoordinator check when trying to write an oplog entry, but the NotMaster error is caught in this block in ServiceEntryPoint::runCommandImpl. Since the command is a noop, the client's lastOp is set to the system's last opTime. After the wait for writeConcern fails, the NotMaster error is propagated up, and the operation hits the invariant operationTime >= startOperations when trying to append operationTime to the response.
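The sequence described above can be modelled with a small Python sketch. This is not the server code: the OpTime class, the run_noop_write helper, and the concrete timestamps and terms are all hypothetical, and the sketch assumes that both the start value and the final operationTime are read from the timestamp of the client's lastOp.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class OpTime:
    """Hypothetical, simplified stand-in for a replication OpTime."""
    timestamp: int
    term: int


def run_noop_write(client_last_op: OpTime, system_last_op: OpTime) -> Tuple[int, int]:
    """Return (start time, operationTime) for a write that ends up as a noop."""
    # Assumption: the start value is read from the client's lastOp timestamp
    # before the command runs.
    start_time = client_last_op.timestamp
    # The write fails with NotMaster and is treated as a noop, so the
    # client's lastOp is moved to the system's last opTime.
    client_last_op = system_last_op
    # Assumption: the operationTime appended to the response is read from the
    # client's lastOp timestamp after the command.
    operation_time = client_last_op.timestamp
    return start_time, operation_time


# A write in term 1 that was later rolled back, made on a node whose wall
# clock was ahead of the clock that produced the system's last opTime.
rolled_back_write = OpTime(timestamp=100, term=1)
system_last_op = OpTime(timestamp=90, term=2)

start, op_time = run_noop_write(rolled_back_write, system_last_op)
invariant_holds = op_time >= start
print(invariant_holds)  # prints False
```

With these made-up values the response would carry an operationTime of 90 while the start value is 100, which is the condition the invariant rejects.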



 Comments   
Comment by Githook User [ 27/May/20 ]

Author: Lingzhi Deng <lingzhi.deng@mongodb.com> (username: ldennis)

Message: SERVER-47695: Don't set lastOp for client backwards in terms of timestamp after rollback

(cherry picked from commit bd579c0d3f2583c2af7dcd98c7f6cfc55009b406)
(cherry picked from commit bf3227e11dd689044ff4555c823c682899f41cf9)
Branch: v3.6
https://github.com/mongodb/mongo/commit/d7ad6f6c417e711218729f2faee7fc59a19a5e5d

Comment by Githook User [ 27/May/20 ]

Author: Lingzhi Deng <lingzhi.deng@mongodb.com> (username: ldennis)

Message: SERVER-47695: Don't set lastOp for client backwards in terms of timestamp after rollback

(cherry picked from commit bd579c0d3f2583c2af7dcd98c7f6cfc55009b406)
(cherry picked from commit bf3227e11dd689044ff4555c823c682899f41cf9)
Branch: v4.0
https://github.com/mongodb/mongo/commit/2794691b14a8dedc25b136a03fe2e89fa9c8fd6c

Comment by Githook User [ 06/May/20 ]

Author: Lingzhi Deng <lingzhi.deng@mongodb.com> (username: ldennis)

Message: SERVER-47695: Don't set lastOp for client backwards in terms of timestamp after rollback

(cherry picked from commit bd579c0d3f2583c2af7dcd98c7f6cfc55009b406)
Branch: v4.2
https://github.com/mongodb/mongo/commit/bf3227e11dd689044ff4555c823c682899f41cf9

Comment by Githook User [ 06/May/20 ]

Author: Lingzhi Deng <lingzhi.deng@mongodb.com> (username: ldennis)

Message: SERVER-47695: Don't set lastOp for client backwards in terms of timestamp after rollback

(cherry picked from commit bd579c0d3f2583c2af7dcd98c7f6cfc55009b406)
Branch: v4.4
https://github.com/mongodb/mongo/commit/5fdc82329bb92d7887e6b6b57725e52d668b1823

Comment by Githook User [ 04/May/20 ]

Author: Lingzhi Deng <lingzhi.deng@mongodb.com> (username: ldennis)

Message: SERVER-47695: Don't set lastOp for client backwards in terms of timestamp after rollback
Branch: master
https://github.com/mongodb/mongo/commit/bd579c0d3f2583c2af7dcd98c7f6cfc55009b406

Comment by Lingzhi Deng [ 22/Apr/20 ]

OK, so it seems the problem is that the invariant compares only the timestamp, not the full OpTime.
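A minimal Python sketch of why the two comparisons can disagree. It assumes an OpTime is ordered by term first and timestamp second, which is a simplification made for illustration; the class and both guard functions are hypothetical and are not the server's implementation. The second guard is only one possible reading of the commit message "Don't set lastOp for client backwards in terms of timestamp after rollback".

```python
from typing import NamedTuple


class OpTime(NamedTuple):
    """Hypothetical simplified OpTime; field order makes comparison term-first."""
    term: int
    timestamp: int


def advance_by_optime(current: OpTime, candidate: OpTime) -> OpTime:
    """Guard that only refuses to move backwards in full OpTime order."""
    return candidate if candidate >= current else current


def advance_by_optime_and_timestamp(current: OpTime, candidate: OpTime) -> OpTime:
    """Guard that also refuses to move backwards in timestamp alone."""
    if candidate >= current and candidate.timestamp >= current.timestamp:
        return candidate
    return current


client_last_op = OpTime(term=1, timestamp=100)  # rolled-back write
system_last_op = OpTime(term=2, timestamp=90)   # newer term, earlier timestamp

loose = advance_by_optime(client_last_op, system_last_op)
strict = advance_by_optime_and_timestamp(client_last_op, system_last_op)

print(loose.timestamp)   # prints 90: forward by OpTime, backwards by timestamp
print(strict.timestamp)  # prints 100: the timestamp never decreases
```

Under this model a guard that compares whole OpTimes accepts the update because the term increased, even though the timestamp decreased, so a later timestamp-only check can still fail.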

Comment by Lingzhi Deng [ 22/Apr/20 ]

"Since the command is a noop, the client's lastOp will be set to last system opTime."

But we already have logic in setLastOpToSystemLastOpTime that handles rollback and prevents the _lastOp from going backwards. So I don't see how it would hit the invariant. Can you elaborate?

Generated at Thu Feb 08 05:14:58 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.