[SERVER-47695] Write commands run by threads that can survive rollback can fail operationTime invariant in ServiceEntryPoint Created: 22/Apr/20 Updated: 29/Oct/23 Resolved: 04/May/20 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Replication |
| Affects Version/s: | None |
| Fix Version/s: | 4.0.19, 4.2.7, 3.6.19, 4.4.0-rc4, 4.7.0 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Cheahuychou Mao | Assignee: | Lingzhi Deng |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||||||||||
| Backwards Compatibility: | Fully Compatible | ||||||||||||||||
| Operating System: | ALL | ||||||||||||||||
| Backport Requested: |
v4.4, v4.2, v4.0, v3.6
|
||||||||||||||||
| Sprint: | Repl 2020-05-18 | ||||||||||||||||
| Participants: | |||||||||||||||||
| Linked BF Score: | 8 | ||||||||||||||||
| Description |
|
If a thread does a write that gets rolled back during stepdown, its client can have a _lastOp whose timestamp is higher than that of the system's last opTime (if the wallclock on the primary is behind the wallclock on the node the thread runs on). So if, after stepdown, the thread sends a write command to itself, the command will fail the ReplicationCoordinator check when trying to write an oplog entry, but the NotMaster error will get caught in this block in ServiceEntryPoint::runCommandImpl. Since the command is a noop, the client's lastOp will be set to the system's last opTime. So after the wait for writeConcern fails, the NotMaster error will get propagated up, and the operation will hit the invariant operationTime >= startOperations when trying to append operationTime to the response. |
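The sequence above can be sketched with a toy model. This is not the server code: the `OpTime` struct, the helper names, and the numeric values are all hypothetical. The sketch assumes that an OpTime orders by term first and then by timestamp, while the invariant looks at the timestamp alone. Under that assumption, a guard that only stops lastOp from going backwards in OpTime order still lets its timestamp go backwards.

```cpp
#include <cstdint>
#include <tuple>

// Hypothetical, simplified stand-in for a replication OpTime: (term, timestamp).
struct OpTime {
    std::int64_t term;
    std::uint64_t timestamp;
};

// Assumed ordering for this sketch: compare term first, then timestamp.
bool opTimeLess(const OpTime& a, const OpTime& b) {
    return std::tie(a.term, a.timestamp) < std::tie(b.term, b.timestamp);
}

// Hypothetical guard: move the client's lastOp to the system's last opTime
// only if that does not move it backwards in OpTime order.
OpTime advanceLastOp(const OpTime& clientLastOp, const OpTime& systemLastOpTime) {
    return opTimeLess(clientLastOp, systemLastOpTime) ? systemLastOpTime : clientLastOp;
}

// Timestamp-only check, mirroring the shape of the invariant in the description.
bool timestampInvariantHolds(const OpTime& operationTime, const OpTime& startOperationTime) {
    return operationTime.timestamp >= startOperationTime.timestamp;
}

// Walks through the scenario with made-up values and reports whether the
// timestamp-only check is violated.
bool reproducesFailure() {
    // Rolled-back write from the old term; the writer's wallclock was ahead.
    const OpTime rolledBackLastOp{1, 105};
    // System last opTime after rollback: newer term, but a lower timestamp.
    const OpTime systemLastOpTime{2, 100};

    // The noop command resets the client's lastOp. The guard allows it
    // because (2, 100) is greater than (1, 105) in OpTime order.
    const OpTime newLastOp = advanceLastOp(rolledBackLastOp, systemLastOpTime);

    // The timestamp went from 105 to 100, so the timestamp-only check fails.
    return newLastOp.timestamp == 100 &&
           !timestampInvariantHolds(newLastOp, rolledBackLastOp);
}
```

In this model `reproducesFailure()` returns true: the client's lastOp moves forward as an OpTime but backward as a timestamp, which is the condition the invariant rejects.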
| Comments |
| Comment by Githook User [ 27/May/20 ] |
|
Author: {'name': 'Lingzhi Deng', 'email': 'lingzhi.deng@mongodb.com', 'username': 'ldennis'}Message: (cherry picked from commit bd579c0d3f2583c2af7dcd98c7f6cfc55009b406) |
| Comment by Githook User [ 27/May/20 ] |
|
Author: {'name': 'Lingzhi Deng', 'email': 'lingzhi.deng@mongodb.com', 'username': 'ldennis'}Message: (cherry picked from commit bd579c0d3f2583c2af7dcd98c7f6cfc55009b406) |
| Comment by Githook User [ 06/May/20 ] |
|
Author: {'name': 'Lingzhi Deng', 'email': 'lingzhi.deng@mongodb.com', 'username': 'ldennis'}Message: (cherry picked from commit bd579c0d3f2583c2af7dcd98c7f6cfc55009b406) |
| Comment by Githook User [ 06/May/20 ] |
|
Author: {'name': 'Lingzhi Deng', 'email': 'lingzhi.deng@mongodb.com', 'username': 'ldennis'}Message: (cherry picked from commit bd579c0d3f2583c2af7dcd98c7f6cfc55009b406) |
| Comment by Githook User [ 04/May/20 ] |
|
Author: {'name': 'Lingzhi Deng', 'email': 'lingzhi.deng@mongodb.com', 'username': 'ldennis'}Message: |
| Comment by Lingzhi Deng [ 22/Apr/20 ] |
|
OK, so it seems the problem is that the invariant compares only the timestamp, not the full OpTime. |
| Comment by Lingzhi Deng [ 22/Apr/20 ] |
But we already have logic in setLastOpToSystemLastOpTime that handles rollback and prevents the _lastOp from going backwards. So I don't see how it would hit the invariant. Can you elaborate? |