[SERVER-51903] Avoid VectorClockPersistCommand deadlock upon fast replica state transitions Created: 30/Oct/20 Updated: 29/Oct/23 Resolved: 24/Nov/20 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Sharding |
| Affects Version/s: | None |
| Fix Version/s: | 4.9.0 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Pierlauro Sciarelli | Assignee: | Pierlauro Sciarelli |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | PM-1645-Milestone-3, sharding-wfbf-day | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||
| Backwards Compatibility: | Fully Compatible | ||||
| Operating System: | ALL | ||||
| Sprint: | Sharding 2020-11-02, Sharding 2020-11-16, Sharding 2020-11-30 | ||||
| Participants: | |||||
| Linked BF Score: | 16 | ||||
| Description |
|
When a secondary needs to persist the vector clock, waitForDurable calls _vectorClockPersist on the primary and blocks waiting for the reply. However, if that same node steps up to primary right before making the call, a deadlock occurs because waitForDurable ends up waiting on its own reply. Thanks to the timeout, the system does not hang forever, but the vector clock is not persisted and causal consistency is therefore not ensured. Moreover, VectorClockPersistCommand::supportsWriteConcern is currently true, meaning that the command must be invoked with a write concern field; it should instead be set to false because: |
| Comments |
| Comment by Githook User [ 24/Nov/20 ] |
|
Author: {'name': 'Pierlauro Sciarelli', 'email': 'pierlauro.sciarelli@mongodb.com', 'username': 'pierlauro'} Message: |
| Comment by Pierlauro Sciarelli [ 24/Nov/20 ] |
|
As there is currently no use case for secondaries persisting the VectorClock, it has been decided to throw out the VectorClockPersist command. |
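
The self-wait described in this ticket can be illustrated with a toy model. The sketch below is not MongoDB code: `ToyNode`, `handle_persist`, `wait_for_durable`, and `run_scenario` are hypothetical names, and the single worker thread per node is an assumption made only to reproduce the "waits on its own reply" behaviour. It shows a node that forwards a persist request to whichever node it believes is primary and blocks on the reply; when the node itself is the primary, the request can never be served until the timeout fires, and nothing is persisted.

```python
import queue
import threading
from concurrent.futures import Future, TimeoutError as FutureTimeout


class ToyNode:
    """Hypothetical node whose requests are all served by one worker thread."""

    def __init__(self, name):
        self.name = name
        self.primary = None      # node this one currently believes is primary
        self.persisted = False   # stand-in for "vector clock was persisted"
        self._inbox = queue.Queue()
        threading.Thread(target=self._serve, daemon=True).start()

    def _serve(self):
        # Single worker: requests are handled strictly one at a time.
        while True:
            func, reply = self._inbox.get()
            if not reply.set_running_or_notify_cancel():
                continue  # the caller gave up (timed out) before we got here
            try:
                reply.set_result(func())
            except Exception as exc:
                reply.set_exception(exc)

    def submit(self, func):
        # Queue a request and return a future for its reply.
        reply = Future()
        self._inbox.put((func, reply))
        return reply

    def handle_persist(self):
        # Stand-in for the primary-side handling of the persist command.
        self.persisted = True
        return "ok"

    def wait_for_durable(self, timeout):
        # Stand-in for the secondary path: forward to the primary and block.
        reply = self.primary.submit(self.primary.handle_persist)
        try:
            return reply.result(timeout=timeout)
        except FutureTimeout:
            reply.cancel()  # abandon the request so it is never executed
            return "timeout"


def run_scenario(steps_up, timeout=0.2):
    """Return (outcome, persisted) for a persist attempt issued by node B."""
    node_a, node_b = ToyNode("A"), ToyNode("B")
    # If B stepped up just before the call, its "primary" is B itself.
    node_b.primary = node_b if steps_up else node_a
    # The persist attempt runs on B's only worker thread.
    outcome = node_b.submit(lambda: node_b.wait_for_durable(timeout)).result(timeout=5)
    return outcome, node_b.primary.persisted


if __name__ == "__main__":
    print(run_scenario(steps_up=False))
    print(run_scenario(steps_up=True))
```

In the second scenario the only thing that ends the wait is the timeout, which matches the symptom reported above: no permanent hang, but no persistence either.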