[SERVER-51903] Avoid VectorClockPersistCommand deadlock upon fast replica state transitions Created: 30/Oct/20  Updated: 29/Oct/23  Resolved: 24/Nov/20

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: None
Fix Version/s: 4.9.0

Type: Bug Priority: Major - P3
Reporter: Pierlauro Sciarelli Assignee: Pierlauro Sciarelli
Resolution: Fixed Votes: 0
Labels: PM-1645-Milestone-3, sharding-wfbf-day
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Backwards Compatibility: Fully Compatible
Operating System: ALL
Sprint: Sharding 2020-11-02, Sharding 2020-11-16, Sharding 2020-11-30
Participants:
Linked BF Score: 16

 Description   

When the vector clock needs to be persisted by a secondary, waitForDurable ends up calling _vectorClockPersist on the primary and blocks waiting for the reply. However, if the same node steps up right before performing the call, there will be a deadlock because waitForDurable will wait on its own reply.

Thanks to the timeout, the system does not hang forever, but the vector clock is not persisted and causal consistency is therefore not guaranteed.
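The self-wait described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not the server's code: `ToyNode`, `wait_for_durable`, `handle_persist_command`, and the single worker thread per node are all hypothetical names and simplifications. It shows a node that believes itself to be primary sending the persist request to itself and then blocking on a reply that its own busy worker can never produce, so only the timeout ends the wait. It also shows that a local short-circuit would avoid the problem in this model.

```python
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


class ToyNode:
    """Hypothetical replica set member with one worker thread (an assumption)."""

    def __init__(self, name):
        self.name = name
        self.primary = self      # the node this member currently treats as primary
        self.persisted = False
        self._worker = ThreadPoolExecutor(max_workers=1)

    def _persist_locally(self):
        self.persisted = True
        return "persisted"

    def handle_persist_command(self):
        # Models the primary-side handler of the remote persist command:
        # the work is queued on the receiving node's only worker.
        return self._worker.submit(self._persist_locally)

    def wait_for_durable(self, timeout, guard_self_call):
        def task():
            target = self.primary
            if guard_self_call and target is self:
                # Already primary: persist directly instead of calling ourselves.
                return self._persist_locally()
            reply = target.handle_persist_command()
            try:
                # Blocks this node's only worker until the reply arrives.
                return reply.result(timeout=timeout)
            except FutureTimeout:
                reply.cancel()   # give up on the request, as a timeout would
                raise

        return self._worker.submit(task)

    def close(self):
        self._worker.shutdown(wait=True)


def outcome(future):
    """Return the future's result, or the string 'timeout' if it timed out."""
    try:
        return future.result()
    except FutureTimeout:
        return "timeout"


if __name__ == "__main__":
    stepped_up = ToyNode("n1")   # a node that stepped up just before the call
    print(outcome(stepped_up.wait_for_durable(0.2, guard_self_call=False)))
    print(stepped_up.persisted)
    print(outcome(stepped_up.wait_for_durable(0.2, guard_self_call=True)))
    print(stepped_up.persisted)
    stepped_up.close()
```

In this model the unguarded call ends in a timeout with nothing persisted, while the guarded call persists immediately. The guard is only one possible remedy within the toy model; it is not a claim about how the server resolved the issue.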

Moreover, VectorClockPersistCommand::supportsWriteConcern is currently true, meaning that the command must be invoked with a write concern field. However, it should be set to false because:
1) It is not necessary, because the VectorClockDocument is always written with majority write concern.
2) It is wrong, because the command fails on this uassert since requests are always received from internal clients (always from secondaries).



 Comments   
Comment by Githook User [ 24/Nov/20 ]

Author: Pierlauro Sciarelli <pierlauro.sciarelli@mongodb.com> (pierlauro)

Message: SERVER-51903 Avoid VectorClockPersistCommand deadlock upon fast replica state transitions
Branch: master
https://github.com/mongodb/mongo/commit/045dd7a5857aa1b34d6fd1d2f6e3e29900010749

Comment by Pierlauro Sciarelli [ 24/Nov/20 ]

As there is currently no use case for secondaries persisting the VectorClock, it has been decided to throw out the VectorClockPersist command.

Generated at Thu Feb 08 05:26:47 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.