[SERVER-54167] configOpTime can become ahead of VectorClock::clusterTime Created: 01/Feb/21  Updated: 29/Oct/23  Resolved: 04/Feb/21

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: None
Fix Version/s: 4.9.0

Type: Bug Priority: Major - P3
Reporter: Jordi Serra Torrens Assignee: Jordi Serra Torrens
Resolution: Fixed Votes: 0
Labels: PM-1965-Milestone-0-Metadata-Format
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
is depended on by SERVER-53105 Remove namespace field from config.ch... Closed
Related
related to SERVER-54281 VectorClock's configTime can become g... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Steps To Reproduce:

Run the sharding suite with the changes in this patch that adds an invariant:

 

invariant(configTime.getTimestamp() <= currentTime.clusterTime().asTimestamp());

in Grid.cpp and check that it gets tripped.

 

 

Sprint: Sharding 2021-02-08
Participants:
Linked BF Score: 45

 Description   

The configOpTime (Grid::configOpTime()) can get ahead of the VectorClock's clusterTime following this sequence of events:

  1. Node A is sending a request to node B.
  2. When preparing the request metadata, A first gets the vector clock components and adds them to the request.
  3. Concurrently, another thread in A bumps the clocks (vector clock & configOpTime). Now configOpTime is greater than the clusterTime that was read in step 2.
  4. A adds the configOpTime to the request metadata (which, because of step 3, is greater than the clusterTime it wrote in step 2).
  5. The request gets sent to node B, which updates its clock with the received times. B is left with a configOpTime greater than VectorClock::clusterTime.
  6. When something later calls Grid::configOpTime, the invariant is tripped.
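
The interleaving above can be replayed deterministically in a small standalone sketch. Everything in it is a hypothetical stand-in rather than server code: NodeClocks, RequestMetadata and replayRace are illustrative names, timestamps are plain integers, and the concurrent bump from step 3 is simply placed between the two metadata reads.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a node's clocks; not the real server types.
struct NodeClocks {
    std::uint64_t clusterTime = 0;   // stands in for VectorClock::clusterTime
    std::uint64_t configOpTime = 0;  // stands in for Grid::configOpTime()

    // Both clocks only ever move forward.
    void advanceClusterTime(std::uint64_t t) { clusterTime = std::max(clusterTime, t); }
    void advanceConfigOpTime(std::uint64_t t) { configOpTime = std::max(configOpTime, t); }

    // The condition checked by the invariant from the reproduction steps.
    bool invariantHolds() const { return configOpTime <= clusterTime; }
};

// Hypothetical request metadata carrying both times.
struct RequestMetadata {
    std::uint64_t clusterTime = 0;
    std::uint64_t configOpTime = 0;
};

// Replays steps 1-5 in a fixed order and returns node B's resulting clocks.
NodeClocks replayRace() {
    NodeClocks a, b;
    a.advanceClusterTime(10);
    a.advanceConfigOpTime(10);
    b.advanceClusterTime(5);
    b.advanceConfigOpTime(5);

    RequestMetadata md;
    md.clusterTime = a.clusterTime;    // step 2: vector clock hook reads 10

    a.advanceClusterTime(20);          // step 3: another thread bumps both clocks
    a.advanceConfigOpTime(20);
    assert(a.invariantHolds());        // A itself is still consistent

    md.configOpTime = a.configOpTime;  // step 4: configOpTime hook reads 20

    b.advanceClusterTime(md.clusterTime);    // step 5: B ingests the metadata
    b.advanceConfigOpTime(md.configOpTime);
    return b;                          // step 6: B's invariant would now trip
}
```

In this replay B ends up with clusterTime 10 and configOpTime 20, so invariantHolds() returns false for B even though A was consistent at every point.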

The order in which the metadata egress hooks are called is set here for the mongos and here for mongod. Changing the hook order so that the configOpTime hook runs before the vectorClock hook would not solve the issue, because the hooks run in the same order when reading reply metadata. With that order, the configOpTime would be advanced before the VectorClock on ingress, leading to the same situation.
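
To illustrate why reordering the hooks only moves the problem, here is a minimal sketch of the ingress side. It assumes a simplified model in which each hook applies its time as a separate step and the invariant could be observed between the two steps; Clocks, Metadata, HookOrder and ingressViolatesInvariant are hypothetical names, not server code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins; timestamps are plain integers.
struct Clocks {
    std::uint64_t clusterTime;
    std::uint64_t configOpTime;
};

struct Metadata {
    std::uint64_t clusterTime;
    std::uint64_t configOpTime;
};

enum class HookOrder { VectorClockFirst, ConfigOpTimeFirst };

// Applies reply metadata to a copy of the node's clocks in the given hook
// order. Returns true if configOpTime was ever ahead of clusterTime, either
// between the two hooks or after both have run.
bool ingressViolatesInvariant(Clocks node, const Metadata& md, HookOrder order) {
    assert(node.configOpTime <= node.clusterTime);  // node starts consistent

    bool violated = false;
    auto applyVectorClock = [&] { node.clusterTime = std::max(node.clusterTime, md.clusterTime); };
    auto applyConfigOpTime = [&] { node.configOpTime = std::max(node.configOpTime, md.configOpTime); };
    auto check = [&] { violated = violated || node.configOpTime > node.clusterTime; };

    if (order == HookOrder::VectorClockFirst) {
        applyVectorClock();
        check();
        applyConfigOpTime();
        check();
    } else {
        applyConfigOpTime();
        check();  // window in which configOpTime may already be ahead
        applyVectorClock();
        check();
    }
    return violated;
}
```

With a node at {5, 5} receiving consistent metadata {20, 20}, the VectorClockFirst order never exposes a violation, while ConfigOpTimeFirst does so between the two hooks.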



 Comments   
Comment by Githook User [ 04/Feb/21 ]

Author:

{'name': 'Jordi Serra Torrens', 'email': 'jordi.serra-torrens@mongodb.com', 'username': 'jordist'}

Message: SERVER-54167: Advance VectorClock on configOpTime advancement
Branch: master
https://github.com/mongodb/mongo/commit/c953fc22a697156ae283485daff529ef74b08aff
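
For readers without the commit at hand, the idea named in the commit message can be sketched as follows. This is an assumption about the approach based only on the commit title, not the actual change: ClocksSketch and its methods are hypothetical, and timestamps are plain integers.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical model of the idea in the commit title: whenever configOpTime
// advances, clusterTime is advanced to at least the same value first.
class ClocksSketch {
public:
    void advanceClusterTime(std::uint64_t t) {
        _clusterTime = std::max(_clusterTime, t);
    }

    void advanceConfigOpTime(std::uint64_t t) {
        advanceClusterTime(t);  // pull clusterTime forward before configOpTime
        _configOpTime = std::max(_configOpTime, t);
        assert(_configOpTime <= _clusterTime);  // the invariant from the ticket
    }

    std::uint64_t clusterTime() const { return _clusterTime; }
    std::uint64_t configOpTime() const { return _configOpTime; }

private:
    std::uint64_t _clusterTime = 0;
    std::uint64_t _configOpTime = 0;
};
```

Under this model, a node that receives clusterTime 10 and configOpTime 20 (the mismatched metadata from the description) ends up with both clocks at 20, regardless of the order in which the two values are applied.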

Generated at Thu Feb 08 05:32:50 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.