[SERVER-62907] Vector clock components must survive CSRS non-rolling restart Created: 24/Jan/22  Updated: 29/Oct/23  Resolved: 28/Feb/22

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: 5.2.0, 5.0.5, 5.1.1
Fix Version/s: 6.0.0-rc0, 5.0.7, 5.3.0-rc2

Type: Bug Priority: Major - P3
Reporter: Pierlauro Sciarelli Assignee: Antonio Fuschetto
Resolution: Fixed Votes: 0
Labels: shardingemea-qw
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Depends
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v5.3, v5.0
Steps To Reproduce:

mlaunch init --dir data --sharded 2 --replicaset 1 --csrs 1 --nodes 1 --verbose --port 20000
# grep for topologyTime in the logs and check that it is greater than 0
mlaunch stop
mlaunch start
# grep for topologyTime in the logs and check that it is now equal to 0

Sprint: Sharding EMEA 2022-02-21, Sharding EMEA 2022-03-07
Participants:
Story Points: 5

 Description   

The current implementation of the vector clock is resilient to step-downs and rolling restarts of the config server because at least one of its nodes stays alive and keeps the correct value of each component, which is then gossiped out.

If the whole config server goes down, or is simply restarted in a non-rolling fashion, the vector clock is reinitialized on the new CSRS primary with:

  • Cluster time = time of the last committed operation on the oplog.
  • Config time = 0 (for a very brief moment, as one majority committed write will tick it to the cluster time).
  • Topology time = 0 (as long as a shard is not successfully added/removed).
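
The difference between the two restart modes can be sketched with a toy model. This is only an illustration of the behaviour described above, not the server's implementation: the class and function names are hypothetical, and plain integers stand in for timestamps.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VectorClock:
    """Toy vector clock; integers stand in for real timestamps."""
    cluster_time: int = 0
    config_time: int = 0
    topology_time: int = 0

    def merge(self, other: "VectorClock") -> "VectorClock":
        """Gossip: each component advances to the larger of the two values."""
        return VectorClock(
            max(self.cluster_time, other.cluster_time),
            max(self.config_time, other.config_time),
            max(self.topology_time, other.topology_time),
        )


def reinit_after_full_restart(last_oplog_time: int) -> VectorClock:
    """Clock on the new CSRS primary when no node kept the old values in memory."""
    return VectorClock(cluster_time=last_oplog_time, config_time=0, topology_time=0)


def majority_write(clock: VectorClock) -> VectorClock:
    """One majority-committed write ticks the config time up to the cluster time."""
    return VectorClock(
        clock.cluster_time,
        max(clock.config_time, clock.cluster_time),
        clock.topology_time,
    )


before = VectorClock(cluster_time=100, config_time=100, topology_time=42)

# Rolling restart: a surviving node gossips its values to the restarted one.
rolling = VectorClock().merge(before)

# Non-rolling restart: nothing survives in memory, so the clock is rebuilt.
non_rolling = majority_write(reinit_after_full_restart(last_oplog_time=100))
```

In this model `rolling` equals `before`, while `non_rolling` recovers the cluster and config times but keeps a topology time of 0 until a shard is added or removed.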

Causal consistency is likely to be broken in such a scenario because:

  • Cluster/config times may move backwards if the system time is incorrect.
  • The topology time may stay incorrect (0) for a long time, until a shard is successfully added or removed.
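
Both failure modes amount to a clock component moving backwards across the restart. A minimal sketch of a check for that, assuming the values seen before and after the restart have been collected by hand (for example from the logs in the reproduction steps). The function name and the dictionary layout are hypothetical, and integers stand in for timestamps.

```python
def regressed_components(before: dict, after: dict) -> list:
    """Return the names of the components whose value went backwards.

    A component missing from `after` is treated as 0, i.e. reinitialized.
    """
    return sorted(
        name for name, value in before.items() if after.get(name, 0) < value
    )


before = {"clusterTime": 100, "configTime": 100, "topologyTime": 42}

# Correct system time: only the topology time is lost.
after_restart = {"clusterTime": 100, "configTime": 100, "topologyTime": 0}

# Hypothetical case where cluster/config times also end up behind.
after_skewed_restart = {"clusterTime": 90, "configTime": 90, "topologyTime": 0}

only_topology = regressed_components(before, after_restart)
all_three = regressed_components(before, after_skewed_restart)
```

An empty result means every component was preserved or advanced, which is what the title of this ticket requires.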


 Comments   
Comment by Githook User [ 01/Mar/22 ]

Author:

{'name': 'Antonio Fuschetto', 'email': 'antonio.fuschetto@mongodb.com', 'username': 'afuschetto'}

Message: SERVER-62907 Vector clock components must survive CSRS non-rolling restart
Branch: v5.3
https://github.com/mongodb/mongo/commit/c3ca240424abf46c68499b300d879fe32df68a20

Comment by Githook User [ 01/Mar/22 ]

Author:

{'name': 'Antonio Fuschetto', 'email': 'antonio.fuschetto@mongodb.com', 'username': 'afuschetto'}

Message: SERVER-62907 Vector clock components must survive CSRS non-rolling restart
Branch: v5.0
https://github.com/mongodb/mongo/commit/f54624cadf5ee6e5906016b5b8d41718f098fa81

Comment by Githook User [ 25/Feb/22 ]

Author:

{'name': 'Antonio Fuschetto', 'email': 'antonio.fuschetto@mongodb.com', 'username': 'afuschetto'}

Message: SERVER-62907 Vector clock components must survive CSRS non-rolling restart
Branch: master
https://github.com/mongodb/mongo/commit/74280e7af5405e41d2f128d9a261e4d139cc8e71

Generated at Thu Feb 08 05:56:23 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.