[SERVER-53523] Do not use stale VectorClock on FCV 4.4
Created: 24/Dec/20  Updated: 29/Oct/23  Resolved: 01/Jan/21

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: 4.7.0
Fix Version/s: 4.9.0

Type: Bug Priority: Major - P3
Reporter: Tommaso Tocci Assignee: Tommaso Tocci
Resolution: Fixed Votes: 0
Labels: sharding-wfbf-day
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Backwards Compatibility: Fully Compatible
Operating System: ALL
Participants:
Linked BF Score: 14

 Description   

Starting from binary version 4.7, the configTime component of the VectorClock is used for both readPreference and readConcern on read operations sent to the config server.

On the other hand, nodes with FCV <= 4.4 do not gossip this component of the vector clock.

So as soon as FCV 4.4 is set on the cluster, we can no longer rely on the vector clock and should instead use the old configOpTime stored in the Grid.

While shards correctly fall back to Grid::configOpTime() when FCV 4.4 is set, thanks to this check, routers have no knowledge of the FCV, so the check always passes and the last known VectorClock configTime is used (if greater than zero), even though it may be stale.
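The difference between shards and routers can be sketched roughly as follows. This is a simplified model, not the server code: `FCV`, `NodeState`, and `configTimeForRead` are hypothetical names, timestamps are reduced to plain integers, and a router is modeled as a node whose FCV is simply unknown.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical stand-ins for illustration; these are not the real server types.
enum class FCV { k44, k47 };

struct NodeState {
    std::optional<FCV> fcv;               // routers have no FCV knowledge: std::nullopt
    std::uint64_t vectorClockConfigTime;  // last known VectorClock configTime
    std::uint64_t gridConfigOpTime;       // legacy configOpTime stored in the Grid
};

// Models the behavior described above: the VectorClock value is used unless
// the node knows that FCV 4.4 is set, in which case the Grid value is used.
std::uint64_t configTimeForRead(const NodeState& node) {
    // With no FCV knowledge the check cannot fail, so it always passes.
    const bool checkPasses = !node.fcv.has_value() || *node.fcv != FCV::k44;
    if (checkPasses && node.vectorClockConfigTime > 0) {
        return node.vectorClockConfigTime;
    }
    return node.gridConfigOpTime;
}
```

In this model, a shard on FCV 4.4 with a VectorClock configTime of 5 and a Grid configOpTime of 9 reads at 9, while a router in the same state reads at the older value 5.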

These are three possible solutions I was thinking about:

  • Always use the maximum of the old configOpTime and the new VectorClock[configTime] until we completely get rid of the former (v5.1). This should be a quick fix.
  • Always use the old configOpTime until we can completely switch to the new VectorClock[configTime] (v5.1). This is also a quick fix. Actually, I'm not sure why we already started using the new VectorClock[configTime] if it is still unreliable.
  • Update the VectorClock[configTime] every time the old configOpTime is advanced. This should also be quick, but it requires somewhat invasive changes to the VectorClock code.
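As a rough illustration of the first and third options, here is a minimal sketch. It assumes timestamps can be compared as plain integers; `LogicalTime`, `configTimeMax`, and `Clocks` are hypothetical names, not part of the VectorClock or Grid APIs, and the sketch does not claim to reflect what the eventual fix does.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical simplification: a timestamp is a single comparable integer.
using LogicalTime = std::uint64_t;

// First option: always read at the later of the two known times.
LogicalTime configTimeMax(LogicalTime gridConfigOpTime,
                          LogicalTime vectorClockConfigTime) {
    return std::max(gridConfigOpTime, vectorClockConfigTime);
}

// Third option: whenever the old configOpTime advances, advance the
// VectorClock configTime as well, so it can never lag behind.
struct Clocks {
    LogicalTime gridConfigOpTime = 0;
    LogicalTime vectorClockConfigTime = 0;

    void advanceConfigOpTime(LogicalTime newTime) {
        // Both clocks only move forward.
        gridConfigOpTime = std::max(gridConfigOpTime, newTime);
        vectorClockConfigTime = std::max(vectorClockConfigTime, gridConfigOpTime);
    }
};
```

With either approach, a reader never observes a config time older than the old configOpTime, which is the property that is lost when the gossiped configTime goes stale under FCV 4.4. The second option needs no sketch: it amounts to reading the old configOpTime unconditionally.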


 Comments   
Comment by Ian Whalen (Inactive) [ 04/Jan/21 ]

Author: Tommaso Tocci <tommaso.tocci@mongodb.com> (username: evrg-bot-webhook)

Message: SERVER-53523 Do not use stale VectorClock on FCV 4.4
Branch: master
https://github.com/mongodb/mongo/commit/c3cb41d5d88308632d2da0c3f0d047487c3b66c3

Comment by Tommaso Tocci [ 01/Jan/21 ]

Commit: https://github.com/mongodb/mongo/commit/c3cb41d5d88308632d2da0c3f0d047487c3b66c3
