[SERVER-63742] Default topology time in shard can lead to infinite refresh in shard registry Created: 16/Feb/22  Updated: 29/Oct/23  Resolved: 02/Mar/22

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: None
Fix Version/s: 6.0.0-rc0, 5.0.7, 5.3.0-rc3

Type: Bug Priority: Critical - P2
Reporter: Marcos José Grillo Ramirez Assignee: Antonio Fuschetto
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Problem/Incident
Related
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v5.3, v5.0
Sprint: Sharding EMEA 2022-02-21, Sharding EMEA 2022-03-07
Participants:
Case:

 Description   

If a recently started shard has to write to config.vectorClock (for example, when it becomes the coordinator of a 2PC transaction), it tries to insert the value Timestamp(0, 0) into the collection. However, this value is replaced by 'now' before being inserted, and the resulting vector clock value can be gossiped back to the routers, causing the read-through cache of the ShardRegistry to advance its time in store to the gossiped value. If the topologyTime stored on the config server (in config.shards) is less than the new time in store, the ShardRegistry stalls all operations that try to get a shard: it keeps trying to refresh the cache, but it can never find a time higher than the one already stored.

This stall in the ShardRegistry can in turn stall any mongos operation that must contact a shard.
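
The failure mode described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the server's implementation: the names ToyShardRegistry, on_gossip, get_shard, persisted_vector_clock_value and simulate are hypothetical, the comparison used to decide that the cache is up to date is assumed, and the refresh loop is capped by an illustrative max_refreshes budget so that the example terminates (the stall reported in this ticket is unbounded).

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class Timestamp:
    """Toy stand-in for a BSON Timestamp, ordered by (secs, inc)."""
    secs: int = 0
    inc: int = 0


def persisted_vector_clock_value(requested: Timestamp, now: Timestamp) -> Timestamp:
    # Models the behaviour in the description: Timestamp(0, 0) is replaced
    # by 'now' before being inserted. Any other value is stored unchanged.
    return now if requested == Timestamp(0, 0) else requested


class ToyShardRegistry:
    """Hypothetical read-through cache keyed on topologyTime (assumed design)."""

    def __init__(self, config_topology_time: Timestamp):
        # topologyTime as persisted in config.shards on the config server.
        self.config_topology_time = config_topology_time
        # Newest topologyTime the cache has been told about.
        self.time_in_store = config_topology_time
        # topologyTime of the data currently held in the cache.
        self.cached_time = config_topology_time

    def on_gossip(self, topology_time: Timestamp) -> None:
        # A gossiped value can only advance the time in store.
        self.time_in_store = max(self.time_in_store, topology_time)

    def get_shard(self, max_refreshes: int = 5) -> bool:
        """Return True if the lookup can be served, False if the budget runs out."""
        for _ in range(max_refreshes):
            if self.cached_time >= self.time_in_store:
                return True
            # A refresh can only return what the config server has persisted.
            self.cached_time = self.config_topology_time
        return self.cached_time >= self.time_in_store


def simulate() -> tuple:
    registry = ToyShardRegistry(config_topology_time=Timestamp(100, 1))
    before = registry.get_shard()
    # A freshly started shard writes its default topology time.
    gossiped = persisted_vector_clock_value(Timestamp(0, 0), now=Timestamp(200, 0))
    registry.on_gossip(gossiped)
    after = registry.get_shard()
    return before, after
```

In this model, simulate() serves the first lookup and gives up on the second: after the gossip, the time in store is ahead of anything the config server can return, so no refresh ever catches up.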



 Comments   
Comment by Githook User [ 02/Mar/22 ]

Author:

{'name': 'Antonio Fuschetto', 'email': 'antonio.fuschetto@mongodb.com', 'username': 'afuschetto'}

Message: SERVER-63742 Default topology time in shard can lead to infinite refresh in shard registry
Branch: v5.0
https://github.com/mongodb/mongo/commit/004c48e11d879257cbfbced5570597ccc0fbbd27

Comment by Githook User [ 02/Mar/22 ]

Author:

{'name': 'Antonio Fuschetto', 'email': 'antonio.fuschetto@mongodb.com', 'username': 'afuschetto'}

Message: SERVER-63742 Default topology time in shard can lead to infinite refresh in shard registry
Branch: v5.3
https://github.com/mongodb/mongo/commit/c41763970932ee2534ad932d557d322ce94f692f

Comment by Githook User [ 02/Mar/22 ]

Author:

{'name': 'Antonio Fuschetto', 'email': 'antonio.fuschetto@mongodb.com', 'username': 'afuschetto'}

Message: SERVER-63742 Default topology time in shard can lead to infinite refresh in shard registry
Branch: master
https://github.com/mongodb/mongo/commit/22140dacf9c355f702da3c5d4892df709ceead66

Comment by Cris Insignares Cuello [ 17/Feb/22 ]

tommaso.tocci please add a ticket to implement a test that covers this use case.

Comment by Marcos José Grillo Ramirez [ 17/Feb/22 ]

kelsey.schubert all versions starting from 5.0.

Comment by Kelsey Schubert [ 16/Feb/22 ]

Which versions are affected?
