  1. Core Server
  2. SERVER-64433

A new topology time could be gossiped without being majority committed

    • Fully Compatible
    • ALL
    • v6.0, v5.0
    • Sharding EMEA 2022-03-21, Sharding EMEA 2022-04-04, Sharding EMEA 2022-04-18, Sharding EMEA 2022-05-02, Sharding EMEA 2022-05-16
    • 23

      Every time a shard is added or removed, we create a new topologyTime (let's call it T0) that is inserted in config.shards. Afterwards, when this operation is locally committed (let's say at time Tcommit), we store the value of T0 in an in-memory data structure. Finally, when the majority commit point advances to a time TmajorityPoint greater than or equal to T0, we tick the configTime and advance the vector clock's topologyTime to T0.

      The problem with this approach is that we advance the topologyTime of the vector clock when TmajorityPoint >= T0, but this doesn't guarantee that the oplog entry, whose timestamp is Tcommit, has been majority committed, since Tcommit may be later than T0. Thus, we might end up gossiping a new topologyTime, but when a shard goes to the config server expecting to find an entry in config.shards with a topologyTime of T0, it might not find it.
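
      The race can be illustrated with a small toy model. This is only a sketch under simplifying assumptions: timestamps are plain integers, and every name in it (ToyConfigServer, add_shard, advance_majority_point, wait_for_commit_ts) is hypothetical and does not correspond to server code. The wait_for_commit_ts flag compares against Tcommit instead of T0 purely to show the contrast; it is not necessarily the fix that was adopted.

```python
from dataclasses import dataclass, field


@dataclass
class ToyConfigServer:
    """Hypothetical model of the config server; not real server code."""
    oplog: list = field(default_factory=list)    # (t_commit, t0) entries
    pending: list = field(default_factory=list)  # (t0, t_commit) not yet gossiped
    majority_point: int = 0
    gossiped_topology_time: int = 0              # 0 means "nothing gossiped yet"

    def add_shard(self, t0, t_commit):
        # T0 is generated first; the config.shards write commits locally at Tcommit.
        assert t_commit >= t0
        self.oplog.append((t_commit, t0))
        self.pending.append((t0, t_commit))

    def advance_majority_point(self, ts, wait_for_commit_ts):
        self.majority_point = ts
        still_pending = []
        for t0, t_commit in self.pending:
            # Behavior described in the ticket: compare against T0.
            # Illustrative stricter alternative: compare against Tcommit.
            threshold = t_commit if wait_for_commit_ts else t0
            if ts >= threshold:
                self.gossiped_topology_time = max(self.gossiped_topology_time, t0)
            else:
                still_pending.append((t0, t_commit))
        self.pending = still_pending

    def majority_visible_topology_times(self):
        # What a majority read of config.shards would see in this toy model.
        return [t0 for t_commit, t0 in self.oplog if t_commit <= self.majority_point]


def gossip_is_safe(server):
    """True if the gossiped topologyTime can be found by a majority read."""
    g = server.gossiped_topology_time
    return g == 0 or g in server.majority_visible_topology_times()


def run(wait_for_commit_ts):
    server = ToyConfigServer()
    server.add_shard(t0=10, t_commit=12)
    # The majority commit point lands between T0 and Tcommit.
    server.advance_majority_point(11, wait_for_commit_ts)
    return server.gossiped_topology_time, gossip_is_safe(server)


if __name__ == "__main__":
    print(run(wait_for_commit_ts=False))  # (10, False)
    print(run(wait_for_commit_ts=True))   # (0, True)
```

      With T0 = 10, Tcommit = 12 and TmajorityPoint = 11, comparing against T0 gossips a topologyTime of 10 even though the config.shards entry is not yet majority committed. Comparing against Tcommit holds the gossip back until the entry is visible.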

      Note that the topologyTime is a time, but it doesn't provide any guarantee about what you will find in config.shards. It can be seen simply as a counter that is ticked every time we perform an add/remove shard operation.

            Assignee:
            sergi.mateo-bellido@mongodb.com Sergi Mateo Bellido
            Reporter:
            sergi.mateo-bellido@mongodb.com Sergi Mateo Bellido
            Votes:
            0
            Watchers:
            7

              Created:
              Updated:
              Resolved: