  1. Core Server
  2. SERVER-36159

Log whenever the gossiped config server opTime term changes

    • Type: Improvement
    • Resolution: Fixed
    • Priority: Major - P3
    • Fix Version/s: 3.6.15, 4.2.0-rc0, 4.0.13
    • Affects Version/s: None
    • Component/s: Sharding
    • Labels:
    • Backwards Compatibility: Fully Compatible
    • Backport Requested: v4.0, v3.6, v3.4
    • Sprint: Sharding 2018-08-13, Sharding 2019-02-11, Sharding 2019-02-25, Sharding 2019-03-11, Sharding 2019-03-25, Sharding 2019-05-20, Sharding 2019-06-03, Sharding 2019-06-17, Sharding 2019-07-01, Sharding 2019-08-12

      In MongoDB 3.4 and earlier, the nodes of a sharded cluster gossip the config server's opTime to ensure they always read the latest routing metadata. This opTime contains both a timestamp and a term. Because it is only used internally between the cluster nodes, it is not signed or verified in any way.
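      To illustrate why an unverified term matters, here is a minimal Python sketch of how a node might advance its view of the config server opTime. It is not the server's implementation: the `OpTime` class, the `advance` helper, the integer timestamps, and the term-first ordering are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OpTime:
    """Simplified stand-in for an opTime: a timestamp plus an election term."""
    timestamp: int  # assumption: a plain integer instead of a real timestamp type
    term: int

    def sort_key(self):
        # Assumption: the term is compared first, then the timestamp.
        return (self.term, self.timestamp)


def advance(current: OpTime, gossiped: OpTime) -> OpTime:
    """Return whichever opTime is newer. No validation is performed."""
    return gossiped if gossiped.sort_key() > current.sort_key() else current


# A bogus opTime with a higher term wins even though its timestamp is older.
known = OpTime(timestamp=1000, term=5)
bogus = OpTime(timestamp=10, term=9)
known = advance(known, bogus)

# A genuine, later opTime from the real config server is now ignored.
genuine = OpTime(timestamp=2000, term=5)
known = advance(known, genuine)
```

      Under these assumptions, one stray opTime with an inflated term sticks, and every later genuine opTime from the real config server compares as older.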

      As part of a customer support case, we observed the gossiped config server opTime term jump forward without the term actually having changed on the config server itself. Such a jump could happen when a DNS misconfiguration causes members of a sharded cluster to inadvertently talk to the wrong host. Since there is no validation in 3.4, the term jumping forward could have disastrous consequences for the entire cluster.

      To help diagnose such issues, shard nodes should log whenever the config server's opTime term changes. Ideally, the log message should also include the node from which the new term came, so that the change can be traced back to the node that first introduced it.
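      The requested logging could look roughly like the following Python sketch. It only illustrates the idea and is not the actual server change: `ConfigOpTimeTracker`, `observe`, the logger name, the message wording, and the term-first comparison are hypothetical.

```python
import logging
from dataclasses import dataclass
from typing import Optional

logger = logging.getLogger("sharding")  # hypothetical logger name


@dataclass(frozen=True)
class OpTime:
    timestamp: int  # assumption: simplified to a plain integer
    term: int


class ConfigOpTimeTracker:
    """Tracks the newest gossiped config server opTime seen by this node."""

    def __init__(self) -> None:
        self._latest: Optional[OpTime] = None

    def observe(self, gossiped: OpTime, source: str) -> bool:
        """Record a gossiped opTime; return True if a term change was logged.

        `source` is the host:port of the node that sent the opTime.
        """
        previous = self._latest
        if previous is not None:
            # Assumption: opTimes are ordered by term first, then timestamp.
            if (gossiped.term, gossiped.timestamp) <= (previous.term, previous.timestamp):
                return False  # not newer, nothing to record
        self._latest = gossiped
        if previous is not None and gossiped.term != previous.term:
            logger.warning(
                "Config server opTime term changed from %d to %d "
                "(new opTime received from %s)",
                previous.term, gossiped.term, source,
            )
            return True
        return False
```

      Recording the sending node in each message means that, in principle, the logs of every shard can be followed hop by hop until they reach the first node that reported the new term.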

            kevin.pulo@mongodb.com Kevin Pulo
            kaloian.manassiev@mongodb.com Kaloian Manassiev