  1. Core Server
  2. SERVER-36159

Log whenever the gossiped config server opTime term changes

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.6.15, 4.2.0-rc0, 4.0.13
    • Component/s: Sharding
    • Labels: None
    • Backwards Compatibility: Fully Compatible
    • Backport Requested: v4.0, v3.6, v3.4
    • Sprint: Sharding 2018-08-13, Sharding 2019-02-11, Sharding 2019-02-25, Sharding 2019-03-11, Sharding 2019-03-25, Sharding 2019-05-20, Sharding 2019-06-03, Sharding 2019-06-17, Sharding 2019-07-01, Sharding 2019-08-12

    Description

      In MongoDB 3.4 and earlier, the nodes of a sharded cluster gossip the config server's opTime to ensure that they always read the latest routing metadata. This opTime contains both a timestamp and a term. Because it is only used internally between the cluster nodes, it is not signed or verified in any way.

      As part of a customer support case, we observed the gossiped config server opTime term jump forward without the term actually having changed on the config server itself. Such a jump could happen due to a DNS misconfiguration that causes members of a sharded cluster to inadvertently talk to the wrong host. Since there is no validation in 3.4, the term jumping forward could have disastrous consequences for the entire cluster.

      To help diagnose such issues, shard nodes should log whenever the config server's opTime term changes. Ideally, the log message should also include the node from which the new term came, so that the change can be traced back to the first node that caused it.
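
      The requested behaviour can be illustrated with a minimal Python sketch. It is not the server's implementation (the server is written in C++), and the names `OpTime`, `ConfigOpTimeTracker`, `advance`, and `term_changes` are hypothetical, as is the assumption that opTimes are ordered by term first and timestamp second. The sketch only shows the idea: keep the newest gossiped opTime, and whenever the term differs from the one previously known, log the old term, the new term, and the node that supplied it.

```python
import logging
from dataclasses import dataclass
from typing import List, Optional, Tuple

logger = logging.getLogger("sharding.config_optime")


@dataclass(frozen=True, order=True)
class OpTime:
    # Hypothetical model of an opTime. Field order matters: with order=True
    # instances compare by term first, then by timestamp (an assumption).
    term: int
    timestamp: int


class ConfigOpTimeTracker:
    """Tracks the newest config server opTime learned through gossip."""

    def __init__(self) -> None:
        self.latest: Optional[OpTime] = None
        # (old_term, new_term, source) for every observed term change.
        self.term_changes: List[Tuple[int, int, str]] = []

    def advance(self, received: OpTime, source: str) -> bool:
        """Adopt `received` if it is newer; return True if it was adopted."""
        current = self.latest
        if current is not None and received <= current:
            return False  # stale or duplicate gossip, nothing to do

        if current is not None and received.term != current.term:
            # This is the log line the ticket asks for: it names the node
            # the new term came from so the jump can be traced back.
            logger.info(
                "Config server opTime term changed from %d to %d "
                "(received from %s)",
                current.term, received.term, source,
            )
            self.term_changes.append((current.term, received.term, source))

        self.latest = received
        return True
```

      In this sketch, a node that received a term of 5 from a misdirected host while the real config server was still at term 1 would leave a log entry naming that host, which is the trail needed to find where the bad term entered the cluster.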

            People

              kevin.pulo@mongodb.com Kevin Pulo
              kaloian.manassiev@mongodb.com Kaloian Manassiev
              Votes: 3
              Watchers: 16