  1. Core Server
  2. SERVER-32639

Arbiters in standalone replica sets can't sign or validate clusterTime with auth on once FCV checks are removed

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major - P3
    • Fix Version/s: 3.6.7, 3.7.4
    • Affects Version/s: None
    • Component/s: None
    • Labels:
    • Backwards Compatibility: Fully Compatible
    • Operating System: ALL
    • Backport Requested: v3.6
    • Sprint: Sharding 2018-03-26, Sharding 2018-04-09, Sharding 2018-04-23

      Because arbiters don't persist any data, they never replicate the admin.system.keys collection, which replica set members read locally to load the keys used for clusterTime signing and validation. This issue is currently masked in our tests because clusterTimes aren't signed or validated when FCV is not fully upgraded to v3.6, and since arbiters also don't persist the admin.system.version collection, they never update their in-memory FCV and stay at v3.4. After we remove the FCV checks in SERVER-32463, non-__system users will no longer be able to communicate with arbiters in standalone replica sets with auth on, unless they have the advanceClusterTime privilege or don't gossip clusterTime.
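
      The failure condition described above can be modeled as a small predicate. This is only a sketch of the description, not server code: RequestSketch, its fields, and requestPassesClusterTimeCheck are hypothetical names, and the assumption is that __system users and users with the advanceClusterTime privilege skip signature validation.

```cpp
#include <cassert>

// Hypothetical model of an incoming request; not a real server type.
struct RequestSketch {
    bool isSystemUser;
    bool hasAdvanceClusterTimePrivilege;
    bool gossipsClusterTime;
};

// Returns true when the node can handle the request's clusterTime.
// nodeHasSigningKeys is false on an arbiter in a standalone replica set,
// because it never replicates admin.system.keys.
inline bool requestPassesClusterTimeCheck(const RequestSketch& req,
                                          bool nodeHasSigningKeys) {
    if (!req.gossipsClusterTime) {
        return true;  // nothing to validate
    }
    if (req.isSystemUser || req.hasAdvanceClusterTimePrivilege) {
        return true;  // assumed: signature validation is skipped for these users
    }
    return nodeHasSigningKeys;  // otherwise a local key is needed to validate
}
```

      With nodeHasSigningKeys set to false, only an ordinary user that gossips clusterTime is rejected, which is the case this ticket addresses.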

      This shouldn't be a problem in sharded clusters, since keys are only persisted on the CSRS and are cached in memory on every other node in the cluster.

      Implementation:

      The simplest and most efficient way is not to install the logical clock if the node is an arbiter: https://github.com/mongodb/mongo/blob/r3.7.3/src/mongo/db/db.cpp#L780
      However, at the time of this check the node has not yet processed its configuration and cannot tell whether it is an arbiter. Hence another approach is chosen.

      Sharding part: jack.mulrow please ack.

      1. Add

      bool LogicalClock::isEnabled() const;
      bool _isEnabled {true};  // initially is enabled to allow normal RS initialization
      

      and

      void LogicalClock::setEnabled(bool);
      

      Alternative: no longer create LogicalClock optionally. I prefer not to change more than needed, mostly because performance is hurt by taking the lock every time LogicalClock data is accessed.

      2. Other LogicalClock public methods should invariant(_isEnabled); so there are no accidental calls to a disabled logical clock.
      Therefore the responsibility to check is on the caller.

      3. Do not validate or advance logicalTime if the LogicalClock is not enabled. https://github.com/mongodb/mongo/blob/r3.7.3/src/mongo/rpc/metadata.cpp#L102
      4. Do not append LogicalTime metadata if the LogicalClock is not enabled at https://github.com/mongodb/mongo/blob/r3.7.3/src/mongo/db/service_entry_point_common.cpp#L264, https://github.com/mongodb/mongo/blob/r3.7.3/src/mongo/db/service_entry_point_common.cpp#L292
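
      Steps 1 to 4 can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the actual patch: LogicalClockSketch, invariantSketch, readRequestClusterTime and replyClusterTime are hypothetical names, cluster time is modeled as a plain integer with no signing or locking, the stand-in invariant throws instead of aborting the process, and the flag is a std::atomic<bool> (the ticket shows a plain bool) so that reading it needs no lock.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <optional>
#include <stdexcept>

// Stand-in for the server's invariant(); the real one aborts the process.
inline void invariantSketch(bool condition, const char* what) {
    if (!condition) throw std::logic_error(what);
}

// Hypothetical, heavily simplified logical clock.
class LogicalClockSketch {
public:
    // Step 1: enabled flag with accessor and setter.
    bool isEnabled() const { return _isEnabled.load(); }
    void setEnabled(bool enabled) { _isEnabled.store(enabled); }

    // Step 2: every other public method requires an enabled clock.
    std::uint64_t getClusterTime() const {
        invariantSketch(isEnabled(), "getClusterTime on a disabled clock");
        return _clusterTime;
    }

    void advanceClusterTime(std::uint64_t newTime) {
        invariantSketch(isEnabled(), "advanceClusterTime on a disabled clock");
        if (newTime > _clusterTime) _clusterTime = newTime;
    }

private:
    std::atomic<bool> _isEnabled{true};  // enabled initially to allow normal RS initialization
    std::uint64_t _clusterTime{0};
};

// Step 3: the caller checks before validating or advancing.
inline void readRequestClusterTime(LogicalClockSketch& clock,
                                   std::optional<std::uint64_t> gossiped) {
    if (!clock.isEnabled() || !gossiped) return;
    clock.advanceClusterTime(*gossiped);
}

// Step 4: the caller checks before appending metadata to a reply.
inline std::optional<std::uint64_t> replyClusterTime(const LogicalClockSketch& clock) {
    if (!clock.isEnabled()) return std::nullopt;
    return clock.getClusterTime();
}
```

      Keeping the check in the callers means a disabled clock is never touched on the request path, while the invariant catches any call site that forgets the check.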

      Replication part: judah.schvimer please ack.
      If the node is a replica set arbiter and its sharding state is not enabled, then the logical clock should not be enabled.

      5. https://github.com/mongodb/mongo/blob/r3.7.3/src/mongo/db/db.cpp#L539 initializes the replication coordinator, so it can tell whether the current node is an arbiter.
      This is the place to disable the logical clock if the node is an arbiter.
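
      The rule for the replication part can be written as a single predicate. The sketch below is illustrative only: NodeStateSketch and shouldEnableLogicalClock are hypothetical names, and how the real replication coordinator and sharding state expose these two facts is not shown here.

```cpp
#include <cassert>

// Hypothetical snapshot of what is known once the replication coordinator
// has been initialized; not a real server type.
struct NodeStateSketch {
    bool isArbiter;
    bool shardingStateEnabled;
};

// The logical clock stays enabled unless the node is an arbiter whose
// sharding state is not enabled (an arbiter in a standalone replica set).
inline bool shouldEnableLogicalClock(const NodeStateSketch& node) {
    return !(node.isArbiter && !node.shardingStateEnabled);
}
```

      The result would be passed to the clock's setEnabled once the node's role is known.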

      Note: key monitoring (https://github.com/mongodb/mongo/blob/r3.7.3/src/mongo/db/db.cpp#L527) starts before this line. I don't think it can be moved later, as keys may be needed for proper initialization of the oplog.

            Assignee:
            misha.tyulenev@mongodb.com Misha Tyulenev (Inactive)
            Reporter:
            jack.mulrow@mongodb.com Jack Mulrow