  1. Core Server
  2. SERVER-89631

Make reading `ReplSetConfig` mostly lock-free

    • Replication
    • Fully Compatible
    • v8.0
    • Repl 2024-04-29, Repl 2024-05-13
    • 120

      We can use the following construct to store _rsConfig for the replication coordinator, allowing operations to read the configuration without acquiring a lock in the common case.

      class SharedReplSetConfig {
      public:
          struct Lease {
          public:
              Lease(uint64_t version, std::shared_ptr<ReplSetConfig> config) : _version(version), _config(std::move(config)) {}
              uint64_t version() const { return _version; }
              ReplSetConfig& config() const { return *_config; }
          private:
              uint64_t _version;
              std::shared_ptr<ReplSetConfig> _config;
          };
      
          SharedReplSetConfig() : _current(std::make_shared<ReplSetConfig>()) {}
      
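          // Takes the read lock and returns a lease on the current configuration.
          Lease renew() {
              auto readLock = _rwMutex.readLock();
              return Lease(_version.load(), _current);
          }

          // Lock-free check: true if the configuration was updated after the lease was taken.
          bool isStale(const Lease& lease) const {
              return _version.load() != lease.version();
          }

          // Installs a new configuration. Callers must already be serialized with other writers.
          void update(std::shared_ptr<ReplSetConfig> newConfig) {
              auto writeLk = _rwMutex.writeLock();
              _version.fetchAndAdd(1);
              _current = std::move(newConfig);
          }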
      
      private:
          WriteRarelyRWMutex _rwMutex;
          Atomic<uint64_t> _version{0};
          std::shared_ptr<ReplSetConfig> _current;
      };
      

      Any thread that wants to update the configuration must continue to use the replication coordinator mutex to synchronize with other writers, and then call `SharedReplSetConfig::update` to install the new configuration.
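
      To illustrate why writers still need the coordinator mutex, here is a minimal, self-contained sketch of the write path. It is not the server code: DemoConfig, DemoSharedConfig, DemoCoordinator, bumpTerm and runWriters are hypothetical names, and std::shared_mutex and std::atomic are assumed stand-ins for WriteRarelyRWMutex and Atomic.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <memory>
#include <mutex>
#include <shared_mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for ReplSetConfig; it carries a single field for the demo.
struct DemoConfig {
    int term = 0;
};

// Stand-in for SharedReplSetConfig. std::shared_mutex and std::atomic are
// assumptions replacing WriteRarelyRWMutex and Atomic<T>.
class DemoSharedConfig {
public:
    std::shared_ptr<const DemoConfig> current() const {
        std::shared_lock<std::shared_mutex> lk(_rwMutex);
        return _current;
    }

    void update(std::shared_ptr<const DemoConfig> next) {
        std::unique_lock<std::shared_mutex> lk(_rwMutex);
        _version.fetch_add(1);
        _current = std::move(next);
    }

    uint64_t version() const { return _version.load(); }

private:
    mutable std::shared_mutex _rwMutex;
    std::atomic<uint64_t> _version{0};
    std::shared_ptr<const DemoConfig> _current = std::make_shared<DemoConfig>();
};

// Hypothetical coordinator: _mutex plays the role of the replication
// coordinator mutex that serializes writers.
class DemoCoordinator {
public:
    void bumpTerm() {
        std::lock_guard<std::mutex> lk(_mutex);  // serialize with other writers
        // Read-modify-write: copy the current config, change it, install it.
        auto next = std::make_shared<DemoConfig>(*_shared.current());
        ++next->term;
        _shared.update(std::move(next));
    }

    int term() const { return _shared.current()->term; }
    uint64_t version() const { return _shared.version(); }

private:
    std::mutex _mutex;
    DemoSharedConfig _shared;
};

// Runs `writers` threads that each call bumpTerm() `perThread` times and
// returns the final term.
int runWriters(DemoCoordinator& coord, int writers, int perThread) {
    std::vector<std::thread> threads;
    for (int i = 0; i < writers; ++i) {
        threads.emplace_back([&coord, perThread] {
            for (int j = 0; j < perThread; ++j) {
                coord.bumpTerm();
            }
        });
    }
    for (auto& t : threads) {
        t.join();
    }
    return coord.term();
}
```

      Without the outer mutex, two writers could copy the same configuration and one of the two changes would be lost, even though update itself takes the write lock.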

      As for reading from the configuration, I propose the following:

      • Create a thread-local variable of type SharedReplSetConfig::Lease, which caches the latest configuration known to that thread.
      • On its first attempt to read the configuration, each thread calls renew to populate its local cache.
      • For all subsequent reads, each thread calls isStale with its cached lease. If the lease is not stale, the thread can use the cached configuration without acquiring any locks. On the very rare occasion that the lease is stale, the thread calls renew to refresh its cache.

      Here is an example of adopting the above in replication_coordinator_impl.cpp:

      bool ReplicationCoordinatorImpl::isConfigLocalHostAllowed() const {
          auto& myLease = fromThreadLocalConfigCache();
          if (MONGO_unlikely(_sharedReplSetConfig.isStale(myLease))) {
              // Slow path: the cached lease is out of date, so refresh it.
              myLease = _sharedReplSetConfig.renew();
              setThreadLocalConfigCache(myLease);
          }
          return myLease.config().isLocalHostAllowed();
      }
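
      For experimenting with the read path outside the server, the same idea can be sketched with only the standard library. This is a sketch under assumptions, not the proposed implementation: DemoConfig, DemoSharedConfig, threadLocalLease, renewCount and the free function isLocalHostAllowed are hypothetical, std::shared_mutex and std::atomic stand in for WriteRarelyRWMutex and Atomic, and an empty default-constructed lease is treated as stale so that the first read always renews.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <memory>
#include <mutex>
#include <shared_mutex>
#include <utility>

// Hypothetical stand-in for ReplSetConfig.
struct DemoConfig {
    bool localHostAllowed = false;
};

// Stand-in for SharedReplSetConfig built on standard-library primitives.
class DemoSharedConfig {
public:
    class Lease {
    public:
        Lease() = default;  // empty lease, always considered stale
        Lease(uint64_t version, std::shared_ptr<const DemoConfig> config)
            : _version(version), _config(std::move(config)) {}
        uint64_t version() const { return _version; }
        bool empty() const { return !_config; }
        const DemoConfig& config() const { return *_config; }

    private:
        uint64_t _version = 0;
        std::shared_ptr<const DemoConfig> _config;
    };

    // Slow path: takes the read lock and snapshots version and config together.
    Lease renew() const {
        std::shared_lock<std::shared_mutex> lk(_rwMutex);
        return Lease(_version.load(), _current);
    }

    // Fast path: a single atomic load, no lock.
    bool isStale(const Lease& lease) const {
        return lease.empty() || _version.load() != lease.version();
    }

    void update(std::shared_ptr<const DemoConfig> next) {
        std::unique_lock<std::shared_mutex> lk(_rwMutex);
        _version.fetch_add(1);
        _current = std::move(next);
    }

private:
    mutable std::shared_mutex _rwMutex;
    std::atomic<uint64_t> _version{0};
    std::shared_ptr<const DemoConfig> _current = std::make_shared<DemoConfig>();
};

// Hypothetical per-thread cache; assumes one shared config per process.
inline DemoSharedConfig::Lease& threadLocalLease() {
    thread_local DemoSharedConfig::Lease lease;
    return lease;
}

// Counts how many times the calling thread took the slow path (demo only).
inline int& renewCount() {
    thread_local int count = 0;
    return count;
}

// Reader: checks the cached lease and renews it only when it is stale.
bool isLocalHostAllowed(const DemoSharedConfig& shared) {
    auto& lease = threadLocalLease();
    if (shared.isStale(lease)) {
        lease = shared.renew();
        ++renewCount();
    }
    return lease.config().localHostAllowed;
}
```

      Because each lease holds a std::shared_ptr, a thread that is still using an older configuration keeps it alive until that thread renews its lease.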
      

            Assignee:
            brad.cater@mongodb.com Brad Cater
            Reporter:
            amirsaman.memaripour@mongodb.com Amirsaman Memaripour