Core Server / SERVER-62486

Gossiping cluster time in replicaset with "--transitionToAuth" can cause KeyNotFound error


Details

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major - P3
    • None
    • Affects Version/s: 4.0.3, 5.0.0
    • None
    • Cluster Scalability
    • ALL
    • Steps to Reproduce:
      1. Create a 3-node replicaset with auth disabled.
        E.g. using mlaunch:

        mlaunch init --replicaset --dir ~/data/testing
        

      2. Connect to the replicaset and create a read-only user.

        use admin
        db.createUser({user: "read", pwd: "12345", roles: [ { role: "read", db: "admin" } ] })
        

      3. Generate a random keyfile.

        openssl rand -base64 768 > keyfile.txt
        chmod 400 keyfile.txt  # mongod rejects keyfiles with group/world permissions
        

      4. Restart all replicaset processes with the transitionToAuth flag enabled (using the keyfile for internal auth) one at a time.

        # Kill the mongod process listening on port 27017, then start a new one.
        ps aux | grep 27017
        kill <pid>
        mongod --transitionToAuth --keyFile keyfile.txt --replSet replset --dbpath ~/data/testing/replset/rs1/db --logpath ~/data/testing/replset/rs1/mongod.log --port 27017 --fork
        # Repeat for the mongod processes listening on port 27018 and 27019.
        

      5. Restart each secondary process in the replicaset, removing the --transitionToAuth flag and enabling the --auth flag.
        For example, if the primary is listening on port 27017:

        # Kill the mongod process listening on port 27018, then start a new one.
        ps aux | grep 27018
        kill <pid>
        mongod --auth --keyFile keyfile.txt --replSet replset --dbpath ~/data/testing/replset/rs2/db --logpath ~/data/testing/replset/rs2/mongod.log --port 27018 --fork
        # Repeat for the mongod process listening on port 27019.
        

      6. Send a "ping" command to each mongod and observe the mixed "$clusterTime" responses using the same authentication parameters.
        For example, if the primary is listening on port 27017:

        # Returns dummy-signed $clusterTime from the primary with "keyId: 0".
        mongo --port 27017 -u "read" -p "12345" --authenticationDatabase "admin" --eval 'db.runCommand({ping:1})'
        # Returns signed $clusterTime with real keyId.
        mongo --port 27018 -u "read" -p "12345" --authenticationDatabase "admin" --eval 'db.runCommand({ping:1})'
        # Returns signed $clusterTime with real keyId.
        mongo --port 27019 -u "read" -p "12345" --authenticationDatabase "admin" --eval 'db.runCommand({ping:1})'
        

      At this point you have a working replicaset with mixed authentication requirements: using the same auth parameters, the primary returns a dummy-signed $clusterTime with keyId: 0, while the secondaries return a signed $clusterTime with a real keyId. When a single client instance connects to all replicaset nodes, that state creates a race between the client advancing its $clusterTime timestamp and the secondaries advancing theirs. To actually observe the KeyNotFound error, the secondaries' $clusterTime timestamp must be behind both the primary's and that of the client gossiping $clusterTime. In that case, a client that writes to the primary and then reads from a secondary gets a KeyNotFound error from the secondary until the secondary "catches up" to the client's $clusterTime timestamp.

      Note that I reproduced this with server v4.0.3 and v5.0.0, but I believe it affects every server version that supports transitionToAuth.
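      The race described above can be sketched with a small toy model. This is only an illustration of the behavior reported in this ticket, not the server's validation code: `ToyNode`, `ToyClient`, `REAL_KEY_ID`, and the integer timestamps are all hypothetical names and simplifications. It also assumes that a node only checks the gossiped signature when the gossiped timestamp is ahead of its own, which is what the "catches up" observation suggests.

```javascript
// Toy model of $clusterTime gossip in the mixed-auth state (illustrative only).
// Timestamps are plain integers; keyId 0 stands for a dummy signature.
const REAL_KEY_ID = "hypothetical-real-key"; // placeholder, not a real keyId

class ToyNode {
  constructor(name, requiresAuth, clusterTime) {
    this.name = name;
    this.requiresAuth = requiresAuth; // true: --auth, false: --transitionToAuth
    this.clusterTime = clusterTime;
  }

  // The $clusterTime this node attaches to its responses.
  signedClusterTime() {
    return {
      clusterTime: this.clusterTime,
      keyId: this.requiresAuth ? REAL_KEY_ID : 0,
    };
  }

  // Handle one command carrying the client's gossiped $clusterTime.
  handle(gossiped, isWrite = false) {
    if (gossiped !== null && gossiped.clusterTime > this.clusterTime) {
      // Assumption: the signature only matters when the node would advance.
      if (this.requiresAuth && gossiped.keyId !== REAL_KEY_ID) {
        return { ok: 0, codeName: "KeyNotFound" };
      }
      this.clusterTime = gossiped.clusterTime;
    }
    if (isWrite) this.clusterTime += 1; // a write ticks the cluster time
    return { ok: 1, $clusterTime: this.signedClusterTime() };
  }

  // Stand-in for replication bringing this node up to date.
  replicateFrom(other) {
    this.clusterTime = Math.max(this.clusterTime, other.clusterTime);
  }
}

class ToyClient {
  constructor() {
    this.clusterTime = null; // highest $clusterTime seen so far
  }

  run(node, isWrite = false) {
    const res = node.handle(this.clusterTime, isWrite);
    const seen = res.$clusterTime;
    if (seen && (this.clusterTime === null ||
                 seen.clusterTime > this.clusterTime.clusterTime)) {
      this.clusterTime = seen; // keep the newest timestamp, signature included
    }
    return res;
  }
}

const primary = new ToyNode("primary", false, 10);
const secondary = new ToyNode("secondary", true, 10);
const client = new ToyClient();

const writeResult = client.run(primary, true); // client now holds a dummy-signed time
const staleRead = client.run(secondary);       // secondary is behind the client
secondary.replicateFrom(primary);              // secondary catches up
const caughtUpRead = client.run(secondary);

console.log(writeResult.ok, staleRead.codeName, caughtUpRead.ok);
// prints: 1 KeyNotFound 1
```

      In this model the error disappears as soon as the secondary's timestamp reaches the client's, which matches the transient nature of the failure described in the steps.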

    • Sprint: Security 2022-02-07, Security 2022-02-21, Security 2022-03-07, Security 2022-05-02, Security 2022-07-25, Security 2022-08-08, Security 2022-08-22, Security 2022-09-05, Security 2022-09-19, Security 2022-10-03, Security 2022-10-17, Security 2022-10-31, Security 2022-11-14, Security 2022-11-28, Security 2022-12-12, Security 2022-12-26, Security 2023-01-09, Security 2023-01-23, Security 2023-02-06

    Description

      Using the transitionToAuth flag, it's possible to create a working replicaset with mixed authentication requirements that will return a dummy-signed $clusterTime document from the primary and actually signed $clusterTime documents from the secondaries using the same client authentication parameters (see this comment for more info). In that case, a client may attempt to gossip a dummy-signed $clusterTime to nodes that require a real $clusterTime signature. If that happens and the secondaries have a $clusterTime timestamp older than the client's, the secondaries will return a KeyNotFound error instead of the expected response.
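      As a rough aid for spotting the mixed state described above, the following sketch classifies ping responses by whether their $clusterTime signature carries keyId 0. The helper names (`isDummySigned`, `summarizeSignatures`, `hasMixedSignatures`) and the sample responses are hypothetical; in particular, the non-zero keyId and the hash strings are made-up placeholders, and real responses would come from running ping against each node.

```javascript
// Treat a $clusterTime as dummy-signed when its signature is missing or has keyId 0.
// String() is used so numbers, bigints, and Long-like objects compare the same way.
function isDummySigned(clusterTime) {
  const keyId = clusterTime && clusterTime.signature
    ? clusterTime.signature.keyId
    : 0;
  return String(keyId) === "0";
}

// Map each host to "dummy" or "signed" based on its response.
function summarizeSignatures(responsesByHost) {
  const summary = {};
  for (const [host, response] of Object.entries(responsesByHost)) {
    summary[host] = isDummySigned(response.$clusterTime) ? "dummy" : "signed";
  }
  return summary;
}

// True when the nodes disagree, i.e. the state this ticket describes.
function hasMixedSignatures(responsesByHost) {
  const kinds = new Set(Object.values(summarizeSignatures(responsesByHost)));
  return kinds.size > 1;
}

// Hand-written sample data shaped like ping responses (values are placeholders).
const sampleResponses = {
  "localhost:27017": {
    ok: 1,
    $clusterTime: { clusterTime: 11, signature: { hash: "placeholder", keyId: 0 } },
  },
  "localhost:27018": {
    ok: 1,
    $clusterTime: { clusterTime: 11, signature: { hash: "placeholder", keyId: "1234567890" } },
  },
  "localhost:27019": {
    ok: 1,
    $clusterTime: { clusterTime: 11, signature: { hash: "placeholder", keyId: "1234567890" } },
  },
};

const signatureSummary = summarizeSignatures(sampleResponses);
const mixedSignatures = hasMixedSignatures(sampleResponses);
console.log(signatureSummary, mixedSignatures);
```

      With the sample data, the node on port 27017 is classified as "dummy" and the other two as "signed", so the mixed-signature check is true.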

      See https://jira.mongodb.org/browse/DRIVERS-1904 for the related drivers ticket.

          People

            Assignee: Backlog - Cluster Scalability
            Reporter: Matt Dale (matt.dale@mongodb.com)