
Type: Bug
Resolution: Unresolved
Priority: Major - P3
Fix Version/s: None
Affects Version/s: 4.0.3, 5.0.0
Component/s: None
Labels:
- sharding-product-sync

Assigned Teams:

Cluster Scalability
Operating System:
ALL
Steps To Reproduce:

Create a 3-node replicaset with auth disabled.
E.g. using mlaunch:

mlaunch init --replicaset --dir ~/data/testing

Connect to the replicaset and create a read-only user.

use admin
db.createUser({user: "read", pwd: "12345", roles: [ { role: "read", db: "admin" } ] })

Generate a random keyfile.

openssl rand -base64 768 > keyfile.txt

Restart all replicaset processes with the transitionToAuth flag enabled (using the keyfile for internal auth) one at a time.

# Kill the mongod process listening on port 27017, then start a new one.
ps aux | grep 27017
kill <pid>
mongod --transitionToAuth --keyFile keyfile.txt --replSet replset --dbpath ~/data/testing/replset/rs1/db --logpath ~/data/testing/replset/rs1/mongod.log --port 27017 --fork
# Repeat for the mongod processes listening on port 27018 and 27019.

Restart each secondary process in the replicaset, removing the --transitionToAuth flag and enabling the --auth flag.
For example, if the primary is listening on port 27017:

# Kill the mongod process listening on port 27018, then start a new one.
ps aux | grep 27018
kill <pid>
mongod --auth --keyFile keyfile.txt --replSet replset --dbpath ~/data/testing/replset/rs2/db --logpath ~/data/testing/replset/rs2/mongod.log --port 27018 --fork
# Repeat for the mongod process listening on port 27019.

Send a "ping" command to each mongod and observe the mixed "$clusterTime" responses using the same authentication parameters.
For example, if the primary is listening on port 27017:

# Returns dummy-signed $clusterTime from the primary with "keyId: 0".
mongo --port 27017 -u "read" -p "12345" --authenticationDatabase "admin" --eval 'db.runCommand({ping:1})'
# Returns signed $clusterTime with real keyId.
mongo --port 27018 -u "read" -p "12345" --authenticationDatabase "admin" --eval 'db.runCommand({ping:1})'
# Returns signed $clusterTime with real keyId.
mongo --port 27019 -u "read" -p "12345" --authenticationDatabase "admin" --eval 'db.runCommand({ping:1})'
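To compare the three responses without reading them by eye, the signature's keyId can be checked programmatically. The following is a minimal sketch in plain JavaScript, not part of the original report: `isDummySigned` is a hypothetical helper, and the two reply objects are hand-written stand-ins (the hash strings and the non-zero keyId are placeholders, not values captured from a real server).

```javascript
// Hypothetical helper: returns true when a $clusterTime document carries
// the dummy signature described above, i.e. its signature has keyId 0.
function isDummySigned(clusterTimeDoc) {
  const sig = clusterTimeDoc && clusterTimeDoc.signature;
  if (!sig || sig.keyId === undefined || sig.keyId === null) {
    return false; // no signature at all is not the same as a dummy signature
  }
  const keyId = sig.keyId;
  // Assumption: 64-bit wrapper objects expose toNumber(); plain numbers,
  // bigints, and numeric strings are converted with Number().
  if (typeof keyId === "object" && typeof keyId.toNumber === "function") {
    return keyId.toNumber() === 0;
  }
  return Number(keyId) === 0;
}

// Stand-in replies shaped like the ping responses; all values are placeholders.
const primaryReply = {
  ok: 1,
  $clusterTime: { clusterTime: 10, signature: { hash: "<placeholder>", keyId: 0 } },
};
const secondaryReply = {
  ok: 1,
  $clusterTime: { clusterTime: 10, signature: { hash: "<placeholder>", keyId: 1234567890123n } },
};

console.log(isDummySigned(primaryReply.$clusterTime));   // true
console.log(isDummySigned(secondaryReply.$clusterTime)); // false
```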

You now have a working replicaset with mixed authentication requirements: using the same auth parameters, the primary returns a dummy-signed $clusterTime with keyId: 0, while the secondaries return a signed $clusterTime with a real keyId. When a single client instance connects to all replicaset nodes, that state creates a race between the client advancing its $clusterTime timestamp and the secondaries updating theirs. To actually observe the KeyNotFound error, the secondaries must have a $clusterTime timestamp that is behind both the primary and the client that is gossiping $clusterTime. In that case, a client that writes to the primary and then reads from a secondary will get a KeyNotFound error from that secondary until the secondary "catches up" to the client's $clusterTime timestamp.

Note that I confirmed this works with server v4.0.3 and 5.0.0, but I believe this affects every server version that supports transitionToAuth.
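The race described above can be illustrated with a toy model. This is a sketch in plain JavaScript built only from the behavior described in this ticket, not from server code: `ModelNode`, `ModelClient`, and `simulate` are hypothetical names, cluster times are plain integers, and the key ID 12345 is an arbitrary placeholder. The model assumes a node checks the gossiped signature only when the gossiped time is newer than its own, which matches the observation that the error disappears once the secondary catches up.

```javascript
const DUMMY_KEY_ID = 0;

// Toy stand-in for a mongod; models only the $clusterTime handling.
class ModelNode {
  constructor({ requiresAuth, keyId, clusterTime }) {
    this.requiresAuth = requiresAuth; // true: --auth, false: --transitionToAuth
    this.keyId = keyId;               // placeholder for the node's real signing key ID
    this.clusterTime = clusterTime;   // node's current cluster time (integer in this model)
  }

  // Handle a command carrying the client's gossiped $clusterTime (or null).
  runCommand(gossiped) {
    if (gossiped && gossiped.clusterTime > this.clusterTime) {
      // A newer time must be validated before the node advances its clock.
      if (this.requiresAuth && gossiped.keyId !== this.keyId) {
        return { ok: 0, codeName: "KeyNotFound" };
      }
      this.clusterTime = gossiped.clusterTime;
    }
    const keyId = this.requiresAuth ? this.keyId : DUMMY_KEY_ID;
    return { ok: 1, $clusterTime: { clusterTime: this.clusterTime, keyId } };
  }
}

// Toy client that gossips the highest $clusterTime it has seen.
class ModelClient {
  constructor() {
    this.clusterTime = null;
  }

  send(node) {
    const reply = node.runCommand(this.clusterTime);
    const seen = reply.$clusterTime;
    if (seen && (!this.clusterTime || seen.clusterTime > this.clusterTime.clusterTime)) {
      this.clusterTime = seen;
    }
    return reply;
  }
}

function simulate() {
  const primary = new ModelNode({ requiresAuth: false, keyId: 12345, clusterTime: 10 });
  const secondary = new ModelNode({ requiresAuth: true, keyId: 12345, clusterTime: 5 });
  const client = new ModelClient();

  client.send(primary);                   // client learns time 10 with keyId 0
  const before = client.send(secondary);  // secondary is behind: signature is checked
  secondary.clusterTime = 10;             // replication catches the secondary up
  const after = client.send(secondary);   // gossiped time is no longer newer

  return [before, after].map((r) => (r.ok === 1 ? "ok" : r.codeName));
}

console.log(simulate()); // [ 'KeyNotFound', 'ok' ]
```

In this model the failure window is exactly the period in which the secondary's cluster time is older than the dummy-signed time held by the client.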
Sprint:
Security 2022-02-07, Security 2022-02-21, Security 2022-03-07, Security 2022-05-02, Security 2022-07-25, Security 2022-08-08, Security 2022-08-22, Security 2022-09-05, Security 2022-09-19, Security 2022-10-03, Security 2022-10-17, Security 2022-10-31, Security 2022-11-14, Security 2022-11-28, Security 2022-12-12, Security 2022-12-26, Security 2023-01-09, Security 2023-01-23, Security 2023-02-06
CAR Domain/s:
None

Aha! Reference:
None
Tracking Level:
None
Risk Status:
None
Exec Notes:
None
Goal Name(s):
None
Goal Link:
None

Using the transitionToAuth flag, it's possible to create a working replicaset with mixed authentication requirements that will return a dummy-signed $clusterTime document from the primary and actually signed $clusterTime documents from the secondaries using the same client authentication parameters (see this comment for more info). In that case, a client may attempt to gossip a dummy-signed $clusterTime to nodes that require a real $clusterTime signature. If that happens and the secondaries have a $clusterTime timestamp older than the client's, the secondaries will return a KeyNotFound error instead of the expected response.

See https://jira.mongodb.org/browse/DRIVERS-1904 for the related drivers ticket.

related to

DRIVERS-1904 Handle invalid $clusterTime documents when gossiping cluster time

Backlog
