-
Type: Improvement
-
Resolution: Fixed
-
Priority: Major - P3
-
Affects Version/s: None
-
Component/s: Cluster Management
-
None
-
Not Needed
-
Summary
When a replica IP changes to a different IP address and its previous IP address is assigned to a replica from another replicaset (e.g. due to a kubernetes pod reboot) and the DNS cache does not refresh yet, the driver can connect to the replica of another replicaset and get its keyId. As a result, the keyId will be put in the driver's session and can be used, leading to the following error:
No keys found for HMAC that is valid for time[...]
MongoDB 6.0.8 replicaset topology: PSSSA and mongodb-driver-sync 4.8.2 but we have also seen this on other versions of the driver
Replicas are configured with hostnames.
Additional Background
The following description shows one of the observed cases. For obvious reasons, I replaced the real addresses and keys with other literals.
Below you can find how to reproduce it.
The client was connected to a PSSSA replica set, let's call it a replicaset-a
One of the members of the replica set (let's call it replicaset-a-member-1) works on some physical machine and there are members of other replicasets on this machine as well. Among them there is a member from the repliaset-b, let's call it replicaset-b-member-1
The physical server has been rebooted.
In the logs on the client side, we see information about the member being disconnected:
The server is in quiesce mode and will shut down' on server replicaset-a-member-1
After a short time, the pods began to start, the first one to get up was replicaset-b-member-1, and it was assigned the IP address that belonged to replicaset-a-member-1 a moment ago.
In the client's logs we see 2 entries.
The first one says that the driver has connected again to the replicaset-a-member-1, but in the response we can see that they data came from replicaset-b:
Log-entry-1:
Monitor thread successfully connected to server with description ServerDescription{address=replicaset-a-member-1, type=REPLICA_SET_SECONDARY, state=CONNECTED, ok=true, minWireVersion=0, maxWireVersion=17, maxDocumentSize=16777216, logicalSessionTimeoutMinutes=30, roundTripTimeNanos=622150, setName='replicaset-b', canonicalAddress=replicaset-b-address, hosts=[replicaset-b-member-1, replicaset-b-member-2, replicaset-b-member-3, replicaset-b-member-4], passives=[], arbiters=[arbiter], primary='replicaset-b-member-2',
You can see above that on the one hand the log says that the client has connected to a member of the replicaset-a, but the topology itself comes from replicaset-b.
A second entry immediately follows:
Log-entry-2:
Expecting replica set member from set 'replicaset-a', but found one from set 'replicaset-b'. Removing replicaset-a-member-1 from client view of cluster
So it looks like the driver figures out that it has connected to the wrong replica set and removes the member from the list of available ones.
These two messages repeat for some time (until the real replicaset-a-member-1 starts), and then the following entry appears:
Adding discovered server replicaset-a-member-1 to client view of cluster
So, the replicaset-a node starts up, gets a different IP address, the driver connects to it, and everything looks correct.
Unfortunately, from that point on, every query to that node ends up with an HMAC error, which says that the client tried to get to replicaset-a using the keyId from replicaset-b:
{{Command failed with error 211 (KeyNotFound): 'No keys found for HMAC that is valid for time:
{ ts: Timestamp(XXX, 6130) }with id: replicaset-b-keyId' on server replicaset-a-member-1. The full response is {"ok": 0.0, "errmsg": "No keys found for HMAC that is valid for time:
{ ts: Timestamp(XXX, 6130) }with id: replicaset-b-keyId", "code": 211, "codeName": "KeyNotFound", "$clusterTime": {"clusterTime": {"$timestamp": {"t": YYY, "i": 2}}, "signature": {"hash": {"$binary": {"base64": "XYZ", "subType": "00"}}, "keyId": replicaset-a-keyId}}, "operationTime": {"$timestamp":
{"t": ZZZ, "i": 2}}}}}
Apparently, during the connection attempt loop on the old IP address, the driver somehow manages to retrieve the keyId from replicaset-b and saves it in the session.
My suspicion is that health check (which I guess is used to get keyId as well) is executed by a separate thread that is not synchronized in any way with the thread that is responsible for log-entry-1 and log-entry-2, and that it manages to execute right between those operations. That is:
- The driver connects to replicaset-b-member-1, log-entry-1 appears
- Another thread (running periodically) also connects to replicaset-b-member-1 and retrieves the wrong keyId without checking that it comes from the wrong replicaset
- The driver detects that it is a connection to another replicaset, log-entry-2 appears, but does not clear the previously loaded key-id
When the correct node comes back, the driver still has the keyId from another replicaset in the session and tries to use it, which ends in an error. Interestingly, sometimes this lasts for a short time (1 error), sometimes for minutes or hours, and in one case we've been observing this for many days, it stopped after a client application restart. I can't explain this yet.
How to Reproduce
- set the parameter heartbeatFrequencyMS=10 - this increases the probability that the thread retrieving the keyId will hit the right spot between connecting to the wrong member and removing it from the cluster view,
- in the /etc/hosts file, enter hostname of the replicaset-a-member-1 name and redirect it to the IP address of replicaset-b-member-1
- set a breakpoint here: https://github.com/mongodb/mongo-java-driver/blob/master/driver-core/src/main/com/mongodb/internal/connection/ClusterClock.java#L54 I use a conditional breakpoint so that the code falls into it only when the keyId is different than the one from the replicaset-a
- start the application and wait
- Sometimes it happens just after 1-3 minutes and sometimes I have to repeat the whole operation once or twice, sometimes removing the hosts entry also "helps", but as a rule it is pretty easy to reproduce.
Out of 5 attempts, it always succeeds once, but I would appreciate it if someone else tries it.
- related to
-
SERVER-83571 Errors like "No keys found for HMAC that is valid for time" should be included in server logs.
- Backlog
-
DRIVERS-2798 Gossiping the cluster time from monitoring connections can result in loss of availability
- In Review