[JAVA-5256] Switching replicas IP with a replica from a different replicaset can result in java driver obtaining HMAC keyId from the different replicaset Created: 02/Dec/23  Updated: 10/Jan/24  Resolved: 05/Jan/24

Status: Closed
Project: Java Driver
Component/s: Cluster Management
Affects Version/s: None
Fix Version/s: 4.11.2

Type: Improvement Priority: Major - P3
Reporter: Michał Knasiecki Assignee: Jeffrey Yemin
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
related to SERVER-83571 Errors like "No keys found for HMAC t... Backlog
related to DRIVERS-2798 Gossiping the cluster time from monit... Backlog
Documentation Changes: Not Needed
Documentation Changes Summary:

1. What would you like to communicate to the user about this feature?
2. Would you like the user to see examples of the syntax and/or executable code and its output?
3. Which versions of the driver/connector does this apply to?


 Description   

Summary

When a replica's IP address changes and its previous IP address is assigned to a replica from another replicaset (e.g. due to a Kubernetes pod reboot), and the DNS cache has not yet refreshed, the driver can connect to the replica of the other replicaset and obtain its keyId. The keyId is then stored in the driver's session and may be used later, leading to the following error:

No keys found for HMAC that is valid for time[...] 

MongoDB 6.0.8, replica set topology PSSSA, mongodb-driver-sync 4.8.2; we have also seen this on other versions of the driver.

Replicas are configured with hostnames. 

Additional Background

The following description shows one of the observed cases. For obvious reasons, I replaced the real addresses and keys with other literals.

Below you can find how to reproduce it.

The client was connected to a PSSSA replica set; let's call it replicaset-a.

One of the members of the replica set (let's call it replicaset-a-member-1) runs on a physical machine that also hosts members of other replicasets. Among them is a member of replicaset-b; let's call it replicaset-b-member-1.

The physical server was rebooted.

In the logs on the client side, we see information about the member being disconnected:

'The server is in quiesce mode and will shut down' on server replicaset-a-member-1

After a short time the pods began to start; the first one up was replicaset-b-member-1, and it was assigned the IP address that had belonged to replicaset-a-member-1 a moment before.

In the client's logs we see two entries.

The first one says that the driver has reconnected to replicaset-a-member-1, but the response shows that the data actually came from replicaset-b:

Log-entry-1:

Monitor thread successfully connected to server with description ServerDescription{address=replicaset-a-member-1, type=REPLICA_SET_SECONDARY, state=CONNECTED, ok=true, minWireVersion=0, maxWireVersion=17, maxDocumentSize=16777216, logicalSessionTimeoutMinutes=30, roundTripTimeNanos=622150, setName='replicaset-b', canonicalAddress=replicaset-b-address, hosts=[replicaset-b-member-1, replicaset-b-member-2, replicaset-b-member-3, replicaset-b-member-4], passives=[], arbiters=[arbiter], primary='replicaset-b-member-2',

You can see above that the log says the client has connected to a member of replicaset-a, yet the topology itself comes from replicaset-b.

A second entry immediately follows:

Log-entry-2:

Expecting replica set member from set 'replicaset-a', but found one from set 'replicaset-b'. Removing replicaset-a-member-1 from client view of cluster

So it looks like the driver figures out that it has connected to the wrong replica set and removes the member from the list of available ones.

These two messages repeat for some time (until the real replicaset-a-member-1 starts), and then the following entry appears:

Adding discovered server replicaset-a-member-1 to client view of cluster

So, the replicaset-a node starts up, gets a different IP address, the driver connects to it, and everything looks correct.

Unfortunately, from that point on, every query to that node ends up with an HMAC error, which says that the client tried to get to replicaset-a using the keyId from replicaset-b:

Command failed with error 211 (KeyNotFound): 'No keys found for HMAC that is valid for time: { ts: Timestamp(XXX, 6130) } with id: replicaset-b-keyId' on server replicaset-a-member-1. The full response is {"ok": 0.0, "errmsg": "No keys found for HMAC that is valid for time: { ts: Timestamp(XXX, 6130) } with id: replicaset-b-keyId", "code": 211, "codeName": "KeyNotFound", "$clusterTime": {"clusterTime": {"$timestamp": {"t": YYY, "i": 2}}, "signature": {"hash": {"$binary": {"base64": "XYZ", "subType": "00"}}, "keyId": replicaset-a-keyId}}, "operationTime": {"$timestamp": {"t": ZZZ, "i": 2}}}

Apparently, during the connection attempt loop on the old IP address, the driver somehow manages to retrieve the keyId from replicaset-b and saves it in the session.
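For what it's worth, the keyId the session is holding can be observed through the session's cluster time document. Below is a minimal diagnostic sketch (hostnames and the replica set name are placeholders taken from this report); it uses the driver's ClientSession.getClusterTime() to print the $clusterTime, including the signature keyId, that the session would send back on subsequent commands.

import com.mongodb.client.ClientSession;
import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import org.bson.BsonDocument;
import org.bson.BsonInt32;

public class ClusterTimeProbe {
    public static void main(String[] args) {
        // Placeholder connection string for the affected replicaset-a deployment.
        try (MongoClient client = MongoClients.create(
                "mongodb://replicaset-a-member-1:27017/?replicaSet=replicaset-a");
             ClientSession session = client.startSession()) {

            // Run any command so the session gossips cluster time with the server.
            client.getDatabase("admin")
                    .runCommand(session, new BsonDocument("ping", new BsonInt32(1)));

            // The session's cluster time document carries the signature and keyId
            // that the driver will send on subsequent commands.
            BsonDocument clusterTime = session.getClusterTime();
            System.out.println(clusterTime == null ? "no cluster time yet" : clusterTime.toJson());
        }
    }
}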

My suspicion is that the health check (which I guess is used to obtain the keyId as well) is executed by a separate thread that is not synchronized in any way with the thread responsible for log-entry-1 and log-entry-2, and that it manages to run right between those operations. That is:

  1. The driver connects to replicaset-b-member-1, log-entry-1 appears
  2. Another thread (running periodically) also connects to replicaset-b-member-1 and retrieves the wrong keyId, without noticing that it comes from a different replicaset
  3. The driver detects that it is a connection to another replicaset, log-entry-2 appears, but it does not clear the previously loaded keyId

When the correct node comes back, the driver still has the keyId from the other replicaset in the session and tries to use it, which ends in an error. Interestingly, sometimes this lasts only briefly (a single error), sometimes for minutes or hours, and in one case we observed it for many days; it stopped only after a client application restart. I can't explain this yet.
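If my understanding is right, the core of the issue is that the cluster time is advanced with a simple "greatest timestamp wins" rule that carries the whole signed document (keyId included) along, with no notion of which replica set it came from. The sketch below is a deliberately simplified, hypothetical model of that kind of logic; class and method names are made up and this is not the driver's actual implementation (see ClusterClock, linked in the reproduction steps, for the real code).

import org.bson.BsonDocument;
import org.bson.BsonTimestamp;

// Hypothetical, simplified model of a "greatest cluster time wins" clock.
final class NaiveClusterClock {
    private BsonDocument current;  // full $clusterTime document, signature/keyId included

    synchronized void advance(BsonDocument candidate) {
        if (candidate == null) {
            return;
        }
        // Only the timestamp is compared; the origin of the document is not.
        // A $clusterTime gossiped from replicaset-b can therefore replace the one
        // from replicaset-a, and its keyId will be sent on later commands.
        if (current == null || timestamp(candidate).compareTo(timestamp(current)) > 0) {
            current = candidate;
        }
    }

    synchronized BsonDocument current() {
        return current;
    }

    private static BsonTimestamp timestamp(BsonDocument clusterTime) {
        return clusterTime.getTimestamp("clusterTime");
    }
}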

How to Reproduce

  1. set the parameter heartbeatFrequencyMS=10 - this increases the probability that the thread retrieving the keyId will hit the window between connecting to the wrong member and removing it from the cluster view (see the settings sketch after this list),
  2. in the /etc/hosts file, add an entry for the replicaset-a-member-1 hostname and point it at the IP address of replicaset-b-member-1
  3. set a breakpoint here: https://github.com/mongodb/mongo-java-driver/blob/master/driver-core/src/main/com/mongodb/internal/connection/ClusterClock.java#L54 - I use a conditional breakpoint so that it triggers only when the keyId differs from the one belonging to replicaset-a
  4. start the application and wait
  5. Sometimes it happens after just 1-3 minutes, and sometimes I have to repeat the whole operation once or twice; sometimes removing the hosts entry also "helps". As a rule it is pretty easy to reproduce.
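For concreteness, a minimal client setup matching these steps could look like the sketch below. This is a hedged example rather than a turnkey repro: hostnames, the replica set name, and the database/collection are placeholders, and /etc/hosts is assumed to already point replicaset-a-member-1 at the IP of replicaset-b-member-1. Step 1 corresponds to the heartbeatFrequency server setting (the programmatic equivalent of heartbeatFrequencyMS=10).

import com.mongodb.ConnectionString;
import com.mongodb.MongoClientSettings;
import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;

import java.util.concurrent.TimeUnit;

public class ReproClient {
    public static void main(String[] args) {
        MongoClientSettings settings = MongoClientSettings.builder()
                .applyConnectionString(new ConnectionString(
                        "mongodb://replicaset-a-member-1:27017/?replicaSet=replicaset-a"))
                // Equivalent to heartbeatFrequencyMS=10: the monitor re-checks members
                // every 10 ms, increasing the chance of hitting the window described above.
                .applyToServerSettings(builder ->
                        builder.heartbeatFrequency(10, TimeUnit.MILLISECONDS))
                .build();

        try (MongoClient client = MongoClients.create(settings)) {
            // Issue reads and watch for the KeyNotFound (code 211) error.
            client.getDatabase("test").getCollection("probe").estimatedDocumentCount();
        }
    }
}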

Out of five attempts it always succeeds at least once, but I would appreciate it if someone else tried it as well.

 



 Comments   
Comment by Githook User [ 05/Jan/24 ]

Author:

{'name': 'Jeff Yemin', 'email': 'jeff.yemin@mongodb.com', 'username': 'jyemin'}

Message: Don't gossip cluster time from monitoring connections (#1276)

JAVA-5256
Branch: 4.11.x
https://github.com/mongodb/mongo-java-driver/commit/9a3c9ddc3b56eb80cb4e0ed9f3617c7c31fdb8d3

Comment by Githook User [ 05/Jan/24 ]

Author:

{'name': 'Jeff Yemin', 'email': 'jeff.yemin@mongodb.com', 'username': 'jyemin'}

Message: Don't gossip cluster time from monitoring connections (#1276)

JAVA-5256
Branch: master
https://github.com/mongodb/mongo-java-driver/commit/1955b75fbca2029014d4abfe55303e3f4639c061

Comment by Jeffrey Yemin [ 20/Dec/23 ]

However, the situation you described is actually possible, and from what I understand the fix will probably require a change on the server side..

Not sure what the server can do if a client is accidentally connecting to two different deployments. Erroring is maybe the best outcome in that situation.

Comment by Michał Knasiecki [ 20/Dec/23 ]

jeff.yemin@mongodb.com After thinking about it, I came to the conclusion that the problem you described with mongos is unlikely to occur in my environment, because we do not share physical servers between mongos instances belonging to different clusters (as I said before, we have fewer sharded clusters, so we can spread them easily among all the servers). Since mongos pods of different clusters never run on the same host, there is no chance for them to exchange IP addresses. However, the situation you described is actually possible, and from what I understand the fix will probably require a change on the server side...

Comment by Michał Knasiecki [ 19/Dec/23 ]

Hi jeff.yemin@mongodb.com
I've never had this issue on a sharded cluster, but 99% of our clusters are replicasets, so it could be because of this.

Comment by Jeffrey Yemin [ 19/Dec/23 ]

Hi mknasiecki@gmail.com I'm just wondering if you have any sharded cluster use cases as well. I'm concerned that the sharded scenario is even worse, as, unlike with replica set members, there is no check the driver can perform to verify that all the mongos servers belong to the same deployment.

Comment by Michał Knasiecki [ 11/Dec/23 ]

Hello jeff.yemin@mongodb.com,

when it comes to different ports: we tested this with two replicasets on different ports and it was fine, so yes - using different ports solves the problem (or maybe I should say: keeps the problem from appearing). The thing is that we have a lot of replicasets and we do not manage them manually but with a special orchestrator (you can think of it as a k8s operator, though that's not 100% accurate), so switching to different ports would mean a lot of work and a long deployment process for us. That's why we're hoping for this fix in the driver.

 

And regarding the number of reports: when I was searching the Internet for this error I came across several reports that ended with a suggestion to update the server or the driver, but there was no confirmation that this helped (like this). There were also people who said that after a reboot everything started working fine (like this). So I think there are some people who haven't solved this problem yet, or the problem solved itself. Moreover, it is very rare in a small-scale environment. The problem also affects only virtual environments with dynamic IPs, like k8s. I have a strong feeling that most people who use k8s and Community Edition have at most a few clusters there, and anyone who has a lot of them chooses Atlas. Although I'm not sure how it works in the Percona operator - whether they use the same or different ports.

 

 

Comment by Jeffrey Yemin [ 11/Dec/23 ]

Hi mknasiecki@gmail.com We're trying to figure out why we're not getting more widespread reports of this from our other Kubernetes users. Is there anything that you're doing that seems out of the ordinary? The one thing that might be unusual is:

One of the members of the replica set works on some physical machine and there are members of other replica sets on this machine as well.

And in addition the replica set members run on the same port.

Would it be possible, for example, to configure each replica set to connect on a distinct port? I wonder if that's what others are doing.
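For illustration only (hostnames, ports, and set names below are made up), the idea would be that each replica set listens on its own dedicated port, so a pod that inherits another set's IP address would at least not be listening on the port the driver expects:

// Hypothetical connection strings: one dedicated port per replica set.
String replicaSetA = "mongodb://host-1:27101,host-2:27101,host-3:27101/?replicaSet=replicaset-a";
String replicaSetB = "mongodb://host-4:27102,host-5:27102,host-6:27102/?replicaSet=replicaset-b";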

Comment by Michał Knasiecki [ 08/Dec/23 ]

jeff.yemin@mongodb.com thanks for taking care of this even though you were unable to reproduce the error. I built and tested the version from this branch and it seems to work!

I wasn't able to reproduce the bug on the patched version; then - just to be sure - I reverted to 4.8.2 and it occurred again.

I'm looking forward to the release.

Thanks again.

Comment by Jeffrey Yemin [ 07/Dec/23 ]

Hi mknasiecki@gmail.com I thought a bit more about this and may have a driver fix for you. If you could build and test the code from https://github.com/mongodb/mongo-java-driver/pull/1276, I would be much obliged as I'm not in a great position to reproduce the scenario you described.

Comment by Jeffrey Yemin [ 07/Dec/23 ]

Hi there, thank you for reaching out. The driver appears to be operating per spec, so I don't think there is a fix we can just make in the driver to address this issue. I think it would benefit from a more holistic view through other support channels, e.g.:

  • Our MongoDB support portal, located at support.mongodb.com
  • Our MongoDB community portal, located here
  • If you are an Atlas customer, you can also: Click the in-app chat icon in the lower right corner to chat with a MongoDB Support representative OR Click Support in the left-hand navigation to view Developer Resources.

If you decide to pursue this further, please include a link to this issue so that others know that you already tried this avenue.

Thank you!

Comment by PM Bot [ 02/Dec/23 ]

Hi mknasiecki@gmail.com, thank you for reporting this issue! The team will look into it and get back to you soon.
