[JAVA-5256] Switching a replica's IP with a replica from a different replicaset can result in the Java driver obtaining the HMAC keyId from the other replicaset Created: 02/Dec/23 Updated: 10/Jan/24 Resolved: 05/Jan/24 |
|
| Status: | Closed |
| Project: | Java Driver |
| Component/s: | Cluster Management |
| Affects Version/s: | None |
| Fix Version/s: | 4.11.2 |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Michał Knasiecki | Assignee: | Jeffrey Yemin |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
| Documentation Changes: | Not Needed |
| Description |
Summary

When a replica's IP changes and its previous IP address is assigned to a replica from another replicaset (e.g. due to a Kubernetes pod reboot) while the DNS cache has not yet refreshed, the driver can connect to the replica of the other replicaset and obtain its keyId. The keyId is then stored in the driver's session and reused, leading to a KeyNotFound (code 211) error on every subsequent operation; the exact error is shown in the Additional Background below.
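From the application's point of view the failure surfaces as a MongoCommandException with error code 211 (KeyNotFound). A minimal sketch of where it shows up; the connection string, database, and collection names are illustrative only and reuse the placeholder hostnames from this report:

```java
import com.mongodb.MongoCommandException;
import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;

public final class KeyNotFoundExample {
    public static void main(String[] args) {
        try (MongoClient client = MongoClients.create(
                "mongodb://replicaset-a-member-1,replicaset-a-member-2/?replicaSet=replicaset-a")) {
            try {
                client.getDatabase("test").getCollection("coll").find().first();
            } catch (MongoCommandException e) {
                // 211 == KeyNotFound: the server rejects the $clusterTime the driver gossiped,
                // because it was signed with a keyId from a different replica set.
                System.out.println(e.getErrorCode() + " " + e.getErrorCodeName());
            }
        }
    }
}
```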
MongoDB 6.0.8, replicaset topology PSSSA, and mongodb-driver-sync 4.8.2 (we have also seen this on other versions of the driver). Replicas are configured with hostnames.

Additional Background

The following description shows one of the observed cases. For obvious reasons, I replaced the real addresses and keys with other literals. Below you can find how to reproduce it.

The client was connected to a PSSSA replica set, let's call it replicaset-a. One of the members of the replica set (let's call it replicaset-a-member-1) runs on a physical machine that also hosts members of other replicasets; among them is a member from replicaset-b, let's call it replicaset-b-member-1.

The physical server was rebooted. In the logs on the client side, we see information about the member being disconnected:

'The server is in quiesce mode and will shut down' on server replicaset-a-member-1

After a short time, the pods began to start. The first one to come up was replicaset-b-member-1, and it was assigned the IP address that had belonged to replicaset-a-member-1 a moment earlier. In the client's logs we see two entries. The first one says that the driver has connected again to replicaset-a-member-1, but the response shows that the data came from replicaset-b:

Log-entry-1: Monitor thread successfully connected to server with description ServerDescription{address=replicaset-a-member-1, type=REPLICA_SET_SECONDARY, state=CONNECTED, ok=true, minWireVersion=0, maxWireVersion=17, maxDocumentSize=16777216, logicalSessionTimeoutMinutes=30, roundTripTimeNanos=622150, setName='replicaset-b', canonicalAddress=replicaset-b-address, hosts=[replicaset-b-member-1, replicaset-b-member-2, replicaset-b-member-3, replicaset-b-member-4], passives=[], arbiters=[arbiter], primary='replicaset-b-member-2',

You can see above that on the one hand the log says the client has connected to a member of replicaset-a, but the topology itself comes from replicaset-b. A second entry immediately follows:

Log-entry-2: Expecting replica set member from set 'replicaset-a', but found one from set 'replicaset-b'. Removing replicaset-a-member-1 from client view of cluster

So it looks like the driver figures out that it has connected to the wrong replica set and removes the member from the list of available ones. These two messages repeat for some time (until the real replicaset-a-member-1 starts), and then the following entry appears:

Adding discovered server replicaset-a-member-1 to client view of cluster

So the replicaset-a node starts up, gets a different IP address, the driver connects to it, and everything looks correct. Unfortunately, from that point on, every query to that node ends with an HMAC error, which says the client tried to reach replicaset-a using the keyId from replicaset-b:

Command failed with error 211 (KeyNotFound): 'No keys found for HMAC that is valid for time: { ts: Timestamp(XXX, 6130) }with id: replicaset-b-keyId' on server replicaset-a-member-1.
The full response is {"ok": 0.0, "errmsg": "No keys found for HMAC that is valid for time: { ts: Timestamp(XXX, 6130) }with id: replicaset-b-keyId", "code": 211, "codeName": "KeyNotFound", "$clusterTime": {"clusterTime": {"$timestamp": {"t": YYY, "i": 2}}, "signature": {"hash": {"$binary": {"base64": "XYZ", "subType": "00"}}, "keyId": replicaset-a-keyId}}, "operationTime": {"$timestamp": {"t": ZZZ, "i": 2}}}

Apparently, during the connection attempt loop on the old IP address, the driver somehow manages to retrieve the keyId from replicaset-b and saves it in the session. My suspicion is that the health check (which I guess is used to obtain the keyId as well) is executed by a separate thread that is not synchronized in any way with the thread responsible for log-entry-1 and log-entry-2, and that it manages to run right between those two operations.
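To illustrate the suspected mechanism, here is a simplified sketch; the class and method names are hypothetical and do not correspond to the driver's internals. The driver keeps a single shared "highest cluster time seen so far" value, advances it from command responses, and echoes the winning document (including its keyId signature) back to servers on subsequent commands. If a response received on the monitoring connection from the replicaset-b pod is allowed to advance that value, the shared cluster time ends up signed with replicaset-b's keyId, and replicaset-a later rejects it with KeyNotFound.

```java
import org.bson.BsonDocument;
import org.bson.BsonTimestamp;

// Hypothetical, simplified stand-in for the driver's shared cluster-time tracking.
final class SharedClusterTime {
    private BsonDocument current; // highest $clusterTime seen so far, gossiped on every command

    // Before the fix, responses from monitoring (heartbeat) connections could also end up here.
    synchronized void advance(BsonDocument candidate) {
        if (candidate == null) {
            return;
        }
        if (current == null || timestamp(candidate).compareTo(timestamp(current)) > 0) {
            // The whole document wins, including the signature/keyId of the set that produced it.
            current = candidate;
        }
    }

    synchronized BsonDocument current() {
        return current;
    }

    private static BsonTimestamp timestamp(BsonDocument clusterTime) {
        return clusterTime.getTimestamp("clusterTime");
    }
}
```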
When the correct node comes back, the driver still has the keyId from the other replicaset in the session and tries to use it, which ends in an error. Interestingly, sometimes this lasts for a short time (a single error), sometimes for minutes or hours, and in one case we observed it for many days; it stopped only after a client application restart. I can't explain this yet.

How to Reproduce
Out of 5 attempts it always succeeds at least once, but I would appreciate it if someone else tried it as well.
|
| Comments |
| Comment by Githook User [ 05/Jan/24 ] |
|
Author: {'name': 'Jeff Yemin', 'email': 'jeff.yemin@mongodb.com', 'username': 'jyemin'}
Message: Don't gossip cluster time from monitoring connections (#1276)
|
| Comment by Githook User [ 05/Jan/24 ] |
|
Author: {'name': 'Jeff Yemin', 'email': 'jeff.yemin@mongodb.com', 'username': 'jyemin'}
Message: Don't gossip cluster time from monitoring connections (#1276)
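For context, the change referenced above (PR #1276) amounts to no longer advancing the shared cluster time from responses received on monitoring connections. A rough sketch of that idea, reusing the hypothetical SharedClusterTime class from the description; this is not the driver's actual code:

```java
import org.bson.BsonDocument;

// Hypothetical illustration of the idea behind the change: $clusterTime gossiped in
// responses on monitoring/heartbeat connections is ignored, so only responses from
// operation connections can advance the shared cluster time.
final class ClusterTimeGossip {
    private final SharedClusterTime sharedClusterTime = new SharedClusterTime();

    void onResponse(BsonDocument response, boolean fromMonitoringConnection) {
        BsonDocument gossiped = response.getDocument("$clusterTime", null);
        if (gossiped != null && !fromMonitoringConnection) {
            sharedClusterTime.advance(gossiped);
        }
    }
}
```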
|
| Comment by Jeffrey Yemin [ 20/Dec/23 ] |
Not sure what the server can do if a client accidentally connects to two different deployments. Erroring is maybe the best outcome in that situation. |
| Comment by Michał Knasiecki [ 20/Dec/23 ] |
|
jeff.yemin@mongodb.com After thinking about it, I came to the conclusion that the problem you described with mongos is unlikely to occur in my environment, because we do not share physical servers between mongos instances belonging to different clusters (as I said before, we have fewer sharded clusters, so we can spread them easily among all the servers). Since mongos pods of different clusters never run on the same host, there is no chance of an IP address swap between them. However, the situation you described is indeed possible, and from what I understand the fix will probably require a change on the server side... |
| Comment by Michał Knasiecki [ 19/Dec/23 ] |
|
Hi jeff.yemin@mongodb.com |
| Comment by Jeffrey Yemin [ 19/Dec/23 ] |
|
Hi mknasiecki@gmail.com, I'm just wondering if you have any sharded cluster use cases as well. I'm concerned that the sharded scenario is even worse: unlike with replica set members, there is no check the driver can perform to verify that all the mongos servers belong to the same deployment. |
| Comment by Michał Knasiecki [ 11/Dec/23 ] |
|
Hello jeff.yemin@mongodb.com, when it comes to different ports: we tested this with two replicasets on different ports and it was fine, so yes, using different ports solves the problem (or maybe I should say: makes the problem not appear). The thing is that we have a lot of replicasets and we do not manage them manually but with a special orchestrator (you can think of it as a k8s operator, although that's not 100% accurate), so switching to different ports would mean a lot of work for us and a long deployment process. That's why we're hoping for a fix in the driver.

And regarding the number of reports: when I was searching the Internet for this error, I came across several reports that ended with a suggestion to update the server or the driver, but with no confirmation that it helped (like this). There were also people who said that after a reboot everything started working fine (like this). So I think there are some people who haven't solved this problem yet, or the problem solved itself. Moreover, it is very rare in a small-scale environment. The problem also affects only virtual environments with dynamic IPs, like k8s. I have a strong feeling that most people who use k8s and Community Edition have at most a few clusters there, and those who have a lot of them choose Atlas. Although I'm not sure how it works in the Percona operator - whether they use the same or different ports.
|
| Comment by Jeffrey Yemin [ 11/Dec/23 ] |
|
Hi mknasiecki@gmail.com, we're trying to figure out why we're not getting more widespread reports of this from our other Kubernetes users. Is there anything you're doing that seems out of the ordinary? The one thing that might be unusual is:

And in addition, the replica set members run on the same port. Would it be possible, for example, to configure each replica set to connect on a distinct port? I wonder if that's what others are doing. |
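To make the suggestion concrete, here is a rough sketch of what distinct per-replica-set ports look like from the client side; the hostnames and port numbers are made up, and on the server side each set's mongod processes would of course also have to listen on that port:

```java
import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;

public final class DistinctPortsPerReplicaSet {
    public static void main(String[] args) {
        // All members of replicaset-a listen on 27101 and all members of replicaset-b on 27102,
        // so a pod that inherits another set's IP address is not reachable on the wrong set's port.
        try (MongoClient clientA = MongoClients.create(
                "mongodb://replicaset-a-member-1:27101,replicaset-a-member-2:27101/?replicaSet=replicaset-a");
             MongoClient clientB = MongoClients.create(
                "mongodb://replicaset-b-member-1:27102,replicaset-b-member-2:27102/?replicaSet=replicaset-b")) {
            System.out.println(clientA.getClusterDescription().getShortDescription());
            System.out.println(clientB.getClusterDescription().getShortDescription());
        }
    }
}
```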
| Comment by Michał Knasiecki [ 08/Dec/23 ] |
|
jeff.yemin@mongodb.com thanks for taking care of this even though you were unable to reproduce the error. I built and tested the version from that branch and it seems to work! I wasn't able to reproduce the bug on the patched version; then, just to be sure, I reverted to 4.8.2 and it occurred again. I'm looking forward to the release. Thanks again. |
| Comment by Jeffrey Yemin [ 07/Dec/23 ] |
|
Hi mknasiecki@gmail.com, I thought a bit more about this and may have a driver fix for you. If you could build and test the code from https://github.com/mongodb/mongo-java-driver/pull/1276, I would be much obliged, as I'm not in a great position to reproduce the scenario you described. |
| Comment by Jeffrey Yemin [ 07/Dec/23 ] |
|
Hi there, thank you for reaching out. The driver appears to be operating per spec, so I don't think there is a fix we can simply make in the driver to address this issue. I think it would benefit from a more holistic view through other support channels, e.g.:
If you decide to pursue this further, please include a link to this issue so that others know that you already tried this avenue. Thank you! |
| Comment by PM Bot [ 02/Dec/23 ] |
|
Hi mknasiecki@gmail.com, thank you for reporting this issue! The team will look into it and get back to you soon. |