[SERVER-79425] Internal clients add the node's hostname to client metadata when connecting to others
Created: 27/Jul/23  Updated: 26/Sep/23

Status: Backlog
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Improvement
Priority: Major - P3
Reporter: Eric Sedor
Assignee: Backlog - Service Architecture
Resolution: Unresolved
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
Assigned Teams:
Service Arch
Sprint: Service Arch 2023-08-21, Service Arch 2023-09-04
Participants:

 Description   

Currently, when connecting to other nodes in a cluster, internal clients identify themselves in client metadata by a driver.name of the form "NetworkInterfaceTL-*".

It would be a quick win for traceability in sharded clusters if nodes also identified themselves by hostname, using the same value as the node's own hostInfo().hostname.
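
As a rough illustration of the request (not the server's actual implementation), the metadata document could carry one extra field. In the sketch below, the top-level "hostname" key, the function name, and the use of socket.gethostname() as a stand-in for hostInfo().hostname are all assumptions made for illustration:

```python
import platform
import socket


def build_internal_client_metadata(instance_name, server_version, hostname=None):
    """Return a client-metadata-like dict for an internal connection.

    The "hostname" key is a hypothetical addition; the key name and its
    placement in the document are assumptions, not an agreed design.
    """
    if hostname is None:
        # Stand-in for the node's own hostInfo().hostname value.
        hostname = socket.gethostname()
    return {
        "driver": {
            "name": "NetworkInterfaceTL-" + instance_name,
            "version": server_version,
        },
        "os": {
            "type": platform.system(),
            "architecture": platform.machine(),
        },
        "hostname": hostname,
    }
```

The accepting node would then see the connecting node's self-reported hostname in the same log line that already carries driver.name.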



 Comments   
Comment by Phoebe Du [ 26/Sep/23 ]

Hi eric.sedor@mongodb.com, I'm sending this ticket to the backlog for now. It will be addressed as part of the Observability Infrastructure work.

Comment by Eric Sedor [ 20/Sep/23 ]

george.wangensteen@mongodb.com, the core of this request is for the connecting mongos or mongod to identify itself logically, so that we understand which process is connecting from a cluster topology perspective without having to inspect network topology at all. So I agree that hostname alone might not be enough.

I'm also not at all clear what various network gateways might cause a hostname to resolve to.

We want to know just by consuming the log file that operations are coming from "shard 4, node 2" for example. So, an id that included both replica set name and replica set id feels like the best option for shard members. But that wouldn't by itself help with mongoses.

Comment by George Wangensteen [ 13/Sep/23 ]

Also I will put this on the backlog for now, but please change the status to "needs scheduling" after you reply so we can re-triage this! Thanks eric.sedor@mongodb.com

Comment by George Wangensteen [ 13/Sep/23 ]

Hey eric.sedor@mongodb.com - I definitely think we can do something here and agree that `NetworkInterfaceTL-` is not the ideal driver name prefix.
I wanted to dig into the requirements a bit more before we start typing.
Today, as you mentioned, connections from one MongoDB server to another usually send NetworkInterfaceTL-<SomeMoreSpecificName, like ShardRegistry> as the driver name for the connection. The server accepting the connection then logs this as part of the client metadata log line (https://github.com/mongodb/mongo/blob/35a4ab98a2378bc5699abe9cdc1e554da7ec79d7/src/mongo/rpc/metadata/client_metadata.cpp#L418-L424). In practice, this looks something like this:

{"t":{"$date":"2023-08-16T19:50:48.215+00:00"},"s":"I",  "c":"NETWORK",  "id":51800,   "ctx":"conn73","msg":"client metadata","attr":{"remote":"10.2.0.201:34874","client":"conn73","doc":{"driver":{"name":"NetworkInterfaceTL-ReplNodeDbWorkerNetwork","version":"7.1.0-alpha-2692-ga21a9d1-sys-perf-patch-64da74762a60edbc37ae65f4"},"os":{"type":"Linux","name":"Amazon Linux release 2 (Karoo)","architecture":"aarch64","version":"Kernel 4.14.294-220.533.amzn2.aarch64"}}}}

In the "remote" field of this log line, the server accepting the connection should also already log the host of the remote server that is initiating the connection. However, it looks like we usually log it as an IP address instead of a hostname.

This has a few drawbacks I can think of vs. hostname:

  • A single hostname could have multiple IPs
  • DNS might change, so the hostname <--> IP relationship changes
  • It's just less readable, so it requires some post-processing to analyze / is harder to "just look at"
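
The post-processing mentioned in the last bullet might look roughly like the sketch below, which pulls the "remote" address and driver name out of a "client metadata" log line (id 51800, as in the example above). The function name is made up for illustration, and the sample line is an abbreviated copy of the one above with the version shortened and the "os" section dropped:

```python
import json

# Abbreviated copy of the log line quoted in this ticket.
SAMPLE = (
    '{"t":{"$date":"2023-08-16T19:50:48.215+00:00"},"s":"I","c":"NETWORK",'
    '"id":51800,"ctx":"conn73","msg":"client metadata",'
    '"attr":{"remote":"10.2.0.201:34874","client":"conn73",'
    '"doc":{"driver":{"name":"NetworkInterfaceTL-ReplNodeDbWorkerNetwork",'
    '"version":"7.1.0-alpha"}}}}'
)


def summarize_client_metadata(log_line):
    """Return (remote_host, remote_port, driver_name), or None for other lines."""
    entry = json.loads(log_line)
    if entry.get("id") != 51800:
        return None
    attr = entry["attr"]
    # Split on the last colon so the port is separated from the address.
    host, _, port = attr["remote"].rpartition(":")
    return host, int(port), attr["doc"]["driver"]["name"]


print(summarize_client_metadata(SAMPLE))
```

Even with this, the result is still an IP address and port, so mapping it back to a cluster member needs a separate lookup.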

Does this align with your understanding/is the above what makes the status-quo hard? I think I agree that hostname would help if so.

However, one thing I am concerned about is multiple servers running on the same host. In that case, I don't think hostname would disambiguate between them when they both connect to the same node. Would it be ideal to handle this case as well, e.g. by always having a server attach some sort of unique ID as metadata when talking to other nodes? If we did go that route, would we want the ID to persist across restarts, etc.? Or does this not happen enough in practice to be worth worrying about?
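
To make the "persist across restarts" option concrete, here is a minimal sketch of one way a process could keep a stable ID: generate a UUID once and store it in a file next to its data. The function name, the file location, and the choice of a random UUID are all assumptions for illustration, and the sketch ignores races between processes sharing a path:

```python
import os
import uuid


def load_or_create_node_id(path):
    """Return a stable ID stored at `path`, creating it on first use."""
    try:
        with open(path, "r", encoding="utf-8") as f:
            # Validate and normalize whatever was stored previously.
            return str(uuid.UUID(f.read().strip()))
    except FileNotFoundError:
        pass
    node_id = str(uuid.uuid4())
    tmp_path = path + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as f:
        f.write(node_id)
    # Rename into place so a crash does not leave a half-written file.
    os.replace(tmp_path, path)
    return node_id
```

Two servers on the same host would use different paths and so get different IDs, while each keeps its own ID across restarts. An ID that should not persist could simply be generated in memory at startup instead.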

Let me know what you think about the above and thanks!
