-
Type: Task
-
Resolution: Unresolved
-
Priority: Major - P3
-
None
-
Affects Version/s: None
-
Component/s: None
-
Labels:
-
Service Arch
-
134
When run on routers, killSessions "fans out" by forwarding the request to all shards, in addition to doing some local work.
When one of the remote operations fails, the command is considered a failure. We record the remotes that failed here: https://github.com/mongodb/mongo/blob/755a145ef93b6229b5887e6158c112a448672c9b/src/mongo/db/session/kill_sessions_common.cpp#L87-L105 and append them to the command response. However, the router command throws an exception when any such failure occurs: https://github.com/mongodb/mongo/blob/755a145ef93b6229b5887e6158c112a448672c9b/src/mongo/db/commands/kill_sessions_command.cpp#L146 so the information we collected is discarded: https://github.com/mongodb/mongo/blob/755a145ef93b6229b5887e6158c112a448672c9b/src/mongo/s/commands/strategy.cpp#L1293
While we probably shouldn't return this information (host and port) in the error response, we should at least log it so that it isn't lost. Otherwise, we can't tell why the router command failed or on which node the forwarded operation failed. Similarly, we should consider recording the actual error returned by the remote. Today it is collapsed into HostUnreachable, which makes the root cause less obvious: https://github.com/mongodb/mongo/blob/755a145ef93b6229b5887e6158c112a448672c9b/src/mongo/db/session/kill_sessions_common.cpp#L102
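
As a rough illustration of the proposal, the sketch below logs each failed remote together with the error it reported, then throws a generic error that carries no host or port information. It is standalone C++ and does not use the server's actual types or logging facilities: RemoteFailure, formatFailures, and failKillSessions are hypothetical names invented for this example, and the host name and error strings in the usage note are made up.

```cpp
#include <ostream>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for one failed forwarded request (not a server type).
struct RemoteFailure {
    std::string hostAndPort;  // remote the request was forwarded to
    std::string errorCode;    // error name as reported by the remote
    std::string reason;       // error message as reported by the remote
};

// Builds a single log line that keeps every remote's original error,
// rather than collapsing all of them into one generic code.
std::string formatFailures(const std::vector<RemoteFailure>& failures) {
    std::ostringstream out;
    out << "killSessions failed on " << failures.size() << " remote(s)";
    for (const auto& f : failures) {
        out << "; " << f.hostAndPort << ": " << f.errorCode << " (" << f.reason << ")";
    }
    return out.str();
}

// Writes the per-remote details to the log, then fails the command with a
// message that deliberately omits host and port information.
// Does nothing when there are no failures.
void failKillSessions(const std::vector<RemoteFailure>& failures, std::ostream& log) {
    if (failures.empty()) {
        return;
    }
    log << formatFailures(failures) << '\n';
    throw std::runtime_error("killSessions failed on one or more remote hosts");
}
```

For example, calling failKillSessions with a single RemoteFailure{"shard0.example.net:27018", "Interrupted", "operation was interrupted"} writes that host and error to the log stream, while the exception seen by the client mentions neither.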