When the AsyncRequestSender handles an error response, it tries to update the replica set monitor for the shard and retries up to 3 times, using the Shard object's targeter to resolve the host and port for each retry. However, updateReplSetMonitor does nothing for the error code InterruptedAtShutdown. That code is an Interruption error, a Shutdown error, and a Retriable error, but it is neither a NotMaster error nor a Network error, which are the categories updateReplSetMonitor looks for. As a result, the InterruptedAtShutdown error is propagated to the client except when:
- The original host starts up again before all three retries have been exhausted
- The router receives some other error in parallel that causes its ReplicaSetMonitor to refresh for that shard and find a new primary
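To make the failure mode concrete, here is a minimal sketch of the retry path as described above. It is a toy model, not the server's code: `Host`, `Outcome`, `sendTo`, `runWithRetries`, and `kMaxRetries` are hypothetical names, and the sketch assumes the old primary keeps returning InterruptedAtShutdown for the whole retry window (so neither of the two exceptions applies). It needs C++14 or later for the `constexpr` loop.

```cpp
// Hypothetical, heavily simplified model of the retry path in this ticket.
// None of these names are the server's real API.
enum class Host { OldPrimary, NewPrimary };
enum class Outcome { Ok, InterruptedAtShutdown };

constexpr int kMaxRetries = 3;  // "retry up to 3 times", per the description

// Assumption: the old primary is shutting down for the whole retry window,
// while the new primary answers normally.
constexpr Outcome sendTo(Host h) {
    return h == Host::OldPrimary ? Outcome::InterruptedAtShutdown : Outcome::Ok;
}

// Returns what the client ultimately sees. `errorMarksHostFailed` models
// whether the monitor update treats the error as a reason to mark the host
// as failed (true for NotMaster/Network errors, false for
// InterruptedAtShutdown today).
constexpr Outcome runWithRetries(bool errorMarksHostFailed) {
    Host target = Host::OldPrimary;  // what the targeter currently resolves to
    Outcome result = sendTo(target);
    for (int retry = 0; retry < kMaxRetries && result != Outcome::Ok; ++retry) {
        if (errorMarksHostFailed) {
            // The monitor refreshes, so the targeter finds the new primary.
            target = Host::NewPrimary;
        }
        // Otherwise the targeter resolves to the same host again.
        result = sendTo(target);
    }
    return result;
}
```

In this model, `runWithRetries(false)` ends with InterruptedAtShutdown after all retries hit the same host, while `runWithRetries(true)` succeeds on the first retry.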
We should change updateReplSetMonitor to include Shutdown errors in the list of errors that cause the host to be marked as failed (or otherwise handle InterruptedAtShutdown explicitly).
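A rough sketch of the proposed change, using simplified stand-ins rather than the server's real error-category machinery. The `Code` enum, the `is...Error` helpers, and the two `marksHostFailed...` predicates are all hypothetical; the category membership is taken only from the description in this ticket.

```cpp
// Simplified stand-ins for the server's error codes and categories.
// These names follow the ticket's wording; they are NOT the real definitions.
enum class Code { HostUnreachable, NotMaster, InterruptedAtShutdown, BadValue };

// Assumed category membership, limited to the codes modeled here.
constexpr bool isNetworkError(Code c) { return c == Code::HostUnreachable; }
constexpr bool isNotMasterError(Code c) { return c == Code::NotMaster; }
constexpr bool isShutdownError(Code c) { return c == Code::InterruptedAtShutdown; }

// Current behavior as described in this ticket: only NotMaster and Network
// errors cause the host to be marked as failed.
constexpr bool marksHostFailedCurrent(Code c) {
    return isNetworkError(c) || isNotMasterError(c);
}

// Proposed behavior: Shutdown errors also cause the host to be marked as failed.
constexpr bool marksHostFailedProposed(Code c) {
    return marksHostFailedCurrent(c) || isShutdownError(c);
}
```

With this predicate, InterruptedAtShutdown triggers the same handling as NotMaster, while unrelated errors such as the `BadValue` placeholder are still left alone.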
As a side note, I think the current behavior violates the retryable reads spec, but it's hard to tell for certain.