Similar to CDRIVER-2075 (retry isMaster calls once), I think it would be worthwhile for _mongoc_cluster_check_interval() to make one attempt to reestablish a closed stream before returning an error.
Using a standalone topology as the basic case, I created a PHP script that connects to localhost:27017 and issues a ping command. I served it through a single-worker web server so that every request reuses the same persistent libmongoc client. When I restarted the mongod server between requests (within the span of two seconds), the socket was left in the CLOSE_WAIT state and the subsequent request encountered a "Stream is closed" error.
socketCheckIntervalMS does not seem relevant here, as _mongoc_cluster_check_interval() returns a "Stream is closed" error before deciding whether socketCheckIntervalMS warrants an isMaster command.
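To make the ordering concrete, here is a minimal, self-contained sketch of the check as described above, plus the proposed single reconnect attempt. It is a model only and does not reproduce libmongoc's actual code: `sketch_node_t`, `sketch_reconnect()`, `sketch_check_interval()`, the result enum, and the `retry_once` flag are all hypothetical names, and the `>=` idle comparison is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a cluster node; NOT libmongoc's real struct. */
typedef struct {
   bool stream_open;       /* false once the peer has closed the socket */
   bool server_is_up;      /* simulated: would a new connection succeed? */
   int64_t last_used_ms;   /* time the stream was last used successfully */
   int reconnect_attempts; /* how many times a reconnect was tried */
} sketch_node_t;

typedef enum {
   CHECK_OK,                 /* stream is open and was used recently */
   CHECK_OK_AFTER_RECONNECT, /* stream was closed, one retry succeeded */
   CHECK_NEEDS_ISMASTER,     /* idle for at least socketCheckIntervalMS */
   CHECK_STREAM_CLOSED       /* "Stream is closed" reported to the caller */
} check_result_t;

/* Simulated reconnect: succeeds only if the server is reachable again. */
static bool
sketch_reconnect (sketch_node_t *node)
{
   node->reconnect_attempts++;
   node->stream_open = node->server_is_up;
   return node->stream_open;
}

/* The closed-stream test runs first, so the idle interval is never
 * consulted for a closed stream. With retry_once set (the proposal),
 * exactly one reconnect is attempted before giving up. */
check_result_t
sketch_check_interval (sketch_node_t *node,
                       int64_t now_ms,
                       int64_t socket_check_interval_ms,
                       bool retry_once)
{
   assert (node != NULL);

   if (!node->stream_open) {
      if (!retry_once || !sketch_reconnect (node)) {
         return CHECK_STREAM_CLOSED;
      }
      node->last_used_ms = now_ms;
      return CHECK_OK_AFTER_RECONNECT;
   }

   /* Assumed comparison: idle time at or beyond the interval. */
   if (now_ms - node->last_used_ms >= socket_check_interval_ms) {
      return CHECK_NEEDS_ISMASTER;
   }

   return CHECK_OK;
}
```

In this model, a node whose stream closed while mongod restarted yields `CHECK_STREAM_CLOSED` whatever interval is passed in, which mirrors the behavior reported here. With `retry_once` enabled, the same node recovers when the server is back up, and still fails after a single attempt when it is not.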
The original test used our default server selection (serverSelectionTryOnce=true). When I switched to serverSelectionTryOnce=false, I did not notice any change in behavior. Is that by design, since libmongoc has technically left the server selection loop at this point? If so, I think it supports the idea of retrying the connection here.
I did notice that lowering heartbeatFrequencyMS, so that a topology update occurred between restarting mongod and the subsequent PHP request, avoided the "Stream is closed" error being reported to the user.
Note: if this issue is worth implementing, we should probably also revise the SDAM spec accordingly. I can create that ticket if needed.
- depends on: DRIVERS-390 Call "ping" on a socket that has been idle for socketCheckIntervalMS (Closed)
- is depended on by: PHPC-1296 Call "ping" on a socket that has been idle for socketCheckIntervalMS (Closed)
- is related to: CDRIVER-2075 Retry ismaster calls once (Closed)
- related to: CDRIVER-2174 _mongoc_cluster_check_interval() should invalidate nodes after detecting a closed socket (Closed)