- Type: Improvement
- Resolution: Won't Fix
- Priority: Major - P3
- None
- Affects Version/s: None
- Component/s: None
- Labels: None
- Service Arch
Right now, connection pooling uses a generation index in processFailure(). This has two unintended side effects when network failures are spurious:
- A failed connection causes connections created after it to be considered failed as well. This could throw away good connections in situations with spurious failures.
- A failed connection causes a connection request (and thus an operation) to receive a non-OK status. This can happen even if the remote host had already failed the connection by the time the request came in.
We should use either a monotonic sequence id or a clock to mark each time a connection is considered "ready" (including initial creation). We should also attach these ids to requests. When a connection fails, all ids before its id should be considered failed.
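
A minimal sketch of the proposal, in Python for brevity. The `Pool` class and the names `mark_ready`, `add_request`, `process_failure`, and `failed_through` are hypothetical and are not the existing connection pool API. It assumes the monotonic sequence id variant, with connections and requests drawing ids from one shared counter.

```python
import itertools


class Pool:
    """Hypothetical per-host pool; illustrates the id scheme only."""

    def __init__(self):
        self._ids = itertools.count(1)  # monotonic sequence id source
        self.failed_through = 0         # every id <= this is considered failed
        self.ready = {}                 # connection -> id from when it last became ready
        self.requests = {}              # request -> id assigned on arrival

    def mark_ready(self, conn):
        # Stamp the connection each time it becomes ready, including creation.
        self.ready[conn] = next(self._ids)
        return self.ready[conn]

    def add_request(self, req):
        # Requests draw from the same sequence as connections.
        self.requests[req] = next(self._ids)
        return self.requests[req]

    def process_failure(self, conn):
        # Fail this connection and everything stamped at or before its id.
        self.failed_through = max(self.failed_through, self.ready[conn])
        dead_conns = sorted(c for c, i in self.ready.items() if i <= self.failed_through)
        dead_reqs = sorted(r for r, i in self.requests.items() if i <= self.failed_through)
        for c in dead_conns:
            del self.ready[c]
        for r in dead_reqs:
            del self.requests[r]
        return dead_conns, dead_reqs


pool = Pool()
pool.mark_ready("a")    # id 1
pool.add_request("r1")  # id 2
pool.mark_ready("b")    # id 3
pool.add_request("r2")  # id 4

print(pool.process_failure("a"))  # (['a'], [])
print(pool.process_failure("b"))  # (['b'], ['r1'])
```

In this sketch, failing "a" leaves the later connection "b" and both requests untouched, which addresses the two side effects listed above. Failing "b" then fails the earlier request "r1" but not "r2", which arrived after "b" was marked ready.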