When receiving a StaleConfigException from a remote server, a less aggressive process would be as follows:
1) If the version required in the exception matches the current cached version, we just need to reset the version on the connection itself. This requires passing back the desired version as a field.
2) If the version required != the current cached version, we should reload the chunk manager, but only if a reload is still needed. Reloads should be explicitly rate-limited by a mutex, and after acquiring the mutex we should check whether a newer version of the chunk manager now exists (or whether the state has changed, i.e. the collection is now removed) before reloading.
3) Full reloads of the database should also be explicitly rate-limited, to avoid forcing all connections to re-establish their state.
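Steps 1 and 2 above can be sketched roughly as follows. This is only an illustration in Python, not the actual mongos code: CollectionCache, ChunkManager, Connection, handle_stale_config and the loader callback are all hypothetical names, and versions are simplified to plain integers. The same "take the mutex, then re-check" pattern would presumably apply to the full database reloads in step 3.

```python
import threading
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ChunkManager:
    """Hypothetical stand-in for the cached routing info of one collection."""
    version: int


class Connection:
    """Hypothetical shard connection that remembers the version set on it."""
    def __init__(self) -> None:
        self.version: Optional[int] = None


class CollectionCache:
    """Hypothetical per-collection cache with mutex-guarded reloads."""

    def __init__(self, loader: Callable[[], Optional[ChunkManager]],
                 initial: Optional[ChunkManager]) -> None:
        self._loader = loader            # returns None if the collection is gone
        self._manager = initial
        self._reload_lock = threading.Lock()
        self.reload_count = 0            # only here so the sketch can be checked

    def handle_stale_config(self, wanted_version: int,
                            conn: Connection) -> Optional[ChunkManager]:
        seen = self._manager
        if seen is not None and seen.version == wanted_version:
            # Step 1: the cache is already current, so only the connection
            # needs its version reset. No reload.
            conn.version = wanted_version
            return seen

        # Step 2: the cache looks stale. Serialize reloads behind the mutex.
        with self._reload_lock:
            current = self._manager
            if current is seen:
                # Nobody reloaded while we waited, so do the reload ourselves.
                self.reload_count += 1
                current = self._loader()
                self._manager = current
            # Otherwise another caller already replaced the chunk manager
            # (or the collection was removed) and we reuse that result.

        if current is not None:
            conn.version = current.version
        return current
```

With this shape, many connections hitting the same stale version at once trigger a single reload: the first caller through the mutex reloads, and the rest see that the cached chunk manager has changed and reuse it.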
Potentially, reworking the sequence number logic to store versions per-ns, per-shard, per-connection would also significantly reduce unnecessary round-trips to the shards.
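As a rough illustration of what per-ns, per-shard, per-connection tracking could look like, here is a minimal Python sketch. It is not the existing sequence number logic: ConnectionVersionCache and its methods are hypothetical, connections are identified by an assumed integer id, and versions are again plain integers. The idea is simply that a version only needs to be sent to a shard when this exact connection has not already been told about it for this namespace.

```python
from typing import Dict, Tuple

# (connection id, shard name, namespace) -> last version sent on that connection
Key = Tuple[int, str, str]


class ConnectionVersionCache:
    """Hypothetical record of which version each connection was last sent."""

    def __init__(self) -> None:
        self._sent: Dict[Key, int] = {}

    def needs_set_version(self, conn_id: int, shard: str, ns: str,
                          version: int) -> bool:
        # A round-trip is only needed if this connection has not yet been
        # sent exactly this version for this namespace on this shard.
        return self._sent.get((conn_id, shard, ns)) != version

    def record_sent(self, conn_id: int, shard: str, ns: str,
                    version: int) -> None:
        self._sent[(conn_id, shard, ns)] = version

    def forget_connection(self, conn_id: int) -> None:
        # Drop all state for a closed connection so a reused id starts clean.
        for key in [k for k in self._sent if k[0] == conn_id]:
            del self._sent[key]
```

A caller would check needs_set_version before each operation and call record_sent only after the shard has acknowledged the version, so a version bump on one namespace does not force other namespaces or other connections to re-send theirs.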