[SERVER-21272] All NotMaster and NetworkError retries should wait before retrying Created: 03/Nov/15  Updated: 13/Oct/16  Resolved: 11/Nov/15

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: None
Fix Version/s: 3.2.0-rc3

Type: Bug Priority: Major - P3
Reporter: Kaloian Manassiev Assignee: Kaloian Manassiev
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
is related to SERVER-26590 No primary, insert through mongoS han... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Sprint: Sharding C (11/20/15)
Participants:

 Description   

Currently, not every place that retries on NotMaster and NetworkError waits before retrying. Such retries see the exact same replication state, because failover hasn't had a chance to complete.

We should audit all places where we retry and have them use a timeout of 500 msec.
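
As a rough illustration of the behaviour being asked for (not the server's actual code), a retry loop that pauses before each retry might look like the sketch below. The exception classes, `run_with_retry`, and its parameters are hypothetical names made up for this example; only the 500 msec figure comes from the description above.

```python
import time


class NotMasterError(Exception):
    """Hypothetical stand-in for a NotMaster response."""


class NetworkError(Exception):
    """Hypothetical stand-in for a network failure."""


RETRY_WAIT_SECS = 0.5  # the 500 msec mentioned in the description


def run_with_retry(operation, max_attempts=3, wait_secs=RETRY_WAIT_SECS,
                   sleep=time.sleep):
    """Run operation(), waiting before each retry on a retriable error.

    Without the wait, an immediate retry would most likely observe the
    same replication state, since failover has not had time to complete.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except (NotMasterError, NetworkError):
            if attempt == max_attempts:
                raise  # out of attempts, surface the last error
            sleep(wait_secs)  # give failover a chance to finish
```

The `sleep` parameter is injectable only so the wait can be observed or skipped in tests; a caller would normally leave it at its default.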



 Comments   
Comment by Githook User [ 11/Nov/15 ]

Author: Kaloian Manassiev (kaloianm) <kaloian.manassiev@mongodb.com>

Message: SERVER-21272 Make RemoteCommandTargeter use timeout for findHost

This change removes all back-off logic from ShardRegistry and
CatalogManagerReplicaSet and defers it all to the wait time capability of
the ReplicaSetMonitor (through RemoteCommandTargeter).
Branch: master
https://github.com/mongodb/mongo/commit/a707b9852bf6e03e7d6e6ef3ad464dbd28d690fa

Comment by Githook User [ 11/Nov/15 ]

Author: Kaloian Manassiev (kaloianm) <kaloian.manassiev@mongodb.com>

Message: SERVER-21272 Make replica set monitor retry finding hosts

This change makes the replica set monitor retry more than once to find
hosts suitable for a given read preference and fail quickly if none of the
hosts for a given replica set can be reached.
Branch: master
https://github.com/mongodb/mongo/commit/67b68b5f094d88753ae2fe14f6d708c9e5b4bfbd

Comment by Kaloian Manassiev [ 04/Nov/15 ]

I was considering them as separate requirements, but now that I think of it, that's a good point and we can consolidate waiting in the findHost logic of the targeter. We could pass the maxTimeMS value to it and it just won't return until it has found an appropriate host for the given read preference.
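
A minimal sketch of that idea, assuming a monitor that tracks only a single primary (a simplification of real read-preference matching): `find_host` blocks on a condition variable until a host is known or the time budget runs out, so callers wait on a state change rather than sleeping for a fixed interval. `HostMonitor`, `set_primary`, and `max_time_ms` are hypothetical names for illustration and are not the ReplicaSetMonitor or RemoteCommandTargeter API.

```python
import threading
import time


class HostMonitor:
    """Hypothetical, much-simplified stand-in for a replica set monitor."""

    def __init__(self):
        self._cond = threading.Condition()
        self._primary = None  # no known primary yet

    def set_primary(self, host):
        """Record a newly discovered primary and wake any waiters."""
        with self._cond:
            self._primary = host
            self._cond.notify_all()

    def find_host(self, max_time_ms):
        """Block until a primary is known or max_time_ms has elapsed."""
        deadline = time.monotonic() + max_time_ms / 1000.0
        with self._cond:
            while self._primary is None:
                remaining = deadline - time.monotonic()
                if remaining <= 0:
                    raise TimeoutError("no suitable host within max_time_ms")
                # Wakes early if set_primary() signals a state change.
                self._cond.wait(remaining)
            return self._primary
```

With this shape, a caller returns as soon as a primary appears instead of always paying the full wait, and fails with a timeout if none appears in time.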

Comment by Andy Schwerin [ 03/Nov/15 ]

Should we sleep, or wait on a state transition in the replset monitor?
