[SERVER-21272] All NotMaster and NetworkError retries should wait before retrying Created: 03/Nov/15 Updated: 13/Oct/16 Resolved: 11/Nov/15 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Sharding |
| Affects Version/s: | None |
| Fix Version/s: | 3.2.0-rc3 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Kaloian Manassiev | Assignee: | Kaloian Manassiev |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||
| Backwards Compatibility: | Fully Compatible | ||||||||
| Operating System: | ALL | ||||||||
| Sprint: | Sharding C (11/20/15) | ||||||||
| Participants: | |||||||||
| Description |
|
Currently, not every place that retries on NotMaster and NetworkError waits before retrying. Those retries therefore see exactly the same replication state, because the failover has not had a chance to complete. We should audit all places where we retry and have them wait 500 msec before retrying. |
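To illustrate the behaviour the description asks for, here is a minimal Python sketch, not the server's actual C++ code. The exception classes, `run_with_retry`, and `max_attempts` are hypothetical names introduced for illustration; only the 500 msec wait comes from the description.

```python
import time


class NotMasterError(Exception):
    """Hypothetical stand-in for a NotMaster response."""


class NetworkError(Exception):
    """Hypothetical stand-in for a network failure."""


RETRY_WAIT_SECS = 0.5  # the 500 msec proposed in the description


def run_with_retry(operation, max_attempts=3, wait_secs=RETRY_WAIT_SECS,
                   sleep=time.sleep):
    """Run operation(), waiting before each retry on a retriable error.

    max_attempts is an assumption; the ticket does not state a retry count.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except (NotMasterError, NetworkError):
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            # Give the failover time to progress so that the next attempt
            # does not observe the same replication state.
            sleep(wait_secs)
```

The `sleep` parameter is injectable only so the wait can be observed or skipped in tests; callers would normally leave it at the default.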
| Comments |
| Comment by Githook User [ 11/Nov/15 ] |
|
Author: Kaloian Manassiev (kaloianm) <kaloian.manassiev@mongodb.com>
Message: This change removes all back-off logic from ShardRegistry and
| Comment by Githook User [ 11/Nov/15 ] |
|
Author: Kaloian Manassiev (kaloianm) <kaloian.manassiev@mongodb.com>
Message: This change makes the replica set monitor retry more than once to find
| Comment by Kaloian Manassiev [ 04/Nov/15 ] |
|
I was considering them as separate requirements, but now that I think of it, that's a good point and we can consolidate waiting in the findHost logic of the targeter. We could pass the maxTimeMS value to it and it just won't return until it has found an appropriate host for the given read preference. |
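A rough Python sketch of that idea follows, assuming a much simplified monitor that tracks only a primary (a real targeter would match against the full read preference). `HostMonitor`, `set_primary`, and `find_host` are hypothetical names, not the server's API. The call blocks on a state change rather than sleeping for a fixed interval, and gives up once `max_time_ms` has elapsed.

```python
import threading
import time


class HostMonitor:
    """Hypothetical, simplified stand-in for a replica set monitor."""

    def __init__(self):
        self._cond = threading.Condition()
        self._primary = None  # no known primary, e.g. mid-failover

    def set_primary(self, host):
        """Record a newly discovered primary and wake any waiters."""
        with self._cond:
            self._primary = host
            self._cond.notify_all()

    def find_host(self, max_time_ms):
        """Block until a primary is known or max_time_ms has elapsed."""
        deadline = time.monotonic() + max_time_ms / 1000.0
        with self._cond:
            while self._primary is None:
                remaining = deadline - time.monotonic()
                if remaining <= 0:
                    raise TimeoutError("no primary within max_time_ms")
                # Wakes early when set_primary() reports a state change.
                self._cond.wait(remaining)
            return self._primary
```

With this shape, callers do not need their own back-off loop: the wait is consolidated in one place and ends as soon as the monitor learns of a suitable host.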
| Comment by Andy Schwerin [ 03/Nov/15 ] |
|
Should we sleep, or wait on a state transition in the replset monitor? |