[SERVER-42613] getHashes should default to liveSlaves, not _slaves in replsettest.js Created: 02/Aug/19 Updated: 29/Oct/23 Resolved: 16/Sep/19 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Replication |
| Affects Version/s: | None |
| Fix Version/s: | 4.2.1, 4.3.1 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Matthew Russotto | Assignee: | Matthew Russotto |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | bkp | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Backwards Compatibility: | Fully Compatible | ||||||||
| Operating System: | ALL | ||||||||
| Backport Requested: | v4.2, v4.0 | ||||||||
| Sprint: | Repl 2019-08-26, Repl 2019-09-09, Repl 2019-09-23 | ||||||||
| Participants: | |||||||||
| Linked BF Score: | 17 | ||||||||
| Description |
|
The getHashes() method calls getPrimary() to refresh the list of slaves. However, no master may be electable at the time of the call, and there is no point in checking the hashes of a node that is down. Furthermore, if the caller supplies a list of slaves, getHashes() should not change this._master. So if no list of slaves is provided, getHashes() should call _determineLiveSlaves() instead of getPrimary(); if a list of slaves is provided, it should call neither method. |
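The intended control flow can be sketched roughly as follows. This is a minimal, self-contained mock and not the actual replsettest.js code: the class name ReplSetTestSketch, the node objects with up/isPrimary/hash fields, the call counters, and the single-argument getHashes(slaves) signature are all assumptions made for illustration. Only the names getHashes, getPrimary, _determineLiveSlaves, and _master come from the ticket.

```javascript
// Hypothetical stand-in for ReplSetTest; the real helper talks to mongod nodes.
class ReplSetTestSketch {
    constructor(nodes) {
        this.nodes = nodes;        // assumed shape: {name, up, isPrimary, hash}
        this._master = null;
        this.getPrimaryCalls = 0;  // counters exist only to observe behavior
        this.determineCalls = 0;
    }

    // Mock: records the call, sets this._master, and fails if no primary exists.
    getPrimary() {
        this.getPrimaryCalls++;
        const primary = this.nodes.find(n => n.up && n.isPrimary);
        if (!primary) {
            throw new Error("no primary available");
        }
        this._master = primary;
        return primary;
    }

    // Mock: returns the non-primary nodes that are up, without touching this._master.
    _determineLiveSlaves() {
        this.determineCalls++;
        return this.nodes.filter(n => n.up && !n.isPrimary);
    }

    // Behavior proposed in the ticket:
    //  - no list given: use the live slaves instead of calling getPrimary()
    //  - list given: call neither getPrimary() nor _determineLiveSlaves()
    getHashes(slaves) {
        if (slaves === undefined) {
            slaves = this._determineLiveSlaves();
        }
        return slaves.map(n => ({name: n.name, hash: n.hash}));
    }
}

// Usage: a 3-node set with one node down and no primary at the moment.
const nodes = [
    {name: "n0", up: true, isPrimary: false, hash: "abc"},
    {name: "n1", up: true, isPrimary: false, hash: "abc"},
    {name: "n2", up: false, isPrimary: false, hash: null},
];
const rst = new ReplSetTestSketch(nodes);
const liveHashes = rst.getHashes();              // skips the down node n2
const givenHashes = rst.getHashes([nodes[0]]);   // uses the caller's list as-is
```

In this mock, neither call goes through getPrimary(), so the absence of an electable master does not block the hash check and this._master is left untouched.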
| Comments |
| Comment by Githook User [ 03/Oct/19 ] |
|
Author: {'username': 'mtrussotto', 'email': 'matthew.russotto@mongodb.com', 'name': 'Matthew Russotto'} Message: (cherry picked from commit 81d2b80554331f1ca428138823d27cbf2a293c52) |
| Comment by Githook User [ 25/Sep/19 ] |
|
Author: {'username': 'mtrussotto', 'email': 'matthew.russotto@mongodb.com', 'name': 'Matthew Russotto'} Message: (cherry picked from commit 81d2b80554331f1ca428138823d27cbf2a293c52) |
| Comment by Githook User [ 16/Sep/19 ] |
|
Author: {'username': 'mtrussotto', 'email': 'matthew.russotto@mongodb.com', 'name': 'Matthew Russotto'} Message: |
| Comment by Matthew Russotto [ 02/Aug/19 ] |
|
This is a long-standing bug that has not occurred very often. If one node is already down when we stop a 3-node replica set, then using fsyncLock (as we do in CheckReplDBHashes) will prevent elections from happening, so if the current primary steps down for any reason (e.g. a slow machine), we'll get stuck. |