[SERVER-35571] Wait until all nodes become stable before checkOplogs Created: 13/Jun/18 Updated: 29/Oct/23 Resolved: 19/Jun/18 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Replication |
| Affects Version/s: | None |
| Fix Version/s: | 4.0.1, 4.1.1 |
| Type: | Task | Priority: | Major - P3 |
| Reporter: | Siyuan Zhou | Assignee: | Tess Avitabile (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||
| Backwards Compatibility: | Fully Compatible | ||||||||
| Backport Requested: | v4.0, v3.6 | ||||||||
| Sprint: | Repl 2018-07-02 | ||||||||
| Participants: | |||||||||
| Linked BF Score: | 15 | ||||||||
| Description |
|
checkOplogs calls awaitReplication() on the live nodes before the oplog check, but _callIsMaster() is called again and resets _liveNodes before the check runs, so the set of live nodes may have changed by then. One possible solution is to wait, at the very beginning of checkOplogs, until every node is in the Down, Primary, Secondary, or Arbiter state. |
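To illustrate the proposed wait, here is a minimal, self-contained sketch rather than the actual ReplSetTest change. It assumes the node states can be polled as an array of replSetGetStatus member state codes. The helper names waitForStableStates and liveNodeIndexes, the maxAttempts parameter, and the simulated polls are all hypothetical. The sketch takes the live node list from the stable snapshot, so it is computed only once no node is in a transitional state.

```javascript
// Member state codes as reported by replSetGetStatus.
const MemberState = {PRIMARY: 1, SECONDARY: 2, RECOVERING: 3, STARTUP2: 5, ARBITER: 7, DOWN: 8};

// The four states named in the description as safe to check against.
const STABLE_STATES = new Set([
    MemberState.PRIMARY,
    MemberState.SECONDARY,
    MemberState.ARBITER,
    MemberState.DOWN,
]);

// Hypothetical helper (not part of ReplSetTest): polls readStates() until every
// node reports a stable state, and throws after maxAttempts unsuccessful polls.
function waitForStableStates(readStates, maxAttempts = 10) {
    for (let attempt = 1; attempt <= maxAttempts; attempt++) {
        const states = readStates();
        if (states.every((state) => STABLE_STATES.has(state))) {
            return states;
        }
    }
    throw new Error(`nodes did not become stable after ${maxAttempts} polls`);
}

// Hypothetical helper: the live nodes are the ones that are stable and not Down.
function liveNodeIndexes(states) {
    return states
        .map((state, index) => ({state, index}))
        .filter(({state}) => state !== MemberState.DOWN)
        .map(({index}) => index);
}

// Simulated polls (assumption: node 1 finishes syncing while node 2 goes down).
const polls = [
    [MemberState.PRIMARY, MemberState.STARTUP2, MemberState.SECONDARY],
    [MemberState.PRIMARY, MemberState.RECOVERING, MemberState.SECONDARY],
    [MemberState.PRIMARY, MemberState.SECONDARY, MemberState.DOWN],
];
let pollCount = 0;
const readStates = () => polls[Math.min(pollCount++, polls.length - 1)];

// Wait first, then derive the live set once from the stable snapshot.
const stableStates = waitForStableStates(readStates);
const liveNodes = liveNodeIndexes(stableStates);
console.log("live node indexes:", liveNodes);
```

In a real test the poll would also need a delay and a timeout; both are omitted here to keep the sketch synchronous.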
| Comments |
| Comment by Githook User [ 03/Jul/18 ] |
|
Author: {'username': 'tessavitabile', 'name': 'Tess Avitabile', 'email': 'tess.avitabile@mongodb.com'} Message: (cherry picked from commit e7f212b876f8dc3e0b9aa740d55d97b781deb263) |
| Comment by Githook User [ 19/Jun/18 ] |
|
Author: {'username': 'tessavitabile', 'name': 'Tess Avitabile', 'email': 'tess.avitabile@mongodb.com'} Message: |
| Comment by Max Hirschhorn [ 14/Jun/18 ] |
|
spencer, in order to be able to run the dbhash check as part of ReplSetTest#stopSet() (see also |
| Comment by Spencer Brody (Inactive) [ 13/Jun/18 ] |
|
max.hirschhorn, do we run the repl set checkers when we expect nodes of the set to be down? If not, we could just call awaitReplication() with the full set of nodes in the replica set, rather than just the live nodes. If we expect to have to run this when nodes are down, then we need to do something to more rigorously confirm the set of nodes that are currently alive. |
| Comment by Max Hirschhorn [ 13/Jun/18 ] |
|
If we only wait in the checkOplogs() function, then isn't it possible that we'd run into the same problem if we ever decide to run the dbhash check before the oplog check?
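
One way to address this ordering concern is to make the wait a shared step that every checker goes through, so it does not matter which check runs first. The sketch below is only an illustration under that assumption. makeStableChecker, waitUntilStable, and the two "Sketch" checkers are hypothetical stand-ins, not ReplSetTest functions.

```javascript
// Hypothetical wrapper (not part of ReplSetTest): returns a function that always
// runs the shared wait step first and then hands the resulting node list to the
// wrapped checker.
function makeStableChecker(waitUntilStable, checker) {
    return function (...args) {
        const liveNodes = waitUntilStable();
        return checker(liveNodes, ...args);
    };
}

// Stand-ins for illustration only: record the order in which steps run.
const callLog = [];
const waitUntilStable = () => {
    callLog.push("wait");
    return ["node0", "node1"];
};

const checkOplogsSketch = makeStableChecker(waitUntilStable, (nodes) => {
    callLog.push("oplog:" + nodes.join(","));
    return true;
});
const checkDbHashesSketch = makeStableChecker(waitUntilStable, (nodes) => {
    callLog.push("dbhash:" + nodes.join(","));
    return true;
});

// Running the dbhash check before the oplog check still waits before each one.
checkDbHashesSketch();
checkOplogsSketch();
console.log(callLog);
```

The design choice here is that the wait belongs to the wrapper, not to any single check, so adding or reordering checkers cannot skip it.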