[SERVER-20671] step down should resend heartbeats if secondaries are not caught up Created: 28/Sep/15 Updated: 25/Jan/17 Resolved: 30/Sep/15 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Replication |
| Affects Version/s: | None |
| Fix Version/s: | 3.1.9 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Benety Goh | Assignee: | Benety Goh |
| Resolution: | Done | Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: |
| Backwards Compatibility: | Fully Compatible |
| Operating System: | ALL |
| Sprint: | RPL A (10/09/15) |
| Participants: |
| Linked BF Score: | 0 |
| Description |
|
When a primary is asked to step down without {force: true} and the secondaries are not caught up, it must wait for the previously scheduled heartbeats to run before it obtains updated liveness information on the secondaries and can complete the step down. With a long heartbeat interval, this may take a while. Restarting the heartbeats when the primary cannot step down immediately ensures that we get the most up-to-date information on the secondaries in the cluster. |
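To illustrate the timing argument in the description, here is a toy model, not the server's implementation. It assumes the primary learns a secondary's state only when a heartbeat round runs, and that restarting heartbeats means a round runs at the moment of the step down request. The function name and all parameters are hypothetical and exist only for this sketch.

```python
def step_down_completion_time(request_time, caught_up_time,
                              next_heartbeat_time, heartbeat_interval,
                              restart_heartbeats):
    """Return the time at which a toy primary can finish stepping down.

    Assumptions of this sketch (not taken from the server code):
    - all times are integers in the same unit (e.g. seconds);
    - next_heartbeat_time >= request_time;
    - the primary notices that the secondary is caught up only at a
      heartbeat round that runs at or after caught_up_time.
    """
    # With a restart, the first round runs right at the request;
    # otherwise the primary waits for the already scheduled round.
    first_round = request_time if restart_heartbeats else next_heartbeat_time

    if caught_up_time <= first_round:
        return first_round

    # Number of further rounds needed until one runs at or after
    # caught_up_time (integer ceiling division).
    gap = caught_up_time - first_round
    rounds = -(-gap // heartbeat_interval)
    return first_round + rounds * heartbeat_interval


# The secondary is already caught up at t=0, but the primary's view is
# stale and the next scheduled heartbeat is not due until t=8.
waiting = step_down_completion_time(0, 0, 8, 10, restart_heartbeats=False)
restarted = step_down_completion_time(0, 0, 8, 10, restart_heartbeats=True)
print(waiting, restarted)
```

In this model the gain comes from replacing stale information: when the secondary has in fact caught up but the primary has not heard so yet, the restarted round reveals it immediately instead of after the remainder of the heartbeat interval.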
| Comments |
| Comment by Githook User [ 01/Oct/15 ] |
|
Author: {u'username': u'benety', u'name': u'Benety Goh', u'email': u'benety@mongodb.com'}Message: |
| Comment by Githook User [ 30/Sep/15 ] |
|
Author: {u'username': u'benety', u'name': u'Benety Goh', u'email': u'benety@mongodb.com'} Message: This re-applies commit 3331d34e110f47b5ef27eff74c7c302483fcc8f9 and also fixes a race condition. |
| Comment by Githook User [ 30/Sep/15 ] |
|
Author: {u'username': u'benety', u'name': u'Benety Goh', u'email': u'benety@mongodb.com'}Message: |
| Comment by J Rassi [ 29/Sep/15 ] |
|
3331d34e introduced a hang in StepDownTest::StepDownCatchUp. On my desktop, I was able to reproduce this hang on 4/500 runs of repl_coordinator_impl_test when compiling against this commit, and was unable to reproduce this hang after 500 runs when compiling against the parent commit. See also recent hangs of the compile suite on Evergreen (task, task, task, task, task, task, task). I've reverted this commit above. benety.goh, please investigate when you get a chance. |
| Comment by Githook User [ 29/Sep/15 ] |
|
Author: {u'username': u'jrassi', u'name': u'Jason Rassi', u'email': u'rassi@10gen.com'}Message: Revert " This reverts commit 3331d34e110f47b5ef27eff74c7c302483fcc8f9. |
| Comment by Andy Schwerin [ 28/Sep/15 ] |
|
Please explain why in the description. |
| Comment by Githook User [ 28/Sep/15 ] |
|
Author: {u'username': u'benety', u'name': u'Benety Goh', u'email': u'benety@mongodb.com'}Message: |