[SERVER-47622] replSetReconfig.js should check ismaster before running the reconfig command Created: 17/Apr/20 Updated: 29/Oct/23 Resolved: 17/Apr/20 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Replication |
| Affects Version/s: | None |
| Fix Version/s: | 4.2.7, 4.4.0-rc2, 4.7.0 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Pavithra Vetriselvan | Assignee: | Pavithra Vetriselvan |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||||||
| Backwards Compatibility: | Fully Compatible | ||||||||||||
| Operating System: | ALL | ||||||||||||
| Backport Requested: | v4.4, v4.2 | ||||||||||||
| Sprint: | Repl 2020-04-20 | ||||||||||||
| Participants: | |||||||||||||
| Linked BF Score: | 42 | ||||||||||||
| Description |
|
replSetReconfig.js currently runs the ismaster command on a node but checks the primary field of the response. That field only indicates that the node has been elected. The check should instead use the ismaster field, which implies that the node is ready to accept writes. This change should be backported to all earlier affected versions. |
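To illustrate the distinction, here is a minimal sketch of the two checks against an ismaster response. It is not the code from the actual patch: the helper names `looksElected` and `isWritablePrimary`, the host names, and the sample response documents are all assumptions made for illustration.

```javascript
// Hypothetical helpers, not taken from replSetReconfig.js.
// `res` is the document returned by the ismaster command.

// Check based on the `primary` field: true as soon as the node
// reports itself as the elected primary, even if it cannot yet accept writes.
function looksElected(res) {
    return res.primary !== undefined && res.primary === res.me;
}

// Check based on the `ismaster` field: true only once the node
// is ready to accept writes.
function isWritablePrimary(res) {
    return res.ismaster === true;
}

// Illustrative responses (made-up hosts): a node that has been elected
// but is not yet writable, and the same node once it accepts writes.
const electedNotWritable = {ok: 1, ismaster: false, primary: "host1:27017", me: "host1:27017"};
const writable = {ok: 1, ismaster: true, primary: "host1:27017", me: "host1:27017"};

console.log(looksElected(electedNotWritable), isWritablePrimary(electedNotWritable));
console.log(looksElected(writable), isWritablePrimary(writable));
```

In a mongo shell test, a helper like this would presumably be fed the result of `conn.adminCommand({isMaster: 1})`, and the reconfig would only be issued once `isWritablePrimary` returns true.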
| Comments |
| Comment by Githook User [ 27/Apr/20 ] |
|
Author: Pavi Vetriselvan <pvselvan@umich.edu> (pvselvan) Message: (cherry picked from commit 33d4d522231ee132a4556d76ea0cb1c4c946dde1) |
| Comment by Githook User [ 20/Apr/20 ] |
|
Author: Pavi Vetriselvan <pvselvan@umich.edu> (pvselvan) Message: (cherry picked from commit 33d4d522231ee132a4556d76ea0cb1c4c946dde1) |
| Comment by Pavithra Vetriselvan [ 17/Apr/20 ] |
|
Note that |
| Comment by Githook User [ 17/Apr/20 ] |
|
Author: Pavi Vetriselvan <pvselvan@umich.edu> (pvselvan) Message: |
| Comment by Pavithra Vetriselvan [ 17/Apr/20 ] |
|
siyuan.zhou, oh, that's true. I think adding the primary check definitely caused this error to manifest more frequently. But we can still run into this in 4.2, as evidenced by BF-16958. According to the logs, we get the InterruptedDueToReplStateChange error when writing down the config document. This makes sense because storeLocalConfigDocument takes the DB lock in X mode, which means that the operation will be killed during step up? |
| Comment by Siyuan Zhou [ 17/Apr/20 ] |
|
pavithra.vetriselvan, do we need to backport this to 4.2? The code you changed is added in |