[SERVER-33899] Replica Node Never Becomes Secondary after Recovering Created: 15/Mar/18 Updated: 07/Apr/23 Resolved: 21/Jun/18 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Replication |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Question | Priority: | Major - P3 |
| Reporter: | Pratiksha Aggarwal | Assignee: | Kelsey Schubert |
| Resolution: | Done | Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Attachments: |
|
| Participants: |
| Description |
|
Configuration: MongoDB replica set with three nodes: Primary, Secondary1, Secondary2. When we shut down the Secondary1 node and do not bring it back up for a long time (say, more than 10 minutes), the output of rs.status() shows Secondary1 in the RECOVERING state. It does not return to the SECONDARY state until we perform the manual resync steps (emptying the data directory and restarting the process). We referred to https://jira.mongodb.org/browse/SERVER-26360 but could not find the exact RCA or fix. |
| Comments |
| Comment by Kelsey Schubert [ 05/Apr/18 ] |
|
Hi pragupta, It looks like your configured oplog size is very small, which is likely resulting in this issue. I'd recommend increasing your oplog size as I suspect that this may resolve this issue. Please note that the SERVER project is for reporting bugs or feature suggestions for the MongoDB server. For MongoDB-related support discussion please post on the mongodb-user group or Stack Overflow with the mongodb tag. A question like this involving more discussion would be best posted on the mongodb-users group. See also our Technical Support page for additional support resources. Kind regards, |
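As a rough illustration of how a larger oplog size might be chosen, the sketch below scales the observed write rate to a target window. It assumes an input object with the `usedMB` and `timeDiffHours` field names reported by `db.getReplicationInfo()` in the mongo shell; the function name `estimateOplogSizeMB`, the 15-minute span, and the 24-hour target are hypothetical and not taken from this ticket. Only the 128 MB figure matches the size reported in the thread.

```javascript
// Hypothetical helper: estimates the oplog size (in MB) needed to retain
// `targetWindowHours` of writes at the currently observed write rate.
// `info` is assumed to carry `usedMB` and `timeDiffHours`, the field names
// reported by db.getReplicationInfo() in the mongo shell.
function estimateOplogSizeMB(info, targetWindowHours) {
  if (!(info.timeDiffHours > 0)) {
    throw new Error("timeDiffHours must be positive to estimate a write rate");
  }
  // Observed oplog growth rate in MB per hour.
  const mbPerHour = info.usedMB / info.timeDiffHours;
  // Round up so the estimate never undershoots the target window.
  return Math.ceil(mbPerHour * targetWindowHours);
}

// Illustrative numbers only: a full 128 MB oplog that spans 15 minutes.
const observed = { usedMB: 128, timeDiffHours: 0.25 };
console.log(estimateOplogSizeMB(observed, 24)); // 12288
```

The estimate is only as good as the sampled write rate, so it is best treated as a lower bound and padded for peak load.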
| Comment by Pratiksha Aggarwal [ 28/Mar/18 ] |
|
Hi, Is there any update on this? Please look into this as a priority. |
| Comment by Pratiksha Aggarwal [ 21/Mar/18 ] |
|
Below is the output of rs.printReplicationInfo(): configured oplog size: 128MB. Also, as far as I have seen, every time we shut down a node it never returns to the SECONDARY state automatically. It always goes into the RECOVERING state. |
| Comment by Kelsey Schubert [ 20/Mar/18 ] |
|
Hi pragupta, Thanks for the logs. It appears that after restarting, the node has fallen off the oplog and can no longer be resynced. To continue to investigate, would you please provide the output of rs.printReplicationInfo() so we can determine the size of the oplog window? Thank you, |
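The "fallen off the oplog" condition can be sketched as a simple timestamp comparison. This is a minimal illustration, assuming the stopped node's last applied operation time (for example the `optimeDate` shown per member in `rs.status()`) and the oldest entry in the sync source's oplog (for example `tFirst` from `db.getReplicationInfo()`) are available as dates. The function name `isTooStale` and the timestamps are hypothetical and not taken from the attached logs.

```javascript
// Hypothetical helper: a node is too stale to catch up when the last
// operation it applied is older than the oldest entry still present in
// its sync source's oplog.
function isTooStale(lastAppliedDate, sourceOldestOplogDate) {
  return lastAppliedDate.getTime() < sourceOldestOplogDate.getTime();
}

// Illustrative timestamps only.
const lastApplied = new Date("2018-03-15T10:00:00Z");  // stopped node
const sourceOldest = new Date("2018-03-15T10:08:00Z"); // sync source
console.log(isTooStale(lastApplied, sourceOldest)); // true
```

With a short oplog window, even a brief shutdown can push the sync source's oldest entry past the stopped node's last applied operation, which would be consistent with the node staying in RECOVERING.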
| Comment by Pratiksha Aggarwal [ 19/Mar/18 ] |
|
Hi |
| Comment by Pratiksha Aggarwal [ 16/Mar/18 ] |
|
MongoDB version - 3.2.11 Output of rs.status :
logs_of_node2.txt Scenario: |
| Comment by Ramon Fernandez Marina [ 15/Mar/18 ] |
|
Can you also please provide:
Thanks, |