[SERVER-44981] Primary replica set node exited abnormally because of InterruptedDueToReplStateChange Created: 06/Dec/19 Updated: 23/Dec/19 Resolved: 23/Dec/19 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Replication |
| Affects Version/s: | 4.0.2 |
| Fix Version/s: | None |
| Type: | Question | Priority: | Major - P3 |
| Reporter: | 肖 刘 | Assignee: | Dmitry Agranat |
| Resolution: | Incomplete | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Attachments: |
|
| Backwards Compatibility: | Fully Compatible |
| Participants: |
| Description |
|
I have a cluster of two machines; the primary and secondary nodes of the replica set are located on different machines. The MongoDB version is 4.0.2 and the OS is Ubuntu 16.04.6 LTS. One primary replica set node exited abnormally, and the secondary node failed to step up as the primary. The log from the primary replica set node when it exited is below:
I have searched the issue list. Is my problem related to SERVER-34661, and how can I prevent the InterruptedDueToReplStateChange error from happening? The cluster cannot provide services if there is no primary node in the replica set. |
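A note on the election failure: with only two data-bearing, voting members, a replica set cannot elect a primary once one member is down, because the surviving member holds just one of two votes and cannot reach a majority. A minimal sketch of one way to restore election capacity, assuming a hypothetical third host 192.168.0.99 (the port is taken from the reporter's follow-up comment; run this against the current primary):

    # Add an arbiter on a third machine so elections can reach a majority.
    # Host and port are hypothetical; adjust to the actual deployment.
    mongo --port 27003 --eval 'rs.addArb("192.168.0.99:27004")'

A third data-bearing member added with rs.add() would serve the same purpose while also providing data redundancy.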
| Comments |
| Comment by Dmitry Agranat [ 23/Dec/19 ] |
|
We haven’t heard back from you for some time, so I’m going to mark this ticket as resolved. If this is still an issue for you, please provide additional information and we will reopen the ticket. Regards, |
| Comment by Dmitry Agranat [ 08/Dec/19 ] |
|
The information provided is not enough to determine the root cause of the issue. The previously requested rs.status() output can be retrieved from the secondary by executing rs.slaveOk() and then rs.status(). In addition, please archive (tar or zip) the mongod.log files and the $dbpath/diagnostic.data directory (the contents are described here) from all members of the replica set in question and upload them to this support uploader location. Files uploaded to this portal are visible only to MongoDB employees and are routinely deleted after some time. Thanks, |
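A minimal sketch of those two steps, assuming the shard3 port from the reporter's follow-up and hypothetical log and dbpath locations (adjust both to the actual deployment):

    # 1. Capture rs.status() from the secondary.
    mongo --port 27003 --eval 'rs.slaveOk(); printjson(rs.status())' > rs_status.txt

    # 2. Archive mongod.log and the diagnostic.data directory for upload.
    #    /var/log/mongodb and /data/shard3 are assumed paths.
    tar czvf shard3-diagnostics.tar.gz /var/log/mongodb/mongod.log /data/shard3/diagnostic.data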
| Comment by 肖 刘 [ 06/Dec/19 ] |
|
I have attached a new log from the secondary node to give you more information. For a better understanding of the log, note that each machine in the cluster runs five mongo processes: mongos on port 27010, the config server on port 27011, shard1 on port 27001, shard2 on port 27002, and shard3 on port 27003. The primary node of shard3, on the 192.168.0.87 machine, exited abnormally; shard3's secondary node is on the 192.168.0.178 machine. If you need more info, please tell me. |
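For reference, a quick way to check each shard3 member's current role, using the hosts and ports from the comment above (db.isMaster() reports whether the queried member is currently primary or secondary):

    # Query each shard3 member for its current replica set role.
    mongo --host 192.168.0.87  --port 27003 --eval 'printjson(db.isMaster())'
    mongo --host 192.168.0.178 --port 27003 --eval 'printjson(db.isMaster())'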
| Comment by 肖 刘 [ 06/Dec/19 ] |
|
Sorry, I did not run the rs.status() command at the time. It happened yesterday after work and I needed to recover the cluster quickly. When the error occurred I logged in to the secondary replica set member with the mongo --port 27003 command, but its role remained secondary. So I restarted the primary node and forgot to capture more info. I can try to find some logs on the secondary node. |
| Comment by Dmitry Agranat [ 06/Dec/19 ] |
|
Thanks for the report. Could you also attach the output from the rs.status() command? Thanks, |