[SERVER-15417] Arbiter didn't elect primary if OS is unreachable (except ping) Created: 26/Sep/14 Updated: 23/Jan/15 Resolved: 23/Jan/15 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Stability |
| Affects Version/s: | 2.6.3 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Xavier Vdb | Assignee: | Ramon Fernandez Marina |
| Resolution: | Duplicate | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||
| Operating System: | ALL | ||||||||
| Participants: | |||||||||
| Description |
|
My VMs are hosted on ESX. The virtual machine that hosts my master is unreachable (ack fail). MongoDB shell version: 2.6.3. My master from the secondary:
I can't even force the master! (I can't connect over SSH to modify the priority.) I'm stuck. |
| Comments |
| Comment by Ramon Fernandez Marina [ 23/Jan/15 ] | |||||||||||||||||||||
|
xavier.vdb@gmail.com, the scenario you describe is contained in Regards, | |||||||||||||||||||||
| Comment by Xavier Vdb [ 02/Dec/14 ] | |||||||||||||||||||||
|
hey Ramon "did the database files became read-only?" Yes, journal + data | |||||||||||||||||||||
| Comment by Ramon Fernandez Marina [ 02/Dec/14 ] | |||||||||||||||||||||
|
Apologies for the late response, xavier.vdb@gmail.com. I think the behavior you're observing is very similar to the one described in tickets like You mention an NFS mount that went read-only; was MongoDB hosted there? In other words, did the database files become read-only? Or was this NFS mount used for something else, but was using the "hard" option? | |||||||||||||||||||||
| Comment by Xavier Vdb [ 01/Oct/14 ] | |||||||||||||||||||||
|
I have new information: the NFS mount has been switched to read-only. | |||||||||||||||||||||
| Comment by Xavier Vdb [ 29/Sep/14 ] | |||||||||||||||||||||
|
log secondary:
2014-09-26T15:56:34.152+0200 [rsHealthPoll] can't authenticate to X.X.X.X_crashed:27017 (X.X.X.X) failed as internal user, error: DBClientBase::findN: transport error: X.X.X.X_crashed:27017 ns: local.$cmd query: { getnonce: 1 }
2014-09-26T15:56:44.154+0200 [rsHealthPoll] DBClientCursor::init call() failed
log arbiter (same traces as secondary...):
2014-09-26T15:56:17.520+0200 [rsHealthPoll] DBClientCursor::init call() failed
...
No logs on the master just before crash | |||||||||||||||||||||
| Comment by Xavier Vdb [ 29/Sep/14 ] | |||||||||||||||||||||
| |||||||||||||||||||||
| Comment by Ramon Fernandez Marina [ 26/Sep/14 ] | |||||||||||||||||||||
|
xavier.vdb@gmail.com, can you please provide the details of your replica set? Number of members, types, and priorities if applicable. Also, can you upload logs for all members going at least as far back as the moment when the master became unresponsive? Thanks, |