[SERVER-20711] Fatal assertion in secondary when sync failed due to network problems Created: 01/Oct/15 Updated: 10/Feb/16 Resolved: 01/Oct/15 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Replication |
| Affects Version/s: | 3.0.6 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | samuel charron | Assignee: | Unassigned |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||
| Operating System: | ALL | ||||||||
| Participants: | |||||||||
| Description |
|
We have a three-machine MongoDB setup: one primary and two secondaries. Setup: On all machines: mongod version v3.0.6-rc0. Log:
|
| Comments |
| Comment by samuel charron [ 02/Oct/15 ] | ||||
|
@ramon.fernandez, will you reopen this ticket? | ||||
| Comment by samuel charron [ 01/Oct/15 ] | ||||
|
So mongod does not support temporary network problems? #thisIsNotHowItShouldWork | ||||
| Comment by Ramon Fernandez Marina [ 01/Oct/15 ] | ||||
|
The server did not crash: it deliberately shut down (that's what the fatal assertion means) when, due to external factors (networking problems), it could not complete the operations (synchronizing from another node) required to remain part of this replica set. I'll adjust the title of the ticket to avoid confusion. Regards, | ||||
| Comment by samuel charron [ 01/Oct/15 ] | ||||
|
Are you saying that a crash is not a bug? | ||||
| Comment by Ramon Fernandez Marina [ 01/Oct/15 ] | ||||
|
Thanks for the additional information samuel.charron, and glad to hear this secondary is back in business. It seems clear that the issues you're running into are due to problems with the network and not to a bug in mongod, so I'm going to close this ticket. Please note that the SERVER project is for reporting bugs or feature suggestions for the MongoDB server. For MongoDB-related support discussion please post on the mongodb-user group or Stack Overflow with the mongodb tag, where your question will reach a larger audience. A question like this involving more discussion would be best posted on the mongodb-user group. See also our Technical Support page for additional support resources. Regards, | ||||
| Comment by samuel charron [ 01/Oct/15 ] | ||||
|
After restarting secondary-1, it had trouble resynchronizing, with many log lines like this:
2015-10-01T11:04:05.691+0200 I NETWORK [rsBackgroundSync] Socket recv() timeout 1.2.3.4:27017
But eventually that stopped, and the secondary is now running OK without trouble. The full log does not look very interesting. It shows a line like this every 15 seconds (alternating between primary and secondary-2 connections):
2015-09-29T15:56:29.514+0200 I NETWORK [initandlisten] connection accepted from 1.2.3.4:57700 #62819 (11 connections now open) | ||||
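For anyone wanting to check how often such timeouts occur, here is a minimal Python sketch that counts `Socket recv() timeout` messages per remote host. It assumes the log line layout seen in the two sample lines above (timestamp, severity letter, component, bracketed context, message); the regular expressions and the function name `count_recv_timeouts` are illustrative and not part of any MongoDB tooling.

```python
import re
from collections import Counter

# Assumed layout, based on the sample lines in this ticket:
# <timestamp> <severity> <component> [<context>] <message>
LOG_LINE = re.compile(
    r"^(?P<ts>\S+)\s+(?P<severity>[A-Z])\s+(?P<component>\S+)\s+"
    r"\[(?P<context>[^\]]+)\]\s+(?P<message>.*)$"
)
# Matches the timeout message and captures the remote host:port.
RECV_TIMEOUT = re.compile(r"Socket recv\(\) timeout\s+(?P<host>\S+)")


def count_recv_timeouts(lines):
    """Return a Counter mapping remote host:port to number of recv() timeouts."""
    counts = Counter()
    for line in lines:
        parsed = LOG_LINE.match(line.strip())
        if not parsed or parsed.group("component") != "NETWORK":
            continue  # skip lines that do not fit the assumed layout
        timeout = RECV_TIMEOUT.search(parsed.group("message"))
        if timeout:
            counts[timeout.group("host")] += 1
    return counts


# The two log lines quoted in the comment above.
sample = [
    "2015-10-01T11:04:05.691+0200 I NETWORK [rsBackgroundSync] "
    "Socket recv() timeout 1.2.3.4:27017",
    "2015-09-29T15:56:29.514+0200 I NETWORK [initandlisten] "
    "connection accepted from 1.2.3.4:57700 #62819 (11 connections now open)",
]
timeouts = count_recv_timeouts(sample)
```

To run it against a real log, pass an open file object instead of `sample`, for example `count_recv_timeouts(open(path))`, where `path` is wherever mongod writes its log on that machine.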
| Comment by Ramon Fernandez Marina [ 01/Oct/15 ] | ||||
|
samuel.charron, the following lines in the log:
indicate that this node could not talk to secondary-2 to sync. This could have been caused by a transient network problem, for example. What happens when you restart secondary-1? Does the problem happen again? Can you send us full logs for secondary-1 so we can take a closer look? Thanks, |