[SERVER-14647] killed by SEGV signal Created: 22/Jul/14 Updated: 10/Dec/14 Resolved: 15/Aug/14 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Stability |
| Affects Version/s: | 2.6.3 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Critical - P2 |
| Reporter: | Joel Moss | Assignee: | J Rassi |
| Resolution: | Cannot Reproduce | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Attachments: |
|
| Operating System: | ALL |
| Participants: |
| Description |
|
We have a 3-member replica set, and since upgrading to 2.6.3 a few weeks ago, at least two of the three members have experienced this error on two separate occasions:
A mongo restart fixes it. The mongo logs show nothing between the time this error appeared in the syslog and the time we restarted mongo.
This has happened too many times now. Can anyone help, please, or give me some ideas as to where to look for more data? thx |
| Comments |
| Comment by J Rassi [ 15/Aug/14 ] |
|
I haven't heard back in some time, so I'm resolving this ticket as "cannot reproduce". Please re-open the ticket if you have since been able to reproduce this issue with the verbose log. |
| Comment by J Rassi [ 25/Jul/14 ] |
|
jmoss@codio.com: just checking in – have you encountered the issue since adding "-v"? |
| Comment by J Rassi [ 22/Jul/14 ] |
|
Noted. I suppose we'll wait and see if the verbose server log and/or mongostat.log contain any leads. |
| Comment by Joel Moss [ 22/Jul/14 ] |
|
It's definitely not RAM. I have attached the RAM usage over the last 24 hours. This is the upstart script:
|
| Comment by Joel Moss [ 22/Jul/14 ] |
|
RAM usage |
| Comment by J Rassi [ 22/Jul/14 ] |
|
Thanks. Don't see any smoking gun yet. Can you also run "mongostat > mongostat.log" in the background on this machine, and upload the contents of this file after observing the crash once more? I'd like to see if the crash is correlated to high memory usage (perhaps it's a NULL dereference after an out-of-memory condition, which could explain the lack of crash-related log output – your mongod startup script doesn't modify /proc/<pid>/oom_adj, does it?), or a spike of a certain type of operation, etc. |
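A minimal sketch for checking the OOM question above, assuming a Linux host and Python 3. The helper name `read_oom_settings` and the PID in the usage note are hypothetical and are not part of MongoDB or the original report. It reads `/proc/<pid>/oom_adj` (and the newer `oom_score_adj`) for a running process and reports `None` for any file that is missing or unreadable.

```python
from pathlib import Path


def read_oom_settings(pid, proc_root="/proc"):
    """Return the OOM-killer tunables of a process as a dict.

    proc_root is a parameter only so the function can be pointed at a
    test directory; on a real host the default "/proc" is what you want.
    """
    settings = {}
    for name in ("oom_adj", "oom_score_adj"):
        path = Path(proc_root) / str(pid) / name
        try:
            settings[name] = int(path.read_text().strip())
        except (OSError, ValueError):
            # File absent, unreadable, or not an integer.
            settings[name] = None
    return settings
```

For example, call `read_oom_settings(1234)` with mongod's actual PID in place of 1234. On kernels that still expose `oom_adj`, a value of -17 means the OOM killer is disabled for that process.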
| Comment by Joel Moss [ 22/Jul/14 ] |
|
Attached dmesg (where we saw the segfault) and syslog. Also, this just happened again. I have since appended -v to mongod, so we will see more if it happens again. thx |
| Comment by J Rassi [ 22/Jul/14 ] |
|
A couple more requests for information:
Thanks. |
| Comment by Joel Moss [ 22/Jul/14 ] |
|
mongo log file |
| Comment by J Rassi [ 22/Jul/14 ] |
|
I understand that you mentioned in the original ticket description that the log does not contain output at the time of crash. That being said, note that the log still provides a wealth of valuable context (startup information, warnings, replica set election information, an idea of what "normal usage" looks like), so please do reconsider my request for uploading it to the ticket. The log of the member that crashed while in state primary would be the most helpful of the three. In addition:
|
| Comment by Joel Moss [ 22/Jul/14 ] |
thx |
| Comment by J Rassi [ 22/Jul/14 ] |
|
Hi, I'll need additional information to further diagnose this issue:
Thanks. ~ Jason Rassi |