[SERVER-30788] Mongodb 3.4.4 - Invalid Access at Address - Signal: 11 Segmentation fault Created: 23/Aug/17  Updated: 05/Sep/19  Resolved: 05/Sep/19

Status: Closed
Project: Core Server
Component/s: WiredTiger
Affects Version/s: 3.4.4
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Jonason Ho Assignee: Keith Bostic (Inactive)
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: Text File server1_0802_crash_log.txt     Text File server1_0817_crash_log.txt     Text File server2_0713_crash_log.txt     Text File server2_0811_crash_log.txt    
Operating System: ALL
Sprint: Storage 2017-09-11
Participants:
Case:

 Description   

We have a 2-server replica set. We've been having an issue where mongod crashes after running for a few weeks. Both servers have experienced this issue.



 Comments   
Comment by Keith Bostic (Inactive) [ 23/Aug/17 ]

I think this is likely memory corruption. We should review the system logs to see if there are any reported problems, and perhaps check for heat problems in the data center.

I suspect bad memory because:

  • there are 3 different stacks, and all of them are in heavily exercised WiredTiger code paths,
  • these are all unique failures for the MongoDB release (we haven't had any other customer or user experience segmentation faults in these paths), and
  • although it's difficult to prove, the reference that's likely failing shouldn't have been the first reference to that memory location in the particular code path.

anonymous.user notes these failures are from different servers, which makes hardware failure less likely, but regardless, I think this is unlikely to be a software bug.
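
As a starting point for the system log review suggested above, a scan along these lines might help. This is only a rough sketch: it assumes Linux hosts with kernel messages in a text log, the helper name find_hardware_error_lines is hypothetical, and the marker strings are common kernel messages for machine-check, ECC, and thermal events rather than an exhaustive list.

```python
import re

# Assumed markers: strings Linux kernels commonly log for memory/CPU faults
# and thermal throttling. Adjust for the actual platform; not exhaustive.
HARDWARE_ERROR_PATTERN = re.compile(
    r"machine check|\bmce\b|\bEDAC\b|hardware error|temperature above threshold",
    re.IGNORECASE,
)


def find_hardware_error_lines(lines):
    """Return (line_number, text) pairs for lines that look like hardware error reports."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if HARDWARE_ERROR_PATTERN.search(line):
            hits.append((number, line.rstrip("\n")))
    return hits
```

The function accepts any iterable of lines, so an open log file can be passed in directly. Hits dated shortly before each crash on both servers would support the hardware theory; a clean log does not rule it out.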

Comment by Alexander Gorrod [ 23/Aug/17 ]

keith.bostic Could you take a look at this please? I'm not sure if a NULL is possible for cell. This code path passes a NULL in for the length check, but that comes via the WT_ADDR structure in __wt_ref_info that can't be NULL in this code path.

Generated at Thu Feb 08 04:25:01 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.