[SERVER-38358] MongoDB crashing with Fatal Assertion 28558 Created: 03/Dec/18  Updated: 06/Dec/22  Resolved: 06/Dec/18

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: 3.2.21
Fix Version/s: None

Type: Question Priority: Major - P3
Reporter: Joseph Varghese Assignee: Backlog - Triage Team
Resolution: Done Votes: 0
Labels: Bug
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Test


Assigned Teams:
Server Triage
Participants:

 Description   

MongoDB crashed after reading a few thousand records; the stack trace is below. It doesn't seem to be an issue with space left on the device, as 964G of space is still available (see the details after the stack trace). After restarting MongoDB it worked fine, so what caused this issue? What measures can we take to prevent it from happening again?

 

2018-11-30T06:12:23.011+0000 W FTDC [ftdc] Uncaught exception in 'FileStreamFailed: Failed to write to interim file buffer for full-time diagnostic data capture: /var/lib/mongo/diagnostic.data/metrics.interim.temp' in full-time diagnostic data capture subsystem. Shutting down the full-time diagnostic data capture subsystem.
2018-11-30T06:12:41.314+0000 E STORAGE [thread1] WiredTiger (28) [1543558361:314723][4237:0x7ff2cbbe8700], [file:index-429-232680150939889745.wt|file://index-429-232680150939889745.wt/], WT_SESSION.checkpoint: /var/lib/mongo/index-429-232680150939889745.wt: handle-write: pwrite: failed to write 4096 bytes at offset 405504: No space left on device
2018-11-30T06:12:41.314+0000 E STORAGE [thread1] WiredTiger (28) [1543558361:314771][4237:0x7ff2cbbe8700], [file:index-429-232680150939889745.wt|file://index-429-232680150939889745.wt/], WT_SESSION.checkpoint: index-429-232680150939889745.wt: fatal checkpoint failure: No space left on device
2018-11-30T06:12:41.314+0000 E STORAGE [thread1] WiredTiger (-31804) [1543558361:314781][4237:0x7ff2cbbe8700], [file:index-429-232680150939889745.wt|file://index-429-232680150939889745.wt/], WT_SESSION.checkpoint: the process must exit and restart: WT_PANIC: WiredTiger library panic
2018-11-30T06:12:41.314+0000 I - [thread1] Fatal Assertion 28558
2018-11-30T06:12:41.314+0000 I - [thread1]
 
***aborting after fassert() failure
 
2018-11-30T06:12:41.349+0000 F - [thread1] Got signal: 6 (Aborted).
 
0x133e5c2 0x133d4e9 0x133dcf2 0x7ff2d80b3100 0x7ff2d7d175f7 0x7ff2d7d18ce8 0x12ba9e2 0x109b613 0x1ac3438 0x1ac3635 0x1ac3803 0x19e7f71 0x19e394a 0x1a03521 0x1a9d0da 0x1aa45ff 0x1a1c108 0x1ad0280 0x1ad0538 0x1acef4a 0x1ad17da 0x1ad227b 0x1abe250 0x1a3ab2d 0x7ff2d80abdc5 0x7ff2d7dd8c9d
----- BEGIN BACKTRACE -----
{"backtrace":[
{"b":"400000","o":"F3E5C2","s":"_ZN5mongo15printStackTraceERSo"},
{"b":"400000","o":"F3D4E9"},
{"b":"400000","o":"F3DCF2"},
{"b":"7FF2D80A4000","o":"F100"},
{"b":"7FF2D7CE2000","o":"355F7","s":"gsignal"},
{"b":"7FF2D7CE2000","o":"36CE8","s":"abort"},
{"b":"400000","o":"EBA9E2","s":"_ZN5mongo13fassertFailedEi"},
{"b":"400000","o":"C9B613"},
{"b":"400000","o":"16C3438","s":"__wt_eventv"},
{"b":"400000","o":"16C3635","s":"__wt_err"},
{"b":"400000","o":"16C3803","s":"__wt_panic"},
{"b":"400000","o":"15E7F71","s":"__wt_block_panic"},
{"b":"400000","o":"15E394A","s":"__wt_block_checkpoint"},
{"b":"400000","o":"1603521","s":"__wt_bt_write"},
{"b":"400000","o":"169D0DA"},
{"b":"400000","o":"16A45FF","s":"__wt_reconcile"},
{"b":"400000","o":"161C108","s":"__wt_cache_op"},
{"b":"400000","o":"16D0280"},
{"b":"400000","o":"16D0538"},
{"b":"400000","o":"16CEF4A"},
{"b":"400000","o":"16D17DA"},
{"b":"400000","o":"16D227B","s":"__wt_txn_checkpoint"},
{"b":"400000","o":"16BE250"},
{"b":"400000","o":"163AB2D"},
{"b":"7FF2D80A4000","o":"7DC5"},
{"b":"7FF2D7CE2000","o":"F6C9D","s":"clone"}
],"processInfo":{ "mongodbVersion" : "3.2.21", "gitVersion" : "1ab1010737145ba3761318508ff65ba74dfe8155", "compiledModules" : [], "uname" : { "sysname" : "Linux", "release" : "4.4.23-31.54.amzn1.x86_64", "version" : "#1 SMP Tue Oct 18 22:02:09 UTC 2016", "machine" : "x86_64" }, "somap" : [ { "elfType" : 2, "b" : "400000",
2018-11-30T22:49:47.063+0000 I CONTROL [main] ***** SERVER RESTARTED *****
2018-11-30T22:49:47.069+0000 I CONTROL [initandlisten] MongoDB starting : pid=6278 port=27017 dbpath=/var/lib/mongo 64-bit host=ip-192-168-200-110
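
For readers decoding the excerpt above, here is a minimal sketch of two lookups, assuming a Linux host (as the `uname` block indicates). The `28` in the `WiredTiger (28)` lines looks like an operating-system errno value, and each backtrace frame's absolute address is its load base `b` plus its offset `o`. The helper names `describe_errno` and `frame_address` are hypothetical and not part of MongoDB or WiredTiger.

```python
import errno
import os

# Error code copied from the "WiredTiger (28)" log lines (assumed to be an OS errno).
WT_ERROR_CODE = 28


def describe_errno(code: int) -> str:
    """Return 'NAME: message' for an OS error number on the current platform."""
    name = errno.errorcode.get(code, "UNKNOWN")
    return f"{name}: {os.strerror(code)}"


def frame_address(base_hex: str, offset_hex: str) -> int:
    """Absolute address of a backtrace frame: load base 'b' plus offset 'o'."""
    return int(base_hex, 16) + int(offset_hex, 16)


if __name__ == "__main__":
    print(describe_errno(WT_ERROR_CODE))
    # The __wt_panic frame from the backtrace: b=400000, o=16C3803.
    print(hex(frame_address("400000", "16C3803")))
```

The computed address for the `__wt_panic` frame matches the `0x1ac3803` entry in the raw address list printed before the backtrace, which is one way to check that a frame and an address belong together.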

 

Output of df -H

Filesystem      Size  Used  Avail  Use%  Mounted on
devtmpfs         65G   66k    65G    1%  /dev
tmpfs            65G     0    65G    0%  /dev/shm
/dev/xvda1      317G  250G    68G   79%  /
/dev/xvdb       4.3T  3.1T   964G   76%  /data
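
One thing worth checking: the startup line in the log reports dbpath=/var/lib/mongo, and the df output lists no separate mount for that directory, so the data files appear to sit on / (/dev/xvda1, 68G available) rather than on /data (964G available). A "No space left on device" error can also come from inode exhaustion, not only from full data blocks. The sketch below checks both for whichever filesystem holds a given path. It is an illustration only: the function names and the 15% threshold are assumptions, not MongoDB recommendations.

```python
import os
import shutil


def free_space_report(path: str) -> dict:
    """Report free bytes and free inodes for the filesystem that holds `path`."""
    usage = shutil.disk_usage(path)   # total / used / free, in bytes
    stats = os.statvfs(path)          # POSIX only; f_favail = inodes free for non-root users
    return {
        "path": path,
        "free_bytes": usage.free,
        "free_fraction": usage.free / usage.total if usage.total else 0.0,
        "free_inodes": stats.f_favail,
    }


def is_low(report: dict, min_free_fraction: float = 0.15) -> bool:
    """True when the free fraction is below the (hypothetical) alert threshold."""
    return report["free_fraction"] < min_free_fraction


if __name__ == "__main__":
    # Assumed path: the dbpath from the startup log line; fall back to / if it is absent.
    dbpath = "/var/lib/mongo" if os.path.exists("/var/lib/mongo") else "/"
    report = free_space_report(dbpath)
    print(report, "LOW" if is_low(report) else "ok")
```

Running a check like this periodically against the dbpath, instead of looking at the largest volume, would show which filesystem the checkpoint writes actually land on.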



 Comments   
Comment by Ramon Fernandez Marina [ 06/Dec/18 ]

josephvarghesep, the log snippet you provided has an incomplete backtrace, so it's difficult to find more details about what's happening. That being said, 3.2 is no longer a supported version, so I'd encourage you to upgrade to 3.4 or later and report back if the issue still persists.

Thanks,
Ramón.

Generated at Thu Feb 08 04:48:44 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.