[SERVER-28601] Mongo start aborting after fassert() failure Created: 04/Apr/17  Updated: 31/May/17  Resolved: 04/Apr/17

Status: Closed
Project: Core Server
Component/s: WiredTiger
Affects Version/s: 3.2.4
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Sasanka Uppu [X] Assignee: Kelsey Schubert
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: HTML File WiredTiger, File WiredTiger.lock, File WiredTiger.turtle, File WiredTiger.wt, File WiredTigerLAS.wt, File WiredTigerLog.0000005020, File sizeStorer.wt, File storage.bson
Operating System: ALL
Participants:

 Description   

Detected data files in /Volumes/SeagateBackupPlusDrive/Mongodb/mongodb created by the 'wiredTiger' storage engine, so setting the active storage engine to 'wiredTiger'.
2017-04-04T00:00:51.812-0400 I STORAGE  [initandlisten] wiredtiger_open config: create,cache_size=9G,session_max=20000,eviction=(threads_max=4),config_base=false,statistics=(fast),log=(enabled=true,archive=true,path=journal,compressor=snappy),file_manager=(close_idle_time=100000),checkpoint=(wait=60,log_size=2GB),statistics_log=(wait=0),
2017-04-04T00:00:51.859-0400 E STORAGE  [initandlisten] WiredTiger (-31802) [1491278451:859327][15050:0x7fff7d3d1310], connection: log file journal/WiredTigerLog.0000005020 corrupted: Bad magic number 0: WT_ERROR: non-specific WiredTiger error
2017-04-04T00:00:51.859-0400 E STORAGE  [initandlisten] WiredTiger (-31804) [1491278451:859385][15050:0x7fff7d3d1310], connection: the process must exit and restart: WT_PANIC: WiredTiger library panic
2017-04-04T00:00:51.859-0400 I -        [initandlisten] Fatal Assertion 28558
2017-04-04T00:00:51.859-0400 I -        [initandlisten] 
 
***aborting after fassert() failure
 
 
2017-04-04T00:00:51.864-0400 F -        [initandlisten] Got signal: 6 (Abort trap: 6).
 
----- BEGIN BACKTRACE -----



 Comments   
Comment by Kelsey Schubert [ 04/Apr/17 ]

Hi Sasanka_uppu,

I'm glad you were able to restart mongod successfully. Please note that the SERVER project is for reporting bugs or feature suggestions for the MongoDB server. For MongoDB-related support discussion, please post on the mongodb-user group or Stack Overflow with the mongodb tag. A question like this, which involves more discussion, would be best posted on the mongodb-user group.

Kind regards,
Thomas

Comment by Sasanka Uppu [X] [ 04/Apr/17 ]

Hi,
I removed the WiredTigerLog files and it works! Thanks a lot.
A small question: is it good practice to run the database like this (from an external hard drive)? Also, any idea how to prevent such failures in the future?

Comment by Kelsey Schubert [ 04/Apr/17 ]

Hi Sasanka_uppu,

Thanks for the additional information. Removing the drive as you describe would explain this type of disk corruption. Unfortunately, there is little MongoDB can do in these circumstances. To successfully restart mongod, please remove the WiredTigerLog files numbered 5020 and higher. After removing these files, the dbpath should be back in a consistent state, assuming no other corruption has occurred. This process will allow mongod to recover everything up to WiredTigerLog.0000005019. Please note that the more recent log files may have been written as part of the restart attempts, but if they contain real data, it would be lost.
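
For anyone following the same steps, here is a minimal Python sketch of the file selection described above. It is an illustration and not an official MongoDB tool: the function names, the `quarantine_dir` argument, and the dry-run default are all hypothetical. It also moves the files aside rather than deleting them, which is a more cautious variant of the removal described above. It assumes mongod is stopped and that journal files are named `WiredTigerLog.<number>`, as in the directory listing in this ticket.

```python
import re
import shutil
from pathlib import Path

# Journal file names as they appear in this ticket: WiredTigerLog.<digits>
JOURNAL_NAME = re.compile(r"^WiredTigerLog\.(\d+)$")


def journal_files_from(journal_dir, first_bad):
    """Return journal files numbered >= first_bad, sorted by file name."""
    selected = []
    for entry in Path(journal_dir).iterdir():
        match = JOURNAL_NAME.match(entry.name)
        if match and entry.is_file() and int(match.group(1)) >= first_bad:
            selected.append(entry)
    return sorted(selected, key=lambda path: path.name)


def quarantine_journal_files(journal_dir, first_bad, quarantine_dir, dry_run=True):
    """Move journal files numbered >= first_bad into quarantine_dir.

    With dry_run=True (the default) nothing is moved; the function only
    reports which files would be affected. Returns the affected file names.
    """
    files = journal_files_from(journal_dir, first_bad)
    if not dry_run:
        target = Path(quarantine_dir)
        target.mkdir(parents=True, exist_ok=True)
        for path in files:
            shutil.move(str(path), str(target / path.name))
    return [path.name for path in files]
```

With the journal directory listed in this ticket, a dry run with `first_bad=5020` would report WiredTigerLog.0000005020 through WiredTigerLog.0000005031. Keeping the moved files until mongod has restarted cleanly leaves a way back if the wrong threshold was chosen.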

Kind regards,
Thomas

Comment by Sasanka Uppu [X] [ 04/Apr/17 ]

Hi, thanks for the quick response.
1) Here is the output of ls -l in my journal directory:
total 5533304
-rw-r--r--@ 1 username staff 104857728 Apr 3 19:03 WiredTigerLog.0000005005
-rw-r--r--@ 1 username staff 104857728 Apr 3 23:06 WiredTigerLog.0000005006
-rw-r--r--@ 1 username staff 104857728 Apr 3 23:06 WiredTigerLog.0000005007
-rw-r--r--@ 1 username staff 104857728 Apr 3 23:07 WiredTigerLog.0000005008
-rw-r--r--@ 1 username staff 104857728 Apr 3 23:08 WiredTigerLog.0000005009
-rw-r--r--@ 1 username staff 104857728 Apr 3 23:12 WiredTigerLog.0000005010
-rw-r--r--@ 1 username staff 104857728 Apr 3 23:13 WiredTigerLog.0000005011
-rw-r--r--@ 1 username staff 104857728 Apr 3 23:13 WiredTigerLog.0000005012
-rw-r--r--@ 1 username staff 104857728 Apr 3 23:16 WiredTigerLog.0000005013
-rw-r--r--@ 1 username staff 104857728 Apr 3 23:22 WiredTigerLog.0000005014
-rw-r--r--@ 1 username staff 104857600 Apr 3 23:24 WiredTigerLog.0000005015
-rw-r--r--@ 1 username staff 104857600 Apr 3 23:30 WiredTigerLog.0000005016
-rw-r--r--@ 1 username staff 104857600 Apr 3 23:46 WiredTigerLog.0000005017
-rw-r--r--@ 1 username staff 104857600 Apr 3 23:50 WiredTigerLog.0000005018
-rw-r--r--@ 1 username staff 104857600 Apr 3 23:55 WiredTigerLog.0000005019
-rw-r--r--@ 1 username staff 104857728 Apr 4 00:00 WiredTigerLog.0000005020
-rw-r--r--@ 1 username staff 104857728 Apr 4 00:18 WiredTigerLog.0000005021
-rw-r--r--@ 1 username staff 104857728 Apr 4 00:26 WiredTigerLog.0000005022
-rw-r--r--@ 1 username staff 104857728 Apr 4 00:26 WiredTigerLog.0000005023
-rw-r--r--@ 1 username staff 104857728 Apr 4 00:26 WiredTigerLog.0000005024
-rw-r--r--@ 1 username staff 104857728 Apr 4 00:36 WiredTigerLog.0000005025
-rw-r--r--@ 1 username staff 104857728 Apr 4 00:36 WiredTigerLog.0000005026
-rw-r--r--@ 1 username staff 104857728 Apr 4 00:36 WiredTigerLog.0000005027
-rw-r--r--@ 1 username staff 104857728 Apr 4 00:36 WiredTigerLog.0000005028
-rw-r--r--@ 1 username staff 104857728 Apr 4 00:38 WiredTigerLog.0000005029
-rw-r--r--@ 1 username staff 104857728 Apr 4 00:38 WiredTigerLog.0000005030
-rw-r--r--@ 1 username staff 104857728 Apr 4 00:39 WiredTigerLog.0000005031

2) Yes, I opened a connection in PyMongo (Python code) and removed the drive without closing the connection.

Thanks!

Comment by Kelsey Schubert [ 04/Apr/17 ]

Hi Sasanka_uppu,

Thanks for reporting this behavior. It appears the WiredTigerLog.0000005020 file has been zeroed. I have a few questions to better understand what has happened.

  1. Would you please provide the output of 'ls -l' against your journal directory?
  2. Would you clarify under what circumstances mongod went down prior to the error on restart? If I'm understanding your last comment, the external drive was removed while mongod was running. Is that correct?
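
The observation above that a journal file looks zeroed can be checked roughly with a short Python sketch like the one below. It only tests whether the leading bytes of a file are all zero, which is consistent with the "Bad magic number 0" message in the log but does not by itself prove corruption. The function names and the 4096-byte window are hypothetical choices for illustration, not anything defined by WiredTiger.

```python
from pathlib import Path


def leading_bytes_are_zero(path, length=4096):
    """Return True if the first `length` bytes of the file are all zero.

    An empty file returns False, since there is nothing to inspect.
    """
    with open(path, "rb") as handle:
        head = handle.read(length)
    # Iterating over bytes yields integers, so any() is False only when
    # every byte in the window is 0.
    return len(head) > 0 and not any(head)


def find_zeroed_journal_files(journal_dir, length=4096):
    """List WiredTigerLog.* files whose leading bytes are all zero."""
    return sorted(
        entry.name
        for entry in Path(journal_dir).glob("WiredTigerLog.*")
        if entry.is_file() and leading_bytes_are_zero(entry, length)
    )
```

The check is read-only, so it should be safe to run against a copy of the journal directory before deciding which files to set aside.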

Thank you,
Thomas

Comment by Sasanka Uppu [X] [ 04/Apr/17 ]

Just as a heads-up, I have been accessing MongoDB from PyMongo and forgot to close the connection before removing the external drive that holds the database, which might have caused this.

Comment by Sasanka Uppu [X] [ 04/Apr/17 ]

Unfortunately, running mongod --repair also throws the same error.

Comment by Sasanka Uppu [X] [ 04/Apr/17 ]

I keep my MongoDB database on an external drive and explicitly pass the dbpath on the command line (sudo mongod --dbpath ***) whenever I use mongo.
Now I'm facing the issue described above. Please help!
