[SERVER-40760] Unable to start mongo after unclean shutdown Created: 22/Apr/19  Updated: 23/Apr/19  Resolved: 23/Apr/19

Status: Closed
Project: Core Server
Component/s: WiredTiger
Affects Version/s: 3.4.2
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: jeff langston Assignee: Danny Hatcher (Inactive)
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: File WiredTiger.turtle     File WiredTiger.wt     File repair-attempt.tar    
Operating System: ALL
Participants:

 Description   

Greetings,

I have a mongo instance that ran out of storage space. Is the error below recoverable?

I have attempted to run mongod both with and without the --repair option, but both yield similar results:

 
mongod --dbpath /mongodb/data --repair
2019-04-22T23:05:03.840+1000 I CONTROL  [initandlisten] MongoDB starting : pid=4101 port=27017 dbpath=/mongodb/data 64-bit host=FADAGADB01T.gpaupd.local
2019-04-22T23:05:03.840+1000 I CONTROL  [initandlisten] db version v3.4.2
2019-04-22T23:05:03.840+1000 I CONTROL  [initandlisten] git version: 3f76e40c105fc223b3e5aac3e20dcd026b83b38b
2019-04-22T23:05:03.840+1000 I CONTROL  [initandlisten] OpenSSL version: OpenSSL 1.0.1e-fips 11 Feb 2013
2019-04-22T23:05:03.840+1000 I CONTROL  [initandlisten] allocator: tcmalloc
2019-04-22T23:05:03.840+1000 I CONTROL  [initandlisten] modules: none
2019-04-22T23:05:03.840+1000 I CONTROL  [initandlisten] build environment:
2019-04-22T23:05:03.840+1000 I CONTROL  [initandlisten]     distmod: rhel70
2019-04-22T23:05:03.840+1000 I CONTROL  [initandlisten]     distarch: x86_64
2019-04-22T23:05:03.840+1000 I CONTROL  [initandlisten]     target_arch: x86_64
2019-04-22T23:05:03.840+1000 I CONTROL  [initandlisten] options: { repair: true, storage: { dbPath: "/mongodb/data" } }
2019-04-22T23:05:03.865+1000 I -        [initandlisten] Detected data files in /mongodb/data created by the 'wiredTiger' storage engine, so setting the active storage engine to 'wiredTiger'.
2019-04-22T23:05:03.865+1000 I STORAGE  [initandlisten] Detected WT journal files.  Running recovery from last checkpoint.
2019-04-22T23:05:03.865+1000 I STORAGE  [initandlisten] journal to nojournal transition config: create,cache_size=15576M,session_max=20000,eviction=(threads_max=4),config_base=false,statistics=(fast),log=(enabled=true,archive=true,path=journal,compressor=snappy),file_manager=(close_idle_time=100000),checkpoint=(wait=60,log_size=2GB),statistics_log=(wait=0),
2019-04-22T23:05:03.877+1000 E STORAGE  [initandlisten] WiredTiger error (0) [1555938303:877190][4101:0x7f9437d11dc0], file:WiredTiger.wt, connection: read checksum error for 4096B block at offset 118784: block header checksum of 2037539182 doesn't match expected checksum of 3071916361
2019-04-22T23:05:03.877+1000 E STORAGE  [initandlisten] WiredTiger error (0) [1555938303:877238][4101:0x7f9437d11dc0], file:WiredTiger.wt, connection: WiredTiger.wt: encountered an illegal file format or internal value
2019-04-22T23:05:03.877+1000 E STORAGE  [initandlisten] WiredTiger error (-31804) [1555938303:877254][4101:0x7f9437d11dc0], file:WiredTiger.wt, connection: the process must exit and restart: WT_PANIC: WiredTiger library panic
2019-04-22T23:05:03.877+1000 I -        [initandlisten] Fatal Assertion 28558 at src/mongo/db/storage/wiredtiger/wiredtiger_util.cpp 361
2019-04-22T23:05:03.877+1000 I -        [initandlisten]
 
***aborting after fassert() failure
 
 
2019-04-22T23:05:03.896+1000 F -        [initandlisten] Got signal: 6 (Aborted).
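
For triage, the failing file, block offset, and checksums can be pulled out of the "read checksum error" line in the log above. The following is a minimal sketch, not part of the original report; the function name `parse_checksum_error` and the regular expression are illustrative assumptions based only on the message format shown in this ticket.

```python
import re

# Pattern modelled on the "read checksum error" message quoted in this ticket.
# It is an assumption that other WiredTiger versions use the same wording.
CHECKSUM_ERROR = re.compile(
    r"file:(?P<file>[^,]+), (?P<context>[^:]+): read checksum error for "
    r"(?P<size>\d+)B block at offset (?P<offset>\d+): "
    r"block header checksum of (?P<found>\d+) "
    r"doesn't match expected checksum of (?P<expected>\d+)"
)


def parse_checksum_error(line):
    """Return a dict describing the failing block, or None if the line does not match."""
    match = CHECKSUM_ERROR.search(line)
    if match is None:
        return None
    info = match.groupdict()
    # Convert the numeric fields so they can be compared or used in arithmetic.
    for key in ("size", "offset", "found", "expected"):
        info[key] = int(info[key])
    return info
```

Applied to the first error line above, this would report file WiredTiger.wt, a 4096-byte block, and offset 118784.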
 



 Comments   
Comment by Danny Hatcher (Inactive) [ 23/Apr/19 ]

Jeff,

Unfortunately, there's nothing more we can do in this case. You may wish to try running --repair with a 4.0.9 mongod binary, as --repair in that version had many improvements, but you may encounter other version-related errors.

In the future, we strongly recommend using a replica set, because a failed node can then be recovered by syncing from another member of the set.

Danny

Comment by jeff langston [ 23/Apr/19 ]

Hi Danny,

Thank you for your response. Unfortunately I still get the following:

mongod --dbpath /mongodb/data
2019-04-23T21:28:51.381+1000 I CONTROL [initandlisten] MongoDB starting : pid=24887 port=27017 dbpath=/mongodb/data 64-bit host=FADAGADB01T.gpaupd.local
2019-04-23T21:28:51.381+1000 I CONTROL [initandlisten] db version v3.4.2
2019-04-23T21:28:51.381+1000 I CONTROL [initandlisten] git version: 3f76e40c105fc223b3e5aac3e20dcd026b83b38b
2019-04-23T21:28:51.381+1000 I CONTROL [initandlisten] OpenSSL version: OpenSSL 1.0.1e-fips 11 Feb 2013
2019-04-23T21:28:51.381+1000 I CONTROL [initandlisten] allocator: tcmalloc
2019-04-23T21:28:51.381+1000 I CONTROL [initandlisten] modules: none
2019-04-23T21:28:51.381+1000 I CONTROL [initandlisten] build environment:
2019-04-23T21:28:51.381+1000 I CONTROL [initandlisten] distmod: rhel70
2019-04-23T21:28:51.381+1000 I CONTROL [initandlisten] distarch: x86_64
2019-04-23T21:28:51.381+1000 I CONTROL [initandlisten] target_arch: x86_64
2019-04-23T21:28:51.381+1000 I CONTROL [initandlisten] options: { storage: { dbPath: "/mongodb/data" } }
2019-04-23T21:28:51.407+1000 I - [initandlisten] Detected data files in /mongodb/data created by the 'wiredTiger' storage engine, so setting the active storage engine to 'wiredTiger'.
2019-04-23T21:28:51.407+1000 I STORAGE [initandlisten] wiredtiger_open config: create,cache_size=15576M,session_max=20000,eviction=(threads_max=4),config_base=false,statistics=(fast),log=(enabled=true,archive=true,path=journal,compressor=snappy),file_manager=(close_idle_time=100000),checkpoint=(wait=60,log_size=2GB),statistics_log=(wait=0),
2019-04-23T21:28:51.942+1000 E STORAGE [initandlisten] WiredTiger error (0) [1556018931:942197][24887:0x7ff85d4a9dc0], file:sizeStorer.wt, txn-recover: read checksum error for 4096B block at offset 290816: block header checksum of 3294406552 doesn't match expected checksum of 2070625740
2019-04-23T21:28:51.942+1000 E STORAGE [initandlisten] WiredTiger error (0) [1556018931:942274][24887:0x7ff85d4a9dc0], file:sizeStorer.wt, txn-recover: sizeStorer.wt: encountered an illegal file format or internal value
2019-04-23T21:28:51.942+1000 E STORAGE [initandlisten] WiredTiger error (-31804) [1556018931:942290][24887:0x7ff85d4a9dc0], file:sizeStorer.wt, txn-recover: the process must exit and restart: WT_PANIC: WiredTiger library panic
2019-04-23T21:28:51.942+1000 I - [initandlisten] Fatal Assertion 28558 at src/mongo/db/storage/wiredtiger/wiredtiger_util.cpp 361
2019-04-23T21:28:51.942+1000 I - [initandlisten]

***aborting after fassert() failure

2019-04-23T21:28:51.963+1000 F - [initandlisten] Got signal: 6 (Aborted).

 

I get a similar response when I run with the --repair option.

 

To confirm that I copied the files over correctly, I ran md5sum:

md5sum /mongodb/data/WiredTiger.wt
662e1a3c19ae33495857e2c5ba86e332 /mongodb/data/WiredTiger.wt

md5sum /mongodb/repair/WiredTiger.wt
662e1a3c19ae33495857e2c5ba86e332 /mongodb/repair/WiredTiger.wt

 

md5sum /mongodb/data/WiredTiger.turtle
239e3fbe2f42a6d3b39dc8d165c8a2a1 /mongodb/data/WiredTiger.turtle

md5sum /mongodb/repair/WiredTiger.turtle
239e3fbe2f42a6d3b39dc8d165c8a2a1 /mongodb/repair/WiredTiger.turtle
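
The md5sum comparison above can also be scripted when more than a couple of files need checking. This is a rough sketch only; the helper names `md5_of` and `compare_copies` are hypothetical, and the directory paths in the usage note are simply the ones from this comment.

```python
import hashlib
from pathlib import Path


def md5_of(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_copies(source_dir, copy_dir, names):
    """Map each file name to True if both copies have the same MD5 digest."""
    return {
        name: md5_of(Path(source_dir) / name) == md5_of(Path(copy_dir) / name)
        for name in names
    }
```

For example, `compare_copies("/mongodb/repair", "/mongodb/data", ["WiredTiger.wt", "WiredTiger.turtle"])` would be expected to return True for both names if the copy succeeded. A matching digest only shows the copy is byte-identical; it says nothing about whether the file itself is valid.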

 

-Jeff

Comment by Danny Hatcher (Inactive) [ 22/Apr/19 ]

Hello Jeff,

I've attached a repair attempt: repair-attempt.tar. Please extract those files to your $dbpath and attempt to start your mongod process.
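
One cautious way to carry out the extraction step is to keep a copy of any file the archive would overwrite. The sketch below is illustrative and not part of the original instructions: it assumes mongod is stopped, assumes the archive holds plain files that belong at the top level of the dbpath (its actual layout is not described in this ticket), and uses a hypothetical `backup_dir` argument chosen by the caller.

```python
import shutil
import tarfile
from pathlib import Path


def extract_repair_archive(archive_path, dbpath, backup_dir):
    """Back up files the archive would overwrite, then extract it into dbpath.

    Assumption: every regular file in the archive belongs directly in dbpath,
    so only the base name of each member is used.
    """
    dbpath = Path(dbpath)
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    extracted = []
    with tarfile.open(archive_path) as archive:
        for member in archive.getmembers():
            if not member.isfile():
                continue  # skip directories, links, and other special entries
            name = Path(member.name).name
            target = dbpath / name
            if target.exists():
                # Preserve the current file before it is replaced.
                shutil.copy2(target, backup_dir / name)
            with archive.extractfile(member) as source, open(target, "wb") as dest:
                shutil.copyfileobj(source, dest)
            extracted.append(name)
    return extracted
```

A call such as `extract_repair_archive("repair-attempt.tar", "/mongodb/data", "/mongodb/backup")` would return the list of file names written; the backup path here is a placeholder.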

Thanks,

Danny
