Description
We run MongoDB 3.2.9 in an OpenShift cluster using the official Red Hat pod (hence the not quite up-to-date version). For reasons not yet known, the pod failed to stop cleanly, corrupting the WiredTiger data files.
The error is basically the same as described in SERVER-23346, SERVER-27777, SERVER-25770 and possibly others. However, none of the tickets we found includes instructions on how to fix the problem ourselves.
We copied the data files to another server running MongoDB 3.4.2, hoping that the later version would include a fix, but that did not help. Startup on 3.4.2 fails with:
file:WiredTiger.wt, connection: unable to read root page from file:WiredTiger.wt: WT_ERROR: non-specific WiredTiger error
2017-03-08T09:40:48.386+0100 I CONTROL [main] ***** SERVER RESTARTED *****
2017-03-08T09:40:48.390+0100 I CONTROL [initandlisten] MongoDB starting : pid=8437 port=27017 dbpath=/var/lib/mongodb 64-bit host=mongo
2017-03-08T09:40:48.390+0100 I CONTROL [initandlisten] db version v3.4.2
2017-03-08T09:40:48.390+0100 I CONTROL [initandlisten] git version: 3f76e40c105fc223b3e5aac3e20dcd026b83b38b
2017-03-08T09:40:48.390+0100 I CONTROL [initandlisten] OpenSSL version: OpenSSL 1.0.2g-fips 1 Mar 2016
2017-03-08T09:40:48.390+0100 I CONTROL [initandlisten] allocator: tcmalloc
2017-03-08T09:40:48.390+0100 I CONTROL [initandlisten] modules: none
2017-03-08T09:40:48.390+0100 I CONTROL [initandlisten] build environment:
2017-03-08T09:40:48.390+0100 I CONTROL [initandlisten] distmod: ubuntu1604
2017-03-08T09:40:48.390+0100 I CONTROL [initandlisten] distarch: x86_64
2017-03-08T09:40:48.390+0100 I CONTROL [initandlisten] target_arch: x86_64
2017-03-08T09:40:48.390+0100 I CONTROL [initandlisten] options: { config: "/etc/mongod.conf", net: { bindIp: "::,0.0.0.0", ipv6: true, port: 27017 }, security: { authorization: "enabled" }, storage: { dbPath: "/var/lib/mongodb", engine: "wiredTiger", journal: { enabled: true }, wiredTiger: { engineConfig: { cacheSizeGB: 0.1 } } }, systemLog: { destination: "file", logAppend: true, path: "/var/log/mongodb/mongod.log", quiet: true } }
2017-03-08T09:40:48.390+0100 W - [initandlisten] Detected unclean shutdown - /var/lib/mongodb/mongod.lock is not empty.
2017-03-08T09:40:48.408+0100 W STORAGE [initandlisten] Recovering data from the last clean checkpoint.
2017-03-08T09:40:48.408+0100 I STORAGE [initandlisten]
2017-03-08T09:40:48.408+0100 I STORAGE [initandlisten] ** WARNING: Using the XFS filesystem is strongly recommended with the WiredTiger storage engine
2017-03-08T09:40:48.408+0100 I STORAGE [initandlisten] ** See http://dochub.mongodb.org/core/prodnotes-filesystem
2017-03-08T09:40:48.408+0100 I STORAGE [initandlisten] wiredtiger_open config: create,cache_size=102M,session_max=20000,eviction=(threads_max=4),config_base=false,statistics=(fast),log=(enabled=true,archive=true,path=journal,compressor=snappy),file_manager=(close_idle_time=100000),checkpoint=(wait=60,log_size=2GB),statistics_log=(wait=0),
2017-03-08T09:40:48.415+0100 E STORAGE [initandlisten] WiredTiger error (-31802) [1488962448:415285][8437:0x7fe40cd95cc0], file:WiredTiger.wt, connection: unable to read root page from file:WiredTiger.wt: WT_ERROR: non-specific WiredTiger error
2017-03-08T09:40:48.415+0100 E STORAGE [initandlisten] WiredTiger error (0) [1488962448:415325][8437:0x7fe40cd95cc0], file:WiredTiger.wt, connection: WiredTiger has failed to open its metadata
2017-03-08T09:40:48.415+0100 E STORAGE [initandlisten] WiredTiger error (0) [1488962448:415330][8437:0x7fe40cd95cc0], file:WiredTiger.wt, connection: This may be due to the database files being encrypted, being from an older version or due to corruption on disk
2017-03-08T09:40:48.415+0100 E STORAGE [initandlisten] WiredTiger error (0) [1488962448:415334][8437:0x7fe40cd95cc0], file:WiredTiger.wt, connection: You should confirm that you have opened the database with the correct options including all encryption and compression options
2017-03-08T09:40:48.415+0100 I - [initandlisten] Assertion: 28595:-31802: WT_ERROR: non-specific WiredTiger error src/mongo/db/storage/wiredtiger/wiredtiger_kv_engine.cpp 267
2017-03-08T09:40:48.418+0100 I STORAGE [initandlisten] exception in initAndListen: 28595 -31802: WT_ERROR: non-specific WiredTiger error, terminating
2017-03-08T09:40:48.418+0100 I NETWORK [initandlisten] shutdown: going to close listening sockets...
2017-03-08T09:40:48.418+0100 I NETWORK [initandlisten] removing socket file: /tmp/mongodb-27017.sock
2017-03-08T09:40:48.418+0100 I NETWORK [initandlisten] shutdown: going to flush diaglog...
2017-03-08T09:40:48.418+0100 I CONTROL [initandlisten] now exiting
2017-03-08T09:40:48.418+0100 I CONTROL [initandlisten] shutting down with code:100
Running mongod --repair produces essentially the same errors.
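For anyone who wants to compare the error entries of two runs (for example a normal start and a --repair run), here is a minimal sketch that pulls the severity-E WiredTiger entries out of a mongod log. It assumes the 3.x log line layout seen in the log above (timestamp, severity, component, [context], message). The function name wiredtiger_errors and both regular expressions are our own illustration and not part of any MongoDB tooling.

```python
import re

# Assumed mongod 3.x log layout: <timestamp> <severity> <component> [<context>] <message>
LOG_LINE = re.compile(
    r"^(?P<ts>\S+)\s+(?P<sev>[DIWEF])\s+(?P<comp>\S+)\s+\[(?P<ctx>[^\]]+)\]\s*(?P<msg>.*)$"
)
# Error code as printed by the storage engine, e.g. "WiredTiger error (-31802)"
WT_CODE = re.compile(r"WiredTiger error \((-?\d+)\)")


def wiredtiger_errors(lines):
    """Return (timestamp, code, message) for each severity-E WiredTiger entry.

    Hypothetical helper: lines that do not match the assumed layout are skipped.
    """
    found = []
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if not match or match.group("sev") != "E":
            continue
        code = WT_CODE.search(match.group("msg"))
        if code:
            found.append((match.group("ts"), int(code.group(1)), match.group("msg")))
    return found
```

Reading the log file configured above (/var/log/mongodb/mongod.log) with open() and passing the file object to wiredtiger_errors would yield one tuple per error entry, which can then be compared between runs.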
Could you please have a look at this?
It would also be great if you could write down instructions on how to recover the WiredTiger.wt file ourselves, because we have a second database with the same issue and I fear that this will not be the last time we'll see unclean shutdowns of OpenShift pods.
Thank you!