[SERVER-28242] WiredTiger.wt file corrupted after unclean shutdown, cannot recover
Created: 08/Mar/17  Updated: 14/Aug/18  Resolved: 13/Mar/17

Status: Closed
Project: Core Server
Component/s: WiredTiger
Affects Version/s: 3.2.9, 3.4.2
Fix Version/s: None

Type: Bug
Priority: Major - P3
Reporter: David Gubler
Assignee: Kelsey Schubert
Resolution: Done
Votes: 0
Labels: envc, openshift, rge, rps, trcf, wtc
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: WiredTiger.turtle, WiredTiger.wt, collection-174--8777641835294838235.wt, db2.tgz, repair_attempt-2.tar.gz, repair_attempt.tar.gz
Operating System: Linux
Participants:

 Description   

We run MongoDB 3.2.9 in an OpenShift cluster using the official RedHat pod (hence the not quite up-to-date version). For reasons not yet known, the pod failed to stop cleanly, which corrupted the WiredTiger data files.

The error is basically the same as described in SERVER-23346, SERVER-27777, SERVER-25770 and possibly others. However, none of the tickets we found includes instructions on how to fix the problem ourselves.

We copied the data files to another server running MongoDB 3.4.2, hoping for fixes in the later version, but that did not help.

file:WiredTiger.wt, connection: unable to read root page from file:WiredTiger.wt: WT_ERROR: non-specific WiredTiger error

2017-03-08T09:40:48.386+0100 I CONTROL  [main] ***** SERVER RESTARTED *****
2017-03-08T09:40:48.390+0100 I CONTROL  [initandlisten] MongoDB starting : pid=8437 port=27017 dbpath=/var/lib/mongodb 64-bit host=mongo
2017-03-08T09:40:48.390+0100 I CONTROL  [initandlisten] db version v3.4.2
2017-03-08T09:40:48.390+0100 I CONTROL  [initandlisten] git version: 3f76e40c105fc223b3e5aac3e20dcd026b83b38b
2017-03-08T09:40:48.390+0100 I CONTROL  [initandlisten] OpenSSL version: OpenSSL 1.0.2g-fips  1 Mar 2016
2017-03-08T09:40:48.390+0100 I CONTROL  [initandlisten] allocator: tcmalloc
2017-03-08T09:40:48.390+0100 I CONTROL  [initandlisten] modules: none
2017-03-08T09:40:48.390+0100 I CONTROL  [initandlisten] build environment:
2017-03-08T09:40:48.390+0100 I CONTROL  [initandlisten]     distmod: ubuntu1604
2017-03-08T09:40:48.390+0100 I CONTROL  [initandlisten]     distarch: x86_64
2017-03-08T09:40:48.390+0100 I CONTROL  [initandlisten]     target_arch: x86_64
2017-03-08T09:40:48.390+0100 I CONTROL  [initandlisten] options: { config: "/etc/mongod.conf", net: { bindIp: "::,0.0.0.0", ipv6: true, port: 27017 }, security: { authorization: "enabled" }, storage: { dbPath: "/var/lib/mongodb", engine: "wiredTiger", journal: { enabled: true }, wiredTiger: { engineConfig: { cacheSizeGB: 0.1 } } }, systemLog: { destination: "file", logAppend: true, path: "/var/log/mongodb/mongod.log", quiet: true } }
2017-03-08T09:40:48.390+0100 W -        [initandlisten] Detected unclean shutdown - /var/lib/mongodb/mongod.lock is not empty.
2017-03-08T09:40:48.408+0100 W STORAGE  [initandlisten] Recovering data from the last clean checkpoint.
2017-03-08T09:40:48.408+0100 I STORAGE  [initandlisten] 
2017-03-08T09:40:48.408+0100 I STORAGE  [initandlisten] ** WARNING: Using the XFS filesystem is strongly recommended with the WiredTiger storage engine
2017-03-08T09:40:48.408+0100 I STORAGE  [initandlisten] **          See http://dochub.mongodb.org/core/prodnotes-filesystem
2017-03-08T09:40:48.408+0100 I STORAGE  [initandlisten] wiredtiger_open config: create,cache_size=102M,session_max=20000,eviction=(threads_max=4),config_base=false,statistics=(fast),log=(enabled=true,archive=true,path=journal,compressor=snappy),file_manager=(close_idle_time=100000),checkpoint=(wait=60,log_size=2GB),statistics_log=(wait=0),
2017-03-08T09:40:48.415+0100 E STORAGE  [initandlisten] WiredTiger error (-31802) [1488962448:415285][8437:0x7fe40cd95cc0], file:WiredTiger.wt, connection: unable to read root page from file:WiredTiger.wt: WT_ERROR: non-specific WiredTiger error
2017-03-08T09:40:48.415+0100 E STORAGE  [initandlisten] WiredTiger error (0) [1488962448:415325][8437:0x7fe40cd95cc0], file:WiredTiger.wt, connection: WiredTiger has failed to open its metadata
2017-03-08T09:40:48.415+0100 E STORAGE  [initandlisten] WiredTiger error (0) [1488962448:415330][8437:0x7fe40cd95cc0], file:WiredTiger.wt, connection: This may be due to the database files being encrypted, being from an older version or due to corruption on disk
2017-03-08T09:40:48.415+0100 E STORAGE  [initandlisten] WiredTiger error (0) [1488962448:415334][8437:0x7fe40cd95cc0], file:WiredTiger.wt, connection: You should confirm that you have opened the database with the correct options including all encryption and compression options
2017-03-08T09:40:48.415+0100 I -        [initandlisten] Assertion: 28595:-31802: WT_ERROR: non-specific WiredTiger error src/mongo/db/storage/wiredtiger/wiredtiger_kv_engine.cpp 267
2017-03-08T09:40:48.418+0100 I STORAGE  [initandlisten] exception in initAndListen: 28595 -31802: WT_ERROR: non-specific WiredTiger error, terminating
2017-03-08T09:40:48.418+0100 I NETWORK  [initandlisten] shutdown: going to close listening sockets...
2017-03-08T09:40:48.418+0100 I NETWORK  [initandlisten] removing socket file: /tmp/mongodb-27017.sock
2017-03-08T09:40:48.418+0100 I NETWORK  [initandlisten] shutdown: going to flush diaglog...
2017-03-08T09:40:48.418+0100 I CONTROL  [initandlisten] now exiting
2017-03-08T09:40:48.418+0100 I CONTROL  [initandlisten] shutting down with code:100

mongod --repair results in basically the same errors.
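
To see at a glance which files WiredTiger refuses to open, the error lines in a log like the one above can be filtered with a small script. This is only a sketch: it assumes the line layout shown in this ticket, and the regular expression and the function name `damaged_files` are illustrative, not part of any MongoDB tooling.

```python
import re

# Matches both variants seen in this ticket:
#   "WiredTiger error (-31802) [..][..], file:NAME, connection: MESSAGE"
#   "WiredTiger (-31802) [..][..], file:NAME, WT_SESSION.open_cursor: MESSAGE"
WT_ERROR = re.compile(
    r"WiredTiger(?: error)? \((?P<code>-?\d+)\) "
    r"\[[^\]]*\]\[[^\]]*\], "
    r"file:(?P<file>[^,]+), "
    r"(?P<context>[^:]+): (?P<message>.*)"
)


def damaged_files(log_lines):
    """Return the sorted, de-duplicated file names reported with a non-zero error code."""
    files = set()
    for line in log_lines:
        match = WT_ERROR.search(line)
        # Code 0 lines are follow-up explanations, not the failing read itself.
        if match and int(match.group("code")) != 0:
            files.add(match.group("file"))
    return sorted(files)


if __name__ == "__main__":
    import sys

    # Usage (hypothetical path): python wt_errors.py /var/log/mongodb/mongod.log
    with open(sys.argv[1], encoding="utf-8", errors="replace") as handle:
        for name in damaged_files(handle):
            print(name)
```

Run against the log in this description, such a filter would report only WiredTiger.wt, since the remaining error lines carry code 0.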

Could you please have a look at this?

It would also be great if you could write down instructions on how to recover the WiredTiger.wt file ourselves, because we have a second database with the same issue and I fear that this will not be the last time we'll see unclean shutdowns of OpenShift pods.

Thank you!



 Comments   
Comment by Kelsey Schubert [ 10/Sep/17 ]

Hi jmmadrid,

Would you please open a new ticket, and attach the WiredTiger.wt and WiredTiger.turtle files? This will enable us to attempt to repair the files.

Thank you,
Kelsey
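
For anyone preparing such an attachment, the two metadata files can be bundled into a single archive before upload. A minimal sketch, assuming mongod is stopped and the files sit directly in the dbpath; the function name `bundle_metadata` and the example paths are hypothetical.

```python
import tarfile
from pathlib import Path

# The two files requested in this thread.
METADATA_FILES = ("WiredTiger.wt", "WiredTiger.turtle")


def bundle_metadata(dbpath, archive):
    """Write WiredTiger.wt and WiredTiger.turtle from dbpath into a .tar.gz archive."""
    db = Path(dbpath)
    missing = [name for name in METADATA_FILES if not (db / name).is_file()]
    if missing:
        raise FileNotFoundError(f"missing in {db}: {', '.join(missing)}")
    with tarfile.open(archive, "w:gz") as tar:
        for name in METADATA_FILES:
            # Store under the bare file name so the archive has no directory prefix.
            tar.add(db / name, arcname=name)
    return list(METADATA_FILES)


if __name__ == "__main__":
    # Hypothetical paths; adjust to the affected deployment.
    print(bundle_metadata("/var/lib/mongodb", "wt-metadata.tar.gz"))
```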

Comment by Juan Manuel Madrid [ 09/Sep/17 ]

Dear Thomas, I have a MongoDB v3.2 database whose WiredTiger.wt file became corrupted after an unexpected shutdown (the server lost power; Ubuntu 14.04). The message is:

[1504975304:538861][61203:0x7f37b5f5a740], file:WiredTiger.wt, connection: unable to read root page from file:WiredTiger.wt: WT_ERROR: non-specific WiredTiger error
[1504975304:539612][61203:0x7f37b5f5a740], file:WiredTiger.wt, connection: WiredTiger has failed to open its metadata
[1504975304:539904][61203:0x7f37b5f5a740], file:WiredTiger.wt, connection: This may be due to the database files being encrypted, being from an older version or due to corruption on disk
[1504975304:540249][61203:0x7f37b5f5a740], file:WiredTiger.wt, connection: You should confirm that you have opened the database with the correct options including all encryption and compression options
wt: WT_ERROR: non-specific WiredTiger error

Is there any way to recover it? (--repair doesn't work.)

Thanks in advance!

Comment by Kelsey Schubert [ 13/Mar/17 ]

Thanks for the update, mhutter!

Comment by Manuel Hutter [ 13/Mar/17 ]

Worked as well!

Thanks a lot for helping out!

Comment by Kelsey Schubert [ 10/Mar/17 ]

Thanks, all. Would you please follow the same steps with repair_attempt-2.tar.gz?

Comment by Manuel Hutter [ 09/Mar/17 ]

Hi anonymous.user

I've managed to restore the DB using your repaired files (a simple mongod --repair did the trick).

I have attached the .turtle and .wt files from the second DB (db2.tgz) as well. Can you have a look at them too?

Thank you very much for your help!

PS: We have since implemented a backup.

Comment by Tobias Brunner [ 08/Mar/17 ]

Hi anonymous.user,

Thanks a lot for helping us here!

I've taken over the restore from David and here is the result when starting MongoDB:

2017-03-08T19:37:05.755+0000 I CONTROL  [initandlisten] MongoDB starting : pid=19 port=27017 dbpath=/var/lib/mongodb/data 64-bit host=mongodb-1-nuig5
2017-03-08T19:37:05.756+0000 I CONTROL  [initandlisten] db version v3.2.10
2017-03-08T19:37:05.756+0000 I CONTROL  [initandlisten] git version: 79d9b3ab5ce20f51c272b4411202710a082d0317
2017-03-08T19:37:05.756+0000 I CONTROL  [initandlisten] OpenSSL version: OpenSSL 1.0.1e-fips 11 Feb 2013
2017-03-08T19:37:05.756+0000 I CONTROL  [initandlisten] allocator: tcmalloc
2017-03-08T19:37:05.756+0000 I CONTROL  [initandlisten] modules: none
2017-03-08T19:37:05.756+0000 I CONTROL  [initandlisten] build environment:
2017-03-08T19:37:05.756+0000 I CONTROL  [initandlisten]     distarch: x86_64
2017-03-08T19:37:05.756+0000 I CONTROL  [initandlisten]     target_arch: x86_64
2017-03-08T19:37:05.756+0000 I CONTROL  [initandlisten] options: { config: "/etc/mongod.conf", net: { bindIp: "127.0.0.1", http: { enabled: false }, port: 27017 }, replication: { oplogSizeMB: 64 }, storage: { dbPath: "/var/lib/mongodb/data" }, systemLog: { quiet: true } }
2017-03-08T19:37:05.772+0000 I -        [initandlisten] Detected data files in /var/lib/mongodb/data created by the 'wiredTiger' storage engine, so setting the active storage engine to 'wiredTiger'.
2017-03-08T19:37:05.778+0000 W -        [initandlisten] Detected unclean shutdown - /var/lib/mongodb/data/mongod.lock is not empty.
2017-03-08T19:37:05.789+0000 W STORAGE  [initandlisten] Recovering data from the last clean checkpoint.
2017-03-08T19:37:05.797+0000 I STORAGE  [initandlisten] wiredtiger_open config: create,cache_size=18G,session_max=20000,eviction=(threads_max=4),config_base=false,statistics=(fast),log=(enabled=true,archive=true,path=journal,compressor=snappy),file_manager=(close_idle_time=100000),checkpoint=(wait=60,log_size=2GB),statistics_log=(wait=0),
=>  Waiting for MongoDB daemon up
=>  Waiting for MongoDB daemon up
2017-03-08T19:37:09.190+0000 E STORAGE  [initandlisten] WiredTiger (-31802) [1489001829:190320][19:0x7f775760be00], file:collection-174--8777641835294838235.wt, WT_SESSION.open_cursor: unable to read root page from file:collection-174--8777641835294838235.wt: WT_ERROR: non-specific WiredTiger error
2017-03-08T19:37:09.196+0000 I -        [initandlisten] Invariant failure: ret resulted in status UnknownError: -31802: WT_ERROR: non-specific WiredTiger error at src/mongo/db/storage/wiredtiger/wiredtiger_session_cache.cpp 79
2017-03-08T19:37:09.201+0000 I CONTROL  [initandlisten] 
 0x7f77584326f1 0x7f77583cd557 0x7f77583b5e4c 0x7f77581a0760 0x7f775819ef8f 0x7f775819b062 0x7f7758199e56 0x7f775818b3c3 0x7f77580e7a34 0x7f77580eda42 0x7f7758189b57 0x7f77580aba44 0x7f7757a8c9bd 0x7f7757a8efaf 0x7f7757a4ab0d 0x7f775368db35 0x7f7757a890f5
----- BEGIN BACKTRACE -----
{"backtrace":[{"b":"7F7757627000","o":"E0B6F1","s":"_ZN5mongo15printStackTraceERSo"},{"b":"7F7757627000","o":"DA6557","s":"_ZN5mongo10logContextEPKc"},{"b":"7F7757627000","o":"D8EE4C","s":"_ZN5mongo17invariantOKFailedEPKcRKNS_6StatusES1_j"},{"b":"7F7757627000","o":"B79760","s":"_ZN5mongo17WiredTigerSession9getCursorERKSsmb"},{"b":"7F7757627000","o":"B77F8F","s":"_ZN5mongo16WiredTigerCursorC2ERKSsmbPNS_16OperationContextE"},{"b":"7F7757627000","o":"B74062","s":"_ZN5mongo21WiredTigerRecordStore6CursorC2EPNS_16OperationContextERKS0_b"},{"b":"7F7757627000","o":"B72E56","s":"_ZN5mongo21WiredTigerRecordStoreC1EPNS_16OperationContextENS_10StringDataES3_SsbbllPNS_14CappedCallbackEPNS_20WiredTigerSizeStorerE"},{"b":"7F7757627000","o":"B643C3","s":"_ZN5mongo18WiredTigerKVEngine14getRecordStoreEPNS_16OperationContextENS_10StringDataES3_RKNS_17CollectionOptionsE"},{"b":"7F7757627000","o":"AC0A34","s":"_ZN5mongo22KVDatabaseCatalogEntry14initCollectionEPNS_16OperationContextERKSsb"},{"b":"7F7757627000","o":"AC6A42","s":"_ZN5mongo15KVStorageEngineC1EPNS_8KVEngineERKNS_22KVStorageEngineOptionsE"},{"b":"7F7757627000","o":"B62B57"},{"b":"7F7757627000","o":"A84A44","s":"_ZN5mongo20ServiceContextMongoD29initializeGlobalStorageEngineEv"},{"b":"7F7757627000","o":"4659BD"},{"b":"7F7757627000","o":"467FAF","s":"_ZN5mongo13initAndListenEi"},{"b":"7F7757627000","o":"423B0D","s":"main"},{"b":"7F775366C000","o":"21B35","s":"__libc_start_main"},{"b":"7F7757627000","o":"4620F5"}],"processInfo":{ "mongodbVersion" : "3.2.10", "gitVersion" : "79d9b3ab5ce20f51c272b4411202710a082d0317", "compiledModules" : [], "uname" : { "sysname" : "Linux", "release" : "3.10.0-514.10.2.el7.x86_64", "version" : "#1 SMP Mon Feb 20 02:37:52 EST 2017", "machine" : "x86_64" }, "somap" : [ { "b" : "7F7757627000", "elfType" : 3, "buildId" : "B05156DF3AE4F3767D7F6DAD9BE5CA9A0F65C197" }, { "b" : "7FFC4D2CD000", "elfType" : 3, "buildId" : "50D1CCF000162361F74A9D7C2ECEA856F7881F07" }, { "b" : "7F77571FF000", "path" : 
"/lib64/libsnappy.so.1", "elfType" : 3, "buildId" : "51F
03FA02A93040DDE9A516E7A4D3C8DBDF1514D" }, { "b" : "7F7756A55000", "path" : "/opt/rh/rh-mongodb32/root/usr/lib64/libmozjs-38.so.rh-mongodb32", "elfType" : 3, "buildId" : "2276DD95FC4383A813E8BF836EB4BC57A1C78A82" }, { "b" : "7F7756802000", "path" : "/opt/rh/rh-mongodb32/root/usr/lib64/libstemmer.so.rh-mongodb32-0", "elfType" : 3, "buildId" : "6E2DE2E1C9C15DCE17ED2A5C8A2A6C4296F611BD" }, { "b" : "7F775658F000", "path" : "/lib64/libtcmalloc.so.4", "elfType" : 3, "buildId" : "362EE669D57B9A6976CA68E7654C68DAD3D2EECC" }, { "b" : "7F7756379000", "path" : "/lib64/libz.so.1", "elfType" : 3, "buildId" : "FE621E91052A9A77CC263E00A8A21C2BC0867E21" }, { "b" : "7F77560F2000", "path" : "/opt/rh/rh-mongodb32/root/usr/lib64/libyaml-cpp.so.rh-mongodb32-0.5", "elfType" : 3, "buildId" : "2EE01087B01C80B7AEA90E7447561B6A27C659F0" }, { "b" : "7F7755E7F000", "path" : "/opt/rh/rh-mongodb32/root/usr/lib64/libboost_program_options.so.rh-mongodb32-1.58.0", "elfType" : 3, "buildId" : "F5BDBD7FA6990A70E48A2A1EF39B090BC1FB5BE6" }, { "b" : "7F7755C68000", "path" : "/opt/rh/rh-mongodb32/root/usr/lib64/libboost_filesystem.so.rh-mongodb32-1.58.0", "elfType" : 3, "buildId" : "D2AE744AD4BCE89A8AB60EA2FA88AD70AD548D29" }, { "b" : "7F7755A47000", "path" : "/opt/rh/rh-mongodb32/root/usr/lib64/libboost_thread.so.rh-mongodb32-1.58.0", "elfType" : 3, "buildId" : "D8B2E214B63585D52BF005A4F6D3B3CD594EB55D" }, { "b" : "7F7755843000", "path" : "/opt/rh/rh-mongodb32/root/usr/lib64/libboost_system.so.rh-mongodb32-1.58.0", "elfType" : 3, "buildId" : "B65608C8DD0E49F86AEC219A15447E2142E415E8" }, { "b" : "7F775563B000", "path" : "/opt/rh/rh-mongodb32/root/usr/lib64/libboost_chrono.so.rh-mongodb32-1.58.0", "elfType" : 3, "buildId" : "3FB1E28A06A997CA93FC793B20591A82053CF3F1" }, { "b" : "7F7755337000", "path" : "/opt/rh/rh-mongodb32/root/usr/lib64/libboost_regex.so.rh-mongodb32-1.58.0", "elfType" : 3, "buildId" : "1EAAB2EA98367A9911330B5C3DB8CBE145C07843" }, { "b" : "7F77550D6000", "path" : "/lib64/libpcre.so.1", 
"elfType" : 3, "buildId" : "1DEC80B82143A7960489C7B
7AA8DDF182D6E2BE6" }, { "b" : "7F7754ECD000", "path" : "/lib64/libpcrecpp.so.0", "elfType" : 3, "buildId" : "18354C1F9C320CD44B5CBAF039E4A53A9556AC21" }, { "b" : "7F7754C5F000", "path" : "/lib64/libssl.so.10", "elfType" : 3, "buildId" : "8B4A33094EA982F927F4D5F84059EB073A203DB5" }, { "b" : "7F7754875000", "path" : "/lib64/libcrypto.so.10", "elfType" : 3, "buildId" : "7455CBD6F62579DA1598F1DC123F039F25466C90" }, { "b" : "7F775466D000", "path" : "/lib64/librt.so.1", "elfType" : 3, "buildId" : "2AC501524AB01C3A36053233524A9B7BDF06D2E3" }, { "b" : "7F7754469000", "path" : "/lib64/libdl.so.2", "elfType" : 3, "buildId" : "9939B83E89981591ACBD6F85AE2020349A169C52" }, { "b" : "7F7754161000", "path" : "/lib64/libstdc++.so.6", "elfType" : 3, "buildId" : "348CA28355FB67351EA0CC37170BB83FA008CFEC" }, { "b" : "7F7753E5F000", "path" : "/lib64/libm.so.6", "elfType" : 3, "buildId" : "8EAEED1A5C217B2F9F66C3ADEB53B1BCD526F65A" }, { "b" : "7F7753C49000", "path" : "/lib64/libgcc_s.so.1", "elfType" : 3, "buildId" : "E7A44AE9AAA39B04F12503BB3B170860F0EB38E2" }, { "b" : "7F7753A2D000", "path" : "/lib64/libpthread.so.0", "elfType" : 3, "buildId" : "540DA7E8674CC3C696324B7D080703E9F71CFC9D" }, { "b" : "7F775366C000", "path" : "/lib64/libc.so.6", "elfType" : 3, "buildId" : "C7DEA743FD3DA749E7453BEAB1F26D50A1A5FCAD" }, { "b" : "7F7757405000", "path" : "/lib64/ld-linux-x86-64.so.2", "elfType" : 3, "buildId" : "8F3E366E2DB73C330A3791DEAE31AE9579099B44" }, { "b" : "7F7753468000", "path" : "/lib64/libplds4.so", "elfType" : 3, "buildId" : "A96CEBCC70105674C41728A32B8269A280AA0E21" }, { "b" : "7F7753263000", "path" : "/lib64/libplc4.so", "elfType" : 3, "buildId" : "860A92A74E0D73E1D2AC5A6874C06E7F8330DC78" }, { "b" : "7F7753025000", "path" : "/lib64/libnspr4.so", "elfType" : 3, "buildId" : "4B6AD8CF2301C517766622BAA66DA2B5D85C0B05" }, { "b" : "7F7752E0B000", "path" : "/lib64/libunwind.so.8", "elfType" : 3, "buildId" : "C59A5D890BE040B8B9498840A300721181990BA3" }, { "b" : "7F7751837000", "path" : 
"/lib64/libicudata.so.50", "elfType" : 3, "buildId
" : "27EA9496693BFB45C9C23DEE015ED4063FD020A1" }, { "b" : "7F7751439000", "path" : "/lib64/libicui18n.so.50", "elfType" : 3, "buildId" : "B171FF3E21A20ACE392E3C48AC50BCABE9B8849A" }, { "b" : "7F77510C0000", "path" : "/lib64/libicuuc.so.50", "elfType" : 3, "buildId" : "4499237C28D849E1FA22C3D1622900746E1F2AC8" }, { "b" : "7F7750E72000", "path" : "/lib64/libgssapi_krb5.so.2", "elfType" : 3, "buildId" : "98F8B08BB984B5B3366F928C26585489625B829D" }, { "b" : "7F7750B8B000", "path" : "/lib64/libkrb5.so.3", "elfType" : 3, "buildId" : "2DC7DC7094E4E26951D7B63C1F433EFE5EA06006" }, { "b" : "7F7750987000", "path" : "/lib64/libcom_err.so.2", "elfType" : 3, "buildId" : "8E6B94EAC98D4D32CA753B11E1C2CD9CE3DF3886" }, { "b" : "7F7750755000", "path" : "/lib64/libk5crypto.so.3", "elfType" : 3, "buildId" : "424C66AD190AA9C519971E461C7BC3ABFBF17E51" }, { "b" : "7F7750546000", "path" : "/lib64/libkrb5support.so.0", "elfType" : 3, "buildId" : "3D433D76A47E97AE253C57C20E9F86266F9595A1" }, { "b" : "7F7750342000", "path" : "/lib64/libkeyutils.so.1", "elfType" : 3, "buildId" : "8CA73C16CFEB9A8B5660015B9223B09F87041CAD" }, { "b" : "7F7750128000", "path" : "/lib64/libresolv.so.2", "elfType" : 3, "buildId" : "59754FDA02AEF392C878E5E714127F5E2C68A891" }, { "b" : "7F774FF01000", "path" : "/lib64/libselinux.so.1", "elfType" : 3, "buildId" : "4D7CA6EFC2D57A25B1B71E3450A016AD5F220429" } ] }}
 mongod(_ZN5mongo15printStackTraceERSo+0x41) [0x7f77584326f1]
 mongod(_ZN5mongo10logContextEPKc+0x147) [0x7f77583cd557]
 mongod(_ZN5mongo17invariantOKFailedEPKcRKNS_6StatusES1_j+0xBC) [0x7f77583b5e4c]
 mongod(_ZN5mongo17WiredTigerSession9getCursorERKSsmb+0x100) [0x7f77581a0760]
 mongod(_ZN5mongo16WiredTigerCursorC2ERKSsmbPNS_16OperationContextE+0x5F) [0x7f775819ef8f]
 mongod(_ZN5mongo21WiredTigerRecordStore6CursorC2EPNS_16OperationContextERKS0_b+0x72) [0x7f775819b062]
 mongod(_ZN5mongo21WiredTigerRecordStoreC1EPNS_16OperationContextENS_10StringDataES3_SsbbllPNS_14CappedCallbackEPNS_20WiredTigerSizeStorerE+0x3A6) [0x7f7758199e56]
 mongod(_ZN5mongo18WiredTigerKVEngine14getRecordStoreEPNS_16OperationContextENS_10StringDataES3_RKNS_17CollectionOptionsE+0x1D3) [0x7f775818b3c3]
 mongod(_ZN5mongo22KVDatabaseCatalogEntry14initCollectionEPNS_16OperationContextERKSsb+0x224) [0x7f77580e7a34]
 mongod(_ZN5mongo15KVStorageEngineC1EPNS_8KVEngineERKNS_22KVStorageEngineOptionsE+0x6E2) [0x7f77580eda42]
 mongod(+0xB62B57) [0x7f7758189b57]
 mongod(_ZN5mongo20ServiceContextMongoD29initializeGlobalStorageEngineEv+0x5A4) [0x7f77580aba44]
 mongod(+0x4659BD) [0x7f7757a8c9bd]
 mongod(_ZN5mongo13initAndListenEi+0x1F) [0x7f7757a8efaf]
 mongod(main+0x16D) [0x7f7757a4ab0d]
 libc.so.6(__libc_start_main+0xF5) [0x7f775368db35]
 mongod(+0x4620F5) [0x7f7757a890f5]
-----  END BACKTRACE  -----
2017-03-08T19:37:09.201+0000 I -        [initandlisten] 
 
***aborting after invariant() failure

I've attached the file mentioned in the error message: collection-174--8777641835294838235.wt. Maybe this helps.

And yes: we're of course looking into a way to properly back up MongoDB running on OpenShift. It's just not implemented yet.

Best regards,
Tobias

Comment by Kelsey Schubert [ 08/Mar/17 ]

Hi david.gubler,

I've attached a repair attempt of the files you've provided. Please extract and replace them in your dbpath. Unfortunately, the repair process is not ready to be shared publicly. We're tracking the work to improve the repair process in SERVER-19815.
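
The extract-and-replace step could be scripted roughly as follows. This is only a sketch under stated assumptions: mongod is stopped, the archive contains only top-level files such as WiredTiger.wt and WiredTiger.turtle (the actual layout of the attachment is not described here), and the function name `apply_repair_archive` and all paths are hypothetical. It copies the existing dbpath aside before overwriting anything.

```python
import shutil
import tarfile
from pathlib import Path


def apply_repair_archive(archive, dbpath, backup_dir):
    """Back up dbpath, then write the archive's regular files into it.

    Assumes mongod is NOT running. Files are written under their base name,
    so any directory prefix inside the archive is ignored (an assumption
    about the archive layout, not a documented fact).
    Returns the sorted names of the files that were written.
    """
    db = Path(dbpath)
    backup = Path(backup_dir)
    if backup.exists():
        raise FileExistsError(f"refusing to overwrite existing backup: {backup}")
    # Keep an untouched copy of the damaged files in case the repair fails.
    shutil.copytree(db, backup)

    written = []
    with tarfile.open(archive, "r:gz") as tar:
        for member in tar.getmembers():
            if not member.isfile():
                continue  # skip directories, links and other special entries
            target = db / Path(member.name).name
            with tar.extractfile(member) as src, open(target, "wb") as dst:
                shutil.copyfileobj(src, dst)
            written.append(target.name)
    return sorted(written)


if __name__ == "__main__":
    # Hypothetical paths; adjust to the affected deployment.
    print(apply_repair_archive("repair_attempt.tar.gz",
                               "/var/lib/mongodb",
                               "/var/lib/mongodb.damaged"))
```

Later comments in this thread report that, once the repaired files were in place, running mongod --repair against the dbpath completed the recovery.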

Finally, please be aware that if a container suffers an unclean shutdown, there is very little MongoDB can do to prevent corruption from occurring, and there is no guarantee that the only files affected would be WiredTiger metadata files. Therefore, I would recommend regular backups and ensuring that you are able to perform initial syncs if needed to recover from these types of failures.
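
As one possible way to act on the backup recommendation, a scheduled job could wrap mongodump. The sketch below only builds and runs the command line; the host, port, backup directory and function names are hypothetical, credentials are omitted (a deployment with authorization enabled would need them), and a logical dump is just one option among several backup strategies.

```python
import subprocess
from datetime import datetime, timezone
from pathlib import Path


def mongodump_command(backup_root, host="localhost", port=27017, now=None):
    """Build a mongodump command that writes a gzip-compressed dump
    into a fresh, timestamped directory under backup_root."""
    stamp = (now or datetime.now(timezone.utc)).strftime("%Y%m%dT%H%M%SZ")
    out_dir = Path(backup_root) / f"dump-{stamp}"
    return [
        "mongodump",
        "--host", host,
        "--port", str(port),
        "--gzip",
        "--out", str(out_dir),
    ]


def run_backup(backup_root, **kwargs):
    """Run mongodump; raises CalledProcessError if the dump fails."""
    subprocess.run(mongodump_command(backup_root, **kwargs), check=True)


if __name__ == "__main__":
    # Hypothetical target directory; call this from cron or a Kubernetes/OpenShift CronJob.
    run_backup("/backups/mongodb")
```

Writing each dump to its own timestamped directory keeps earlier backups intact if a later run fails partway through.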

If the repair attempt is successful, please feel free to upload the files from the second mongod and I'll perform the same steps.

Kind regards,
Thomas
