[SERVER-29958] waiting for daemon up ... Created: 03/Jul/17  Updated: 16/Aug/18  Resolved: 06/Jul/17

Status: Closed
Project: Core Server
Component/s: WiredTiger
Affects Version/s: 3.2.10
Fix Version/s: None

Type: Question Priority: Major - P3
Reporter: ludovic lachevre Assignee: Mark Agarunov
Resolution: Done Votes: 0
Labels: envc, envm, openshift, rge, rps, trcf, wtc
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

test


Attachments: File WiredTiger.turtle     File WiredTiger.turtle     File WiredTiger.wt     File WiredTiger.wt     File fh-mbaas-template-3node.json     Text File packages_update.txt     File repair-SERVER-29958.tar.gz    
Participants:

 Description   

Hi,
We run MongoDB on an OpenShift platform, as a replica set of 3 MongoDB members.
After a yum update this week, mongodb-1 no longer starts; it stays stuck at "waiting for daemon up ...".
In the pod logs (we are on OpenShift, so the MongoDB installation runs in a pod), we see this:

[root@ocpmsdev01 ~]# oc logs mongodb-1-19-afimd
=> Waiting for container IP address ... 10.241.17.65:27017
=> Waiting for local MongoDB to accept connections ...
=>  Waiting for MongoDB daemon up
note: noprealloc may hurt performance in many applications
2017-07-03T14:01:48.905+0000 I CONTROL  [initandlisten] MongoDB starting : pid=23 port=27017 dbpath=/var/lib/mongodb/data 64-bit host=mongodb-1-19-afimd
2017-07-03T14:01:48.905+0000 I CONTROL  [initandlisten] db version v3.2.10
2017-07-03T14:01:48.905+0000 I CONTROL  [initandlisten] git version: 79d9b3ab5ce20f51c272b4411202710a082d0317
2017-07-03T14:01:48.905+0000 I CONTROL  [initandlisten] OpenSSL version: OpenSSL 1.0.1e-fips 11 Feb 2013
2017-07-03T14:01:48.905+0000 I CONTROL  [initandlisten] allocator: tcmalloc
2017-07-03T14:01:48.905+0000 I CONTROL  [initandlisten] modules: none
2017-07-03T14:01:48.905+0000 I CONTROL  [initandlisten] build environment:
2017-07-03T14:01:48.905+0000 I CONTROL  [initandlisten]     distarch: x86_64
2017-07-03T14:01:48.905+0000 I CONTROL  [initandlisten]     target_arch: x86_64
2017-07-03T14:01:48.905+0000 I CONTROL  [initandlisten] options: { config: "/etc/mongod.conf", net: { http: { enabled: false }, port: 27017 }, processManagement: { pidFilePath: "/var/lib/mongodb/mongodb.pid" }, replication: { oplogSizeMB: 64, replSet: "rs0" }, security: { keyFile: "/var/lib/mongodb/keyfile" }, storage: { dbPath: "/var/lib/mongodb/data", mmapv1: { preallocDataFiles: false, smallFiles: true } }, systemLog: { quiet: true } }
2017-07-03T14:01:48.942+0000 I -        [initandlisten] Detected data files in /var/lib/mongodb/data created by the 'wiredTiger' storage engine, so setting the active storage engine to 'wiredTiger'.
2017-07-03T14:01:48.943+0000 W -        [initandlisten] Detected unclean shutdown - /var/lib/mongodb/data/mongod.lock is not empty.
2017-07-03T14:01:48.945+0000 W STORAGE  [initandlisten] Recovering data from the last clean checkpoint.
2017-07-03T14:01:48.946+0000 I STORAGE  [initandlisten] wiredtiger_open config: create,cache_size=8G,session_max=20000,eviction=(threads_max=4),config_base=false,statistics=(fast),log=(enabled=true,archive=true,path=journal,compressor=snappy),file_manager=(close_idle_time=100000),checkpoint=(wait=60,log_size=2GB),statistics_log=(wait=0),
2017-07-03T14:01:48.979+0000 E STORAGE  [initandlisten] WiredTiger (0) [1499090508:979610][23:0x7f59b7c61e00], file:WiredTiger.wt, connection: read checksum error for 4096B block at offset 40960: block header checksum of 3375883998 doesn't match expected checksum of 882010232
2017-07-03T14:01:48.979+0000 E STORAGE  [initandlisten] WiredTiger (0) [1499090508:979669][23:0x7f59b7c61e00], file:WiredTiger.wt, connection: WiredTiger.wt: encountered an illegal file format or internal value
2017-07-03T14:01:48.979+0000 E STORAGE  [initandlisten] WiredTiger (-31804) [1499090508:979680][23:0x7f59b7c61e00], file:WiredTiger.wt, connection: the process must exit and restart: WT_PANIC: WiredTiger library panic
2017-07-03T14:01:48.979+0000 I -        [initandlisten] Fatal Assertion 28558
2017-07-03T14:01:48.979+0000 I -        [initandlisten]
 
***aborting after fassert() failure
 
 
2017-07-03T14:01:48.991+0000 F -        [initandlisten] Got signal: 6 (Aborted).
 
 0x7f59b8a886f1 0x7f59b8a877a9 0x7f59b8a87ff1 0x7f59b4092370 0x7f59b3cf71d7 0x7f59b3cf88c8 0x7f59b8a0b041 0x7f59b87fc6a3 0x7f59b809fcac 0x7f59b809fda6 0x7f59b809fffc 0x7f59b8b21bbc 0x7f59b8b22103 0x7f59b8b1e859 0x7f59b8b23429 0x7f59b8b40ce4 0x7f59b8b78038 0x7f59b8bfecc3 0x7f59b8bff281 0x7f59b8bff48b 0x7f59b8b86d79 0x7f59b8bfb3e5 0x7f59b8bc0c2e 0x7f59b8bc0d66 0x7f59b8b7458d 0x7f59b87e38d0 0x7f59b87dfac2 0x7f59b8701a44 0x7f59b80e29bd 0x7f59b80e4faf 0x7f59b80a0b0d 0x7f59b3ce3b35 0x7f59b80df0f5
----- BEGIN BACKTRACE -----
{"backtrace":[{"b":"7F59B7C7D000","o":"E0B6F1","s":"_ZN5mongo15printStackTraceERSo"},{"b":"7F59B7C7D000","o":"E0A7A9"},{"b":"7F59B7C7D000","o":"E0AFF1"},{"b":"7F59B4083000","o":"F370"},{"b":"7F59B3CC2000","o":"351D7","s":"gsignal"},{"b":"7F59B3CC2000","o":"368C8","s":"abort"},{"b":"7F59B7C7D000","o":"D8E041","s":"_ZN5mongo13fassertFailedEi"},{"b":"7F59B7C7D000","o":"B7F6A3"},{"b":"7F59B7C7D000","o":"422CAC","s":"__wt_eventv"},{"b":"7F59B7C7D000","o":"422DA6","s":"__wt_err"},{"b":"7F59B7C7D000","o":"422FFC","s":"__wt_panic"},{"b":"7F59B7C7D000","o":"EA4BBC","s":"__wt_block_extlist_read"},{"b":"7F59B7C7D000","o":"EA5103","s":"__wt_block_extlist_read_avail"},{"b":"7F59B7C7D000","o":"EA1859","s":"__wt_block_checkpoint_load"},{"b":"7F59B7C7D000","o":"EA6429"},{"b":"7F59B7C7D000","o":"EC3CE4","s":"__wt_btree_open"},{"b":"7F59B7C7D000","o":"EFB038","s":"__wt_conn_btree_open"},{"b":"7F59B7C7D000","o":"F81CC3","s":"__wt_session_get_btree"},{"b":"7F59B7C7D000","o":"F82281","s":"__wt_session_get_btree"},{"b":"7F59B7C7D000","o":"F8248B","s":"__wt_session_get_btree_ckpt"},{"b":"7F59B7C7D000","o":"F09D79","s":"__wt_curfile_open"},{"b":"7F59B7C7D000","o":"F7E3E5"},{"b":"7F59B7C7D000","o":"F43C2E","s":"__wt_metadata_cursor_open"},{"b":"7F59B7C7D000","o":"F43D66","s":"__wt_metadata_cursor"},{"b":"7F59B7C7D000","o":"EF758D","s":"wiredtiger_open"},{"b":"7F59B7C7D000","o":"B668D0","s":"_ZN5mongo18WiredTigerKVEngineC1ERKSsS2_S2_mbbb"},{"b":"7F59B7C7D000","o":"B62AC2"},{"b":"7F59B7C7D000","o":"A84A44","s":"_ZN5mongo20ServiceContextMongoD29initializeGlobalStorageEngineEv"},{"b":"7F59B7C7D000","o":"4659BD"},{"b":"7F59B7C7D000","o":"467FAF","s":"_ZN5mongo13initAndListenEi"},{"b":"7F59B7C7D000","o":"423B0D","s":"main"},{"b":"7F59B3CC2000","o":"21B35","s":"__libc_start_main"},{"b":"7F59B7C7D000","o":"4620F5"}],"processInfo":{ "mongodbVersion" : "3.2.10", "gitVersion" : "79d9b3ab5ce20f51c272b4411202710a082d0317", "compiledModules" : [], "uname" : { "sysname" : "Linux", "release" : "3.10.0-514.26.1.el7.x86_64", "version" : "#1 SMP Tue Jun 20 01:16:02 EDT 2017", "machine" : "x86_64" }, "somap" : [ { "b" : "7F59B7C7D000", "elfType" : 3, "buildId" : "B05156DF3AE4F3767D7F6DAD9BE5CA9A0F65C197" }, { "b" : "7FFD540F4000", "elfType" : 3, "buildId" : "19B7BE7AA38FF247C187E714559F972D7273EC37" }, { "b" : "7F59B7855000", "path" : "/lib64/libsnappy.so.1", "elfType" : 3, "buildId" : "51F03FA02A93040DDE9A516E7A4D3C8DBDF1514D" }, { "b" : "7F59B70AB000", "path" : "/opt/rh/rh-mongodb32/root/usr/lib64/libmozjs-38.so.rh-mongodb32", "elfType" : 3, "buildId" : "2276DD95FC4383A813E8BF836EB4BC57A1C78A82" }, { "b" : "7F59B6E58000", "path" : "/opt/rh/rh-mongodb32/root/usr/lib64/libstemmer.so.rh-mongodb32-0", "elfType" : 3, "buildId" : "6E2DE2E1C9C15DCE17ED2A5C8A2A6C4296F611BD" }, { "b" : "7F59B6BE5000", "path" : "/lib64/libtcmalloc.so.4", "elfType" : 3, "buildId" : "362EE669D57B9A6976CA68E7654C68DAD3D2EECC" }, { "b" : "7F59B69CF000", "path" : "/lib64/libz.so.1", "elfType" : 3, "buildId" : "FE621E91052A9A77CC263E00A8A21C2BC0867E21" }, { "b" : "7F59B6748000", "path" : "/opt/rh/rh-mongodb32/root/usr/lib64/libyaml-cpp.so.rh-mongodb32-0.5", "elfType" : 3, "buildId" : "2EE01087B01C80B7AEA90E7447561B6A27C659F0" }, { "b" : "7F59B64D5000", "path" : "/opt/rh/rh-mongodb32/root/usr/lib64/libboost_program_options.so.rh-mongodb32-1.58.0", "elfType" : 3, "buildId" : "F5BDBD7FA6990A70E48A2A1EF39B090BC1FB5BE6" }, { "b" : "7F59B62BE000", "path" : "/opt/rh/rh-mongodb32/root/usr/lib64/libboost_filesystem.so.rh-mongodb32-1.58.0", "elfType" : 3, "buildId" : 
"D2AE744AD4BCE89A8AB60EA2FA88AD70AD548D29" }, { "b" : "7F59B609D000", "path" : "/opt/rh/rh-mongodb32/root/usr/lib64/libboost_thread.so.rh-mongodb32-1.58.0", "elfType" : 3, "buildId" : "D8B2E214B63585D52BF005A4F6D3B3CD594EB55D" }, { "b" : "7F59B5E99000", "path" : "/opt/rh/rh-mongodb32/root/usr/lib64/libboost_system.so.rh-mongodb32-1.58.0", "elfType" : 3, "buildId" : "B65608C8DD0E49F86AEC219A15447E2142E415E8" }, { "b" : "7F59B5C91000", "path" : "/opt/rh/rh-mongodb32/root/usr/lib64/libboost_chrono.so.rh-mongodb32-1.58.0", "elfType" : 3, "buildId" : "3FB1E28A06A997CA93FC793B20591A82053CF3F1" }, { "b" : "7F59B598D000", "path" : "/opt/rh/rh-mongodb32/root/usr/lib64/libboost_regex.so.rh-mongodb32-1.58.0", "elfType" : 3, "buildId" : "1EAAB2EA98367A9911330B5C3DB8CBE145C07843" }, { "b" : "7F59B572C000", "path" : "/lib64/libpcre.so.1", "elfType" : 3, "buildId" : "1DEC80B82143A7960489C7B7AA8DDF182D6E2BE6" }, { "b" : "7F59B5523000", "path" : "/lib64/libpcrecpp.so.0", "elfType" : 3, "buildId" : "18354C1F9C320CD44B5CBAF039E4A53A9556AC21" }, { "b" : "7F59B52B5000", "path" : "/lib64/libssl.so.10", "elfType" : 3, "buildId" : "8B4A33094EA982F927F4D5F84059EB073A203DB5" }, { "b" : "7F59B4ECB000", "path" : "/lib64/libcrypto.so.10", "elfType" : 3, "buildId" : "7455CBD6F62579DA1598F1DC123F039F25466C90" }, { "b" : "7F59B4CC3000", "path" : "/lib64/librt.so.1", "elfType" : 3, "buildId" : "2AC501524AB01C3A36053233524A9B7BDF06D2E3" }, { "b" : "7F59B4ABF000", "path" : "/lib64/libdl.so.2", "elfType" : 3, "buildId" : "9939B83E89981591ACBD6F85AE2020349A169C52" }, { "b" : "7F59B47B7000", "path" : "/lib64/libstdc++.so.6", "elfType" : 3, "buildId" : "348CA28355FB67351EA0CC37170BB83FA008CFEC" }, { "b" : "7F59B44B5000", "path" : "/lib64/libm.so.6", "elfType" : 3, "buildId" : "8EAEED1A5C217B2F9F66C3ADEB53B1BCD526F65A" }, { "b" : "7F59B429F000", "path" : "/lib64/libgcc_s.so.1", "elfType" : 3, "buildId" : "E7A44AE9AAA39B04F12503BB3B170860F0EB38E2" }, { "b" : "7F59B4083000", "path" : "/lib64/libpthread.so.0", "elfType" : 3, "buildId" : "540DA7E8674CC3C696324B7D080703E9F71CFC9D" }, { "b" : "7F59B3CC2000", "path" : "/lib64/libc.so.6", "elfType" : 3, "buildId" : "C7DEA743FD3DA749E7453BEAB1F26D50A1A5FCAD" }, { "b" : "7F59B7A5B000", "path" : "/lib64/ld-linux-x86-64.so.2", "elfType" : 3, "buildId" : "8F3E366E2DB73C330A3791DEAE31AE9579099B44" }, { "b" : "7F59B3ABE000", "path" : "/lib64/libplds4.so", "elfType" : 3, "buildId" : "A96CEBCC70105674C41728A32B8269A280AA0E21" }, { "b" : "7F59B38B9000", "path" : "/lib64/libplc4.so", "elfType" : 3, "buildId" : "860A92A74E0D73E1D2AC5A6874C06E7F8330DC78" }, { "b" : "7F59B367B000", "path" : "/lib64/libnspr4.so", "elfType" : 3, "buildId" : "4B6AD8CF2301C517766622BAA66DA2B5D85C0B05" }, { "b" : "7F59B3461000", "path" : "/lib64/libunwind.so.8", "elfType" : 3, "buildId" : "C59A5D890BE040B8B9498840A300721181990BA3" }, { "b" : "7F59B1E8D000", "path" : "/lib64/libicudata.so.50", "elfType" : 3, "buildId" : "27EA9496693BFB45C9C23DEE015ED4063FD020A1" }, { "b" : "7F59B1A8F000", "path" : "/lib64/libicui18n.so.50", "elfType" : 3, "buildId" : "B171FF3E21A20ACE392E3C48AC50BCABE9B8849A" }, { "b" : "7F59B1716000", "path" : "/lib64/libicuuc.so.50", "elfType" : 3, "buildId" : "4499237C28D849E1FA22C3D1622900746E1F2AC8" }, { "b" : "7F59B14C8000", "path" : "/lib64/libgssapi_krb5.so.2", "elfType" : 3, "buildId" : "98F8B08BB984B5B3366F928C26585489625B829D" }, { "b" : "7F59B11E1000", "path" : "/lib64/libkrb5.so.3", "elfType" : 3, "buildId" : "2DC7DC7094E4E26951D7B63C1F433EFE5EA06006" }, { "b" : "7F59B0FDD000", "path" : 
"/lib64/libcom_err.so.2", "elfType" : 3, "buildId" : "8E6B94EAC98D4D32CA753B11E1C2CD9CE3DF3886" }, { "b" : "7F59B0DAB000", "path" : "/lib64/libk5crypto.so.3", "elfType" : 3, "buildId" : "424C66AD190AA9C519971E461C7BC3ABFBF17E51" }, { "b" : "7F59B0B9C000", "path" : "/lib64/libkrb5support.so.0", "elfType" : 3, "buildId" : "3D433D76A47E97AE253C57C20E9F86266F9595A1" }, { "b" : "7F59B0998000", "path" : "/lib64/libkeyutils.so.1", "elfType" : 3, "buildId" : "8CA73C16CFEB9A8B5660015B9223B09F87041CAD" }, { "b" : "7F59B077E000", "path" : "/lib64/libresolv.so.2", "elfType" : 3, "buildId" : "59754FDA02AEF392C878E5E714127F5E2C68A891" }, { "b" : "7F59B0557000", "path" : "/lib64/libselinux.so.1", "elfType" : 3, "buildId" : "4D7CA6EFC2D57A25B1B71E3450A016AD5F220429" } ] }}
 mongod(_ZN5mongo15printStackTraceERSo+0x41) [0x7f59b8a886f1]
 mongod(+0xE0A7A9) [0x7f59b8a877a9]
 mongod(+0xE0AFF1) [0x7f59b8a87ff1]
 libpthread.so.0(+0xF370) [0x7f59b4092370]
 libc.so.6(gsignal+0x37) [0x7f59b3cf71d7]
 libc.so.6(abort+0x148) [0x7f59b3cf88c8]
 mongod(_ZN5mongo13fassertFailedEi+0x91) [0x7f59b8a0b041]
 mongod(+0xB7F6A3) [0x7f59b87fc6a3]
 mongod(__wt_eventv+0x48B) [0x7f59b809fcac]
 mongod(__wt_err+0x9F) [0x7f59b809fda6]
 mongod(__wt_panic+0x24) [0x7f59b809fffc]
 mongod(__wt_block_extlist_read+0x8C) [0x7f59b8b21bbc]
 mongod(__wt_block_extlist_read_avail+0x33) [0x7f59b8b22103]
 mongod(__wt_block_checkpoint_load+0x379) [0x7f59b8b1e859]
 mongod(+0xEA6429) [0x7f59b8b23429]
 mongod(__wt_btree_open+0xA54) [0x7f59b8b40ce4]
 mongod(__wt_conn_btree_open+0x158) [0x7f59b8b78038]
 mongod(__wt_session_get_btree+0xF3) [0x7f59b8bfecc3]
 mongod(__wt_session_get_btree+0x6B1) [0x7f59b8bff281]
 mongod(__wt_session_get_btree_ckpt+0xBB) [0x7f59b8bff48b]
 mongod(__wt_curfile_open+0x99) [0x7f59b8b86d79]
 mongod(+0xF7E3E5) [0x7f59b8bfb3e5]
 mongod(__wt_metadata_cursor_open+0x6E) [0x7f59b8bc0c2e]
 mongod(__wt_metadata_cursor+0x96) [0x7f59b8bc0d66]
 mongod(wiredtiger_open+0x15ED) [0x7f59b8b7458d]
 mongod(_ZN5mongo18WiredTigerKVEngineC1ERKSsS2_S2_mbbb+0x790) [0x7f59b87e38d0]
 mongod(+0xB62AC2) [0x7f59b87dfac2]
 mongod(_ZN5mongo20ServiceContextMongoD29initializeGlobalStorageEngineEv+0x5A4) [0x7f59b8701a44]
 mongod(+0x4659BD) [0x7f59b80e29bd]
 mongod(_ZN5mongo13initAndListenEi+0x1F) [0x7f59b80e4faf]
 mongod(main+0x16D) [0x7f59b80a0b0d]
 libc.so.6(__libc_start_main+0xF5) [0x7f59b3ce3b35]
 mongod(+0x4620F5) [0x7f59b80df0f5]
-----  END BACKTRACE  -----
/usr/bin/run-mongod: line 119:    23 Aborted                 (core dumped) mongod $mongo_common_args --replSet "${MONGODB_REPLICA_NAME}" --keyFile "${MONGODB_KEYFILE_PATH}"
=>  Waiting for MongoDB daemon up
    (the line above repeats roughly 60 more times)
=> Giving up: MongoDB daemon is not up!

Does anyone have an idea?

Thanks a lot,

Ludo



 Comments   
Comment by ludovic lachevre [ 07/Jul/17 ]

Hi Mark,

Thanks a lot for your advice.
In fact, our 3 MongoDB pods each have a dedicated persistent volume on one GlusterFS server, and that GlusterFS server is replicated across 3 GlusterFS nodes that are backed up by VMware.

Thanks for everything

Ludo

Comment by Mark Agarunov [ 06/Jul/17 ]

Hello llachevre,

Thanks for your response. I'm glad to hear that this fixed the issue and everything is working again. To prevent this type of problem in the future, we recommend implementing regular backups and/or replication to mitigate any issues related to unreliable storage layers or server failures.
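
As a rough illustration of that recommendation, one possible approach is a scheduled mongodump of the replica set; the sketch below is only an example, and the host names, credentials, and backup path are hypothetical:

# Dump the whole replica set to a dated directory (hypothetical hosts and paths).
# --oplog captures a consistent point-in-time snapshot while writes continue.
mongodump \
  --host "rs0/mongodb-1:27017,mongodb-2:27017,mongodb-3:27017" \
  --username admin --password "$MONGODB_ADMIN_PASSWORD" \
  --authenticationDatabase admin \
  --oplog \
  --out "/backup/mongodb/$(date +%F)"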

Thanks,
Mark

Comment by ludovic lachevre [ 04/Jul/17 ]

Hi Mark,

Thanks a lot for your help.
It works fine now; you fixed my issue.

I will try to answer your 7 questions.

In fact we are on an OpenShift platform on which we installed RHMAP.
The MongoDB replica set (3 pods) is used by the RHMAP MBaaS.
We installed the RHMAP MBaaS with the attached JSON file.

1.
The storage used for MongoDB is GlusterFS
==> 3 Linux RHEL 7.3 servers, to respect the quorum.
The mongodb-1 pod is located on a Linux RHEL 7.3 server.
The mongodb-2 pod is located on a Linux RHEL 7.3 server.
The mongodb-3 pod is located on a Linux RHEL 7.3 server.

So the storage is not local but over the network (OpenShift).

2.
It is difficult to test disk integrity.
The Linux RHEL servers are VMware machines.
Volumes are managed by GlusterFS, and in OpenShift MongoDB uses PVCs (claims) bound to PVs (50 GB for each MongoDB).

pvc-fed5f527-4ace-11e7-8729-0050569a54b5 50Gi RWO Delete Bound mbaas-rhmap/mongodb-claim-1 27d
pvc-feda5b8e-4ace-11e7-8729-0050569a54b5 50Gi RWO Delete Bound mbaas-rhmap/mongodb-claim-2 27d
pvc-fede1f35-4ace-11e7-8729-0050569a54b5 50Gi RWO Delete Bound mbaas-rhmap/mongodb-claim-3 27d

mongodb-claim-1 Bound pvc-fed5f527-4ace-11e7-8729-0050569a54b5 50Gi RWO 27d
mongodb-claim-2 Bound pvc-feda5b8e-4ace-11e7-8729-0050569a54b5 50Gi RWO 27d
mongodb-claim-3 Bound pvc-fede1f35-4ace-11e7-8729-0050569a54b5 50Gi RWO 27d
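
(Listings like the above can be reproduced with the OpenShift client; a small sketch, assuming the mbaas-rhmap project shown in the claim names:)

# List the persistent volumes and the claims bound in the mbaas-rhmap project
oc get pv
oc get pvc -n mbaas-rhmap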

3.
As far as I know, the version has been the same since the install:

db version v3.2.10
git version: 79d9b3ab5ce20f51c272b4411202710a082d0317
OpenSSL version: OpenSSL 1.0.1e-fips 11 Feb 2013
allocator: tcmalloc
modules: none
build environment:
distarch: x86_64
target_arch: x86_64

4.
No, never.
During the night of 2 July, a yum update was run by Spacewalk (the equivalent of Red Hat Satellite).
There was no server restart.
Note that all the OpenShift platform servers were updated together (within roughly 1 hour of each other).

5.
no

6.
We use Veeam for VMware to back up the servers.
There is no backup of the MongoDB databases themselves.
We are just getting started with OpenShift and we have 2 platforms: DEV and PRD.
We have just finished installing PRD; DEV has been ready for 1 month, but no applications are running on it yet.
The 2 platforms are only just installed for the moment.

7.
As far as I can tell, everything was OK before the yum update.

Please find attached a list of the packages updated on 2 July.

Thanks a lot

Ludo

Comment by Mark Agarunov [ 03/Jul/17 ]

Hello llachevre,

Thank you for the report. I've attached a repair attempt of the files you've provided. Would you please extract these files and replace them in your $dbpath and let us know if it resolves the issue? If you are still seeing errors after replacing these files, please provide the complete logs from mongod so that we can further investigate. Additionally, if this issue persists, please provide the following information:

  1. What kind of underlying storage mechanism are you using? Are the storage devices attached locally or over the network? Are the disks SSDs or HDDs? What kind of RAID and/or volume management system are you using?
  2. Would you please check the integrity of your disks?
  3. Has the database always been running this version of MongoDB? If not please describe the upgrade/downgrade cycles the database has been through.
  4. Have you manipulated (copied or moved) the underlying database files? If so, was mongod running?
  5. Have you ever restored this instance from backups?
  6. What method do you use to create backups?
  7. When was the underlying filesystem last checked and is it currently marked clean?

Thanks,
Mark
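
For illustration, a minimal sketch of the extract-and-replace step described above, assuming the attached repair-SERVER-29958.tar.gz contains WiredTiger.wt and WiredTiger.turtle at its top level and using the dbpath from the log (/var/lib/mongodb/data); the scratch directories are hypothetical:

# With mongod stopped, back up the current files before overwriting them.
mkdir -p /tmp/wt-backup /tmp/repair
cp /var/lib/mongodb/data/WiredTiger.wt /var/lib/mongodb/data/WiredTiger.turtle /tmp/wt-backup/

# Extract the repaired files and copy them into the dbpath.
tar -xzf repair-SERVER-29958.tar.gz -C /tmp/repair
cp /tmp/repair/WiredTiger.wt /tmp/repair/WiredTiger.turtle /var/lib/mongodb/data/

# Then restart mongod (here, restarting the OpenShift pod relaunches run-mongod).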

Comment by ludovic lachevre [ 03/Jul/17 ]

The files you requested.

thanks

ludo

Comment by ludovic lachevre [ 03/Jul/17 ]

OK, no problem.
In fact I think it was a MongoDB bug, because this is the second time that MongoDB has refused to start after a yum update.
I don't really know where my problem is (Docker? MongoDB? OpenShift?); it is a little complex to determine.

In my case, MongoDB pods 2 and 3 are OK but pod 1 is not. Why? I don't know.

Thanks for your help.

Please find attached the files you requested.

thanks a lot

ludo

Comment by Kelsey Schubert [ 03/Jul/17 ]

Hi llachevre,

If you can provide the files, we can attempt a repair. However, please note that the SERVER project is for reporting bugs or feature suggestions for the MongoDB server, and this corruption was likely the result of an issue outside of MongoDB.

For MongoDB-related support discussion, please post on the mongodb-user group or Stack Overflow with the mongodb tag. A question like this, involving more discussion, would be best posted on the mongodb-user group.

Kind regards,
Thomas

Comment by ludovic lachevre [ 03/Jul/17 ]

Hi Thomas,

Thanks a lot for your quick answer.
My problem is that the pod restarts so many times that I have to do this operation very quickly.

Another problem: how can I recover files from a pod onto my Linux server (outside the pod)?
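
(One way to copy files out of a running pod is oc rsync; a sketch using the pod name and dbpath from the log above, with an arbitrary local target directory:)

# Sync the pod's dbpath to a local directory on the host running the oc client
mkdir -p ./mongodb-1-data
oc rsync mongodb-1-19-afimd:/var/lib/mongodb/data/ ./mongodb-1-data/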

thanks a lot

ludo

Comment by Kelsey Schubert [ 03/Jul/17 ]

Hi llachevre,

It appears that the WiredTiger.wt file has suffered some form of disk corruption. Would you please upload the WiredTiger.wt and WiredTiger.turtle files so we can attempt a repair? If the files cannot be repaired, I would suggest resyncing from another node or, if that is not possible, restoring from a backup.
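
For reference, resyncing a member generally means stopping it, emptying its dbpath, and letting it perform an initial sync from the other members when it restarts; a rough sketch inside the pod, assuming the dbpath from the log:

# With mongod stopped on this member only:
# 1. Move the existing (corrupt) data aside so the member starts empty.
mv /var/lib/mongodb/data /var/lib/mongodb/data.corrupt
mkdir -p /var/lib/mongodb/data

# 2. Restart mongod with the same --replSet rs0 options; it will rejoin the
#    replica set and perform an initial sync of all data from a healthy member.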

Kind regards,
Thomas
