[SERVER-63906] mongodb repair data error Created: 23/Feb/22  Updated: 04/May/22  Resolved: 04/May/22

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: li liang Assignee: Edwin Zhou
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Operating System: ALL
Participants:

 Description   

Because of an out-of-memory (OOM) condition, the mongod process was killed.

After restarting, mongod fails with an error. My config is below:

port=27017
bind_ip_all=true
fork=true
dbpath=/data/mongodb/data/
#dbpath=/opt/mongodb/data/
logpath=/data/mongodb/logs/mongodb.log
pidfilepath=/data/mongodb/log/mongodb.pid
#auth=true
directoryperdb=true
logappend=true
replSet=rs0
maxConns=5000
journal=true
slowms=100
profile=1
storageEngine=wiredTiger
wiredTigerDirectoryForIndexes=true
wiredTigerIndexPrefixCompression=true
wiredTigerCollectionBlockCompressor=zlib
wiredTigerJournalCompressor=zlib
oplogSize=10000

The error after restart is:

2022-02-23T09:54:00.141+0800 I CONTROL  [main] ***** SERVER RESTARTED *****
2022-02-23T09:54:00.141+0800 I NETWORK  [main]  --maxConns too high, can only handle 819
2022-02-23T09:54:00.169+0800 I CONTROL  [initandlisten] MongoDB starting : pid=12006 port=27017 dbpath=/data/mongodb/data/ 64-bit host=bigdata1
2022-02-23T09:54:00.169+0800 I CONTROL  [initandlisten] db version v3.6.15
2022-02-23T09:54:00.169+0800 I CONTROL  [initandlisten] git version: 18934fb5c814e87895c5e38ae1515dd6cb4c00f7
2022-02-23T09:54:00.169+0800 I CONTROL  [initandlisten] OpenSSL version: OpenSSL 1.0.1e-fips 11 Feb 2013
2022-02-23T09:54:00.169+0800 I CONTROL  [initandlisten] allocator: tcmalloc
2022-02-23T09:54:00.169+0800 I CONTROL  [initandlisten] modules: none
2022-02-23T09:54:00.169+0800 I CONTROL  [initandlisten] build environment:
2022-02-23T09:54:00.169+0800 I CONTROL  [initandlisten]     distmod: rhel70
2022-02-23T09:54:00.169+0800 I CONTROL  [initandlisten]     distarch: x86_64
2022-02-23T09:54:00.169+0800 I CONTROL  [initandlisten]     target_arch: x86_64
2022-02-23T09:54:00.169+0800 I CONTROL  [initandlisten] options: { config: "/data/mongodb-linux-x86_64-rhel70-3.6.15/conf/mongodb.conf", net: { bindIpAll: true, maxIncomingConnections: 5000, port: 27017 }, operationProfiling: { mode: "slowOp", slowOpThresholdMs: 100 }, processManagement: { fork: true, pidFilePath: "/data/mongodb/log/mongodb.pid" }, replication: { oplogSizeMB: 10000, replSet: "rs0" }, storage: { dbPath: "/data/mongodb/data/", directoryPerDB: true, engine: "wiredTiger", journal: { enabled: true }, wiredTiger: { collectionConfig: { blockCompressor: "zlib" }, engineConfig: { directoryForIndexes: true, journalCompressor: "zlib" }, indexConfig: { prefixCompression: true } } }, systemLog: { destination: "file", logAppend: true, path: "/data/mongodb/logs/mongodb.log" } }
2022-02-23T09:54:00.169+0800 I NETWORK  [initandlisten]  --maxConns too high, can only handle 819
2022-02-23T09:54:00.169+0800 I STORAGE  [initandlisten] wiredtiger_open config: create,cache_size=63891M,cache_overflow=(file_max=0M),session_max=20000,eviction=(threads_min=4,threads_max=4),config_base=false,statistics=(fast),compatibility=(release="3.0",require_max="3.0"),log=(enabled=true,archive=true,path=journal,compressor=zlib),file_manager=(close_idle_time=100000),statistics_log=(wait=0),verbose=(recovery_progress),
2022-02-23T09:54:00.773+0800 E STORAGE  [initandlisten] WiredTiger error (-31802) [1645581240:773574][12006:0x7f8065a41b80], file:WiredTiger.wt, connection: __wt_btree_tree_open, 604: unable to read root page from file:WiredTiger.wt: WT_ERROR: non-specific WiredTiger error Raw: [1645581240:773574][12006:0x7f8065a41b80], file:WiredTiger.wt, connection: __wt_btree_tree_open, 604: unable to read root page from file:WiredTiger.wt: WT_ERROR: non-specific WiredTiger error
2022-02-23T09:54:00.773+0800 E STORAGE  [initandlisten] An unsupported journal format detected - If you are trying to rollback from version 4.0 to 3.6, please re-start a 4.0 binary and cleanly shut it down so that the journal format will be downgraded.
2022-02-23T09:54:00.773+0800 E STORAGE  [initandlisten] WiredTiger error (0) [1645581240:773648][12006:0x7f8065a41b80], file:WiredTiger.wt, connection: __wt_btree_tree_open, 611: WiredTiger has failed to open its metadata Raw: [1645581240:773648][12006:0x7f8065a41b80], file:WiredTiger.wt, connection: __wt_btree_tree_open, 611: WiredTiger has failed to open its metadata
2022-02-23T09:54:00.773+0800 E STORAGE  [initandlisten] An unsupported journal format detected - If you are trying to rollback from version 4.0 to 3.6, please re-start a 4.0 binary and cleanly shut it down so that the journal format will be downgraded.
2022-02-23T09:54:00.773+0800 E STORAGE  [initandlisten] WiredTiger error (0) [1645581240:773660][12006:0x7f8065a41b80], file:WiredTiger.wt, connection: __wt_btree_tree_open, 614: This may be due to the database files being encrypted, being from an older version or due to corruption on disk Raw: [1645581240:773660][12006:0x7f8065a41b80], file:WiredTiger.wt, connection: __wt_btree_tree_open, 614: This may be due to the database files being encrypted, being from an older version or due to corruption on disk
2022-02-23T09:54:00.773+0800 E STORAGE  [initandlisten] An unsupported journal format detected - If you are trying to rollback from version 4.0 to 3.6, please re-start a 4.0 binary and cleanly shut it down so that the journal format will be downgraded.
2022-02-23T09:54:00.773+0800 E STORAGE  [initandlisten] WiredTiger error (0) [1645581240:773648][12006:0x7f8065a41b80], file:WiredTiger.wt, connection: __wt_btree_tree_open, 611: WiredTiger has failed to open its metadata Raw: [1645581240:773648][12006:0x7f8065a41b80], file:WiredTiger.wt, connection: __wt_btree_tree_open, 611: WiredTiger has failed to open its metadata
2022-02-23T09:54:00.773+0800 E STORAGE  [initandlisten] An unsupported journal format detected - If you are trying to rollback from version 4.0 to 3.6, please re-start a 4.0 binary and cleanly shut it down so that the journal format will be downgraded.
2022-02-23T09:54:00.773+0800 E STORAGE  [initandlisten] WiredTiger error (0) [1645581240:773660][12006:0x7f8065a41b80], file:WiredTiger.wt, connection: __wt_btree_tree_open, 614: This may be due to the database files being encrypted, being from an older version or due to corruption on disk Raw: [1645581240:773660][12006:0x7f8065a41b80], file:WiredTiger.wt, connection: __wt_btree_tree_open, 614: This may be due to the database files being encrypted, being from an older version or due to corruption on disk
2022-02-23T09:54:00.773+0800 E STORAGE  [initandlisten] An unsupported journal format detected - If you are trying to rollback from version 4.0 to 3.6, please re-start a 4.0 binary and cleanly shut it down so that the journal format will be downgraded.
2022-02-23T09:54:00.773+0800 E STORAGE  [initandlisten] WiredTiger error (0) [1645581240:773681][12006:0x7f8065a41b80], file:WiredTiger.wt, connection: __wt_btree_tree_open, 617: You should confirm that you have opened the database with the correct options including all encryption and compression options Raw: [1645581240:773681][12006:0x7f8065a41b80], file:WiredTiger.wt, connection: __wt_btree_tree_open, 617: You should confirm that you have opened the database with the correct options including all encryption and compression options
2022-02-23T09:54:00.773+0800 E STORAGE  [initandlisten] An unsupported journal format detected - If you are trying to rollback from version 4.0 to 3.6, please re-start a 4.0 binary and cleanly shut it down so that the journal format will be downgraded.
2022-02-23T09:54:00.774+0800 F STORAGE  [initandlisten] WiredTiger metadata corruption detected
2022-02-23T09:54:00.774+0800 F STORAGE  [initandlisten] This version of MongoDB is unable to repair this kind of corruption, but version 4.0.3+ may be able to repair it. See http://dochub.mongodb.org/core/repair for more information.
2022-02-23T09:54:00.774+0800 F -        [initandlisten] Fatal Assertion 50944 at src/mongo/db/storage/wiredtiger/wiredtiger_util.cpp 71
2022-02-23T09:54:00.774+0800 F -        [initandlisten]

I used mongod v4.0.28 to run the repair.

Some collections appear to have been repaired successfully, but the repair stopped with the error below:

2022-02-23T19:40:50.986+0800 I STORAGE  [initandlisten] WiredTiger progress WT_SESSION.salvage 2700
2022-02-23T19:40:50.986+0800 I STORAGE  [initandlisten] WiredTiger progress WT_SESSION.salvage 2700
2022-02-23T19:40:50.996+0800 I STORAGE  [initandlisten] WiredTiger progress WT_SESSION.salvage 2800
2022-02-23T19:40:51.005+0800 I STORAGE  [initandlisten] WiredTiger progress WT_SESSION.salvage 2900
2022-02-23T19:40:51.018+0800 I INDEX    [initandlisten] build index on: wmdtc.latest_data_yesterday properties: { v: 2, key: { _id: 1 }, name: "_id_", ns: "wmdtc.latest_data_yesterday" }
2022-02-23T19:40:51.018+0800 I INDEX    [initandlisten]   building index using bulk method; build may temporarily use up to 500 megabytes of RAM
2022-02-23T19:40:51.735+0800 F STORAGE  [initandlisten] Failed to repair database 'wmdtc': E11000 duplicate key error collection: wmdtc.latest_data_yesterday index: _id_ dup key: { : "00000000000000000" }
2022-02-23T19:40:51.750+0800 F STORAGE  [initandlisten] Record store did not exist. Collection: wmdtc.report_360 UUID: 63cd9d20-91c6-45a2-b66e-4eaf5630a1e3
2022-02-23T19:40:51.750+0800 F -        [initandlisten] Fatal Assertion 50936 at src/mongo/db/catalog/database_impl.cpp 228
2022-02-23T19:40:51.750+0800 F -        [initandlisten]

***aborting after fassert() failure

2022-02-23T21:26:55.270+0800 I CONTROL  [main] ***** SERVER RESTARTED *****
2022-02-23T21:26:55.271+0800 I NETWORK  [main]  --maxConns too high, can only handle 819
2022-02-23T21:26:55.296+0800 I CONTROL  [initandlisten] MongoDB starting : pid=15938 port=27017 dbpath=/data/mongodb/data/ 64-bit host=bigdata1
2022-02-23T21:26:55.296+0800 I CONTROL  [initandlisten] db version v3.6.15
2022-02-23T21:26:55.296+0800 I CONTROL  [initandlisten] git version: 18934fb5c814e87895c5e38ae1515dd6cb4c00f7
2022-02-23T21:26:55.296+0800 I CONTROL  [initandlisten] OpenSSL version: OpenSSL 1.0.1e-fips 11 Feb 2013
2022-02-23T21:26:55.296+0800 I CONTROL  [initandlisten] allocator: tcmalloc
2022-02-23T21:26:55.296+0800 I CONTROL  [initandlisten] modules: none
2022-02-23T21:26:55.296+0800 I CONTROL  [initandlisten] build environment:
2022-02-23T21:26:55.296+0800 I CONTROL  [initandlisten]     distmod: rhel70
2022-02-23T21:26:55.296+0800 I CONTROL  [initandlisten]     distarch: x86_64
2022-02-23T21:26:55.296+0800 I CONTROL  [initandlisten]     target_arch: x86_64
2022-02-23T21:26:55.296+0800 I CONTROL  [initandlisten] options: { config: "/data/mongodb-linux-x86_64-rhel70-3.6.15/conf/mongodb.conf", net: { bindIpAll: true, maxIncomingConnections: 5000, port: 27017 }, operationProfiling: { mode: "slowOp", slowOpThresholdMs: 100 }, processManagement: { fork: true, pidFilePath: "/data/mongodb/log/mongodb.pid" }, replication: { oplogSizeMB: 10000, replSet: "rs0" }, storage: { dbPath: "/data/mongodb/data/", directoryPerDB: true, engine: "wiredTiger", journal: { enabled: true }, wiredTiger: { collectionConfig: { blockCompressor: "zlib" }, engineConfig: { directoryForIndexes: true, journalCompressor: "zlib" }, indexConfig: { prefixCompression: true } } }, systemLog: { destination: "file", logAppend: true, path: "/data/mongodb/logs/mongodb.log" } }
2022-02-23T21:26:55.296+0800 I NETWORK  [initandlisten]  --maxConns too high, can only handle 819
2022-02-23T21:26:55.297+0800 I STORAGE  [initandlisten] wiredtiger_open config: create,cache_size=63891M,cache_overflow=(file_max=0M),session_max=20000,eviction=(threads_min=4,threads_max=4),config_base=false,statistics=(fast),compatibility=(release="3.0",require_max="3.0"),log=(enabled=true,archive=true,path=journal,compressor=zlib),file_manager=(close_idle_time=100000),statistics_log=(wait=0),verbose=(recovery_progress),
2022-02-23T21:26:55.853+0800 E STORAGE  [initandlisten] WiredTiger error (-31802) [1645622815:853836][15938:0x7fee6f698b80], connection: __log_open_verify, 1028: Version incompatibility detected: unsupported WiredTiger file version: this build requires a maximum version of 2, and the file is version 3: WT_ERROR: non-specific WiredTiger error Raw: [1645622815:853836][15938:0x7fee6f698b80], connection: __log_open_verify, 1028: Version incompatibility detected: unsupported WiredTiger file version: this build requires a maximum version of 2, and the file is version 3: WT_ERROR: non-specific WiredTiger error
2022-02-23T21:26:55.853+0800 E STORAGE  [initandlisten] An unsupported journal format detected - If you are trying to rollback from version 4.0 to 3.6, please re-start a 4.0 binary and cleanly shut it down so that the journal format will be downgraded.
2022-02-23T21:26:55.855+0800 E -        [initandlisten] Assertion: 28595:-31802: WT_ERROR: non-specific WiredTiger error src/mongo/db/storage/wiredtiger/wiredtiger_kv_engine.cpp 488
2022-02-23T21:26:55.857+0800 I STORAGE  [initandlisten] exception in initAndListen: Location28595: -31802: WT_ERROR: non-specific WiredTiger error, terminating
2022-02-23T21:26:55.857+0800 F -        [initandlisten] Invariant failure globalStorageEngine src/mongo/db/service_context.cpp 96
2022-02-23T21:26:55.857+0800 F -        [initandlisten]
***aborting after invariant() failure

 



 Comments   
Comment by Edwin Zhou [ 04/May/22 ]

Hi liliang@inaservice.com.cn,

We haven’t heard back from you for some time, so I’m going to close this ticket. If this is still an issue for you, please provide additional information and we will reopen the ticket.

Best,
Edwin

Comment by Edwin Zhou [ 17/Mar/22 ]

Hi liliang@inaservice.com.cn,

We still need additional information to diagnose the problem. If this is still an issue for you, would you please perform the steps I described above, and provide the log files in the event that --repair fails?

Best,
Edwin

Comment by Edwin Zhou [ 28/Feb/22 ]

Hi liliang@inaservice.com.cn,

Thank you for your report. Prior to your first run of --repair, were you able to make a complete copy of the database's $dbpath directory to safeguard so that you can work off of the current $dbpath?
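For reference, a minimal sketch of taking such a safeguard copy before any further repair attempt. It assumes mongod is stopped and that the backup volume has enough free space; the function name `backup_dbpath` and the backup location are hypothetical, not part of any MongoDB tooling.

```python
import shutil
from pathlib import Path


def backup_dbpath(dbpath: str, backup_root: str, label: str) -> Path:
    """Copy the entire dbpath directory to backup_root/label and return the copy's path.

    Assumption: mongod is not running. Copying data files while the
    server is writing to them can produce an inconsistent copy.
    """
    src = Path(dbpath)
    dest = Path(backup_root) / label
    if dest.exists():
        # Never overwrite an earlier backup; it may be the only good copy.
        raise FileExistsError(f"refusing to overwrite existing backup: {dest}")
    # symlinks=True preserves symbolic links as links instead of following them.
    shutil.copytree(src, dest, symlinks=True)
    return dest
```

A call such as `backup_dbpath("/data/mongodb/data/", "/backup", "pre-repair")` would then leave the original files untouched for later repair attempts; `/backup` is a placeholder path.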

The ideal resolution is to perform a clean resync from an unaffected node.

It appears that mongod --repair failed during an index build after coming across a duplicate key error. Can you please attempt mongod --repair using the latest version of MongoDB? Please note that SERVER-39562 resolves duplicate key errors occurring during index builds when running --repair and was introduced in MongoDB v4.4.2.
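As a rough illustration of the repair invocation, the sketch below only assembles the command line and does not run it. It assumes the repair should be started with the same storage layout options as the reporter's config (directoryperdb and wiredTigerDirectoryForIndexes); the helper name `build_repair_command` is hypothetical, and the binary path would need to point at the newer mongod.

```python
from typing import List


def build_repair_command(
    mongod_binary: str,
    dbpath: str,
    directory_per_db: bool = False,
    directory_for_indexes: bool = False,
) -> List[str]:
    """Assemble an argv list for a mongod --repair run against dbpath."""
    cmd = [mongod_binary, "--repair", "--dbpath", dbpath]
    if directory_per_db:
        # Must match how the data files were originally laid out on disk.
        cmd.append("--directoryperdb")
    if directory_for_indexes:
        cmd.append("--wiredTigerDirectoryForIndexes")
    return cmd
```

The resulting list can be passed to `subprocess.run` once a copy of the dbpath has been made; running the repair against a copy rather than the only remaining data files is the safer choice.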

In the event that a --repair operation is unsuccessful, then please also provide:

  • The logs leading up to the first occurrence of any issue
  • The logs of the repair operation.
  • The logs of any attempt to start mongod after the repair operation completed.
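To help locate where an issue first appears in a large log file, here is a small sketch that picks out error (E) and fatal (F) lines. It assumes the plain-text log layout shown in this ticket (timestamp, severity letter, component, bracketed context, message); the names `LOG_LINE` and `error_lines` are hypothetical. It is meant for finding the relevant window, not as a substitute for attaching the full logs.

```python
import re
from typing import Iterable, List

# Layout assumed from the log excerpts in this ticket:
# <timestamp> <severity> <component> [<context>] <message>
LOG_LINE = re.compile(
    r"^(?P<ts>\S+)\s+(?P<severity>[DIWEF])\s+(?P<component>\S+)\s+"
    r"\[(?P<context>[^\]]+)\]\s?(?P<message>.*)$"
)


def error_lines(lines: Iterable[str]) -> List[str]:
    """Return the log lines whose severity is E (error) or F (fatal)."""
    selected = []
    for line in lines:
        text = line.rstrip("\n")
        match = LOG_LINE.match(text)
        if match and match.group("severity") in ("E", "F"):
            selected.append(text)
    return selected
```

For example, `error_lines(open("/data/mongodb/logs/mongodb.log"))` would list the E and F entries in order, and the first one marks roughly where the logs "leading up to the first occurrence" should end.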

Best,
Edwin
