[SERVER-49318] MongoDB repair failed due to Invariant failure rs.get() src/mongo/db/catalog/database.cpp Created: 05/Jul/20  Updated: 02/Aug/20  Resolved: 02/Aug/20

Status: Closed
Project: Core Server
Component/s: Index Maintenance, WiredTiger
Affects Version/s: 3.4.24
Fix Version/s: None

Type: Question Priority: Major - P3
Reporter: Sirpa Vivek Assignee: Dmitry Agranat
Resolution: Incomplete Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Participants:

 Description   

MongoDB Version: 3.4.24
MongoDB, hosted on a Linux server, was abruptly shut down due to memory over-utilization.
Repair was initiated with: sudo mongod -f /etc/mongodrepair.conf --repair
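As a general precaution before running --repair (a sketch, not from the ticket; the backup path is an assumption and the dbpath is taken from the logs below), it is safer to take a cold copy of the dbpath first, since repair rewrites the data files in place:

```shell
# Assumed paths; adjust to your deployment.
DBPATH=/home/db324
BACKUP=/backup/db324-pre-repair-$(date +%Y%m%d)

# 1. Stop mongod cleanly first, then take a cold copy of the dbpath:
#    cp -a "$DBPATH" "$BACKUP"
# 2. Only then run the repair:
#    sudo mongod -f /etc/mongodrepair.conf --repair
echo "backup target: $BACKUP"
```

With a cold copy in hand, a failed or destructive repair can always be retried from the untouched files.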

Repair Logs

 
 2020-07-04T17:17:07.441+0000 I INDEX    [initandlisten] building index using bulk method; build may temporarily use up to 50 megabytes of RAM
 2020-07-04T17:17:07.448+0000 I INDEX    [initandlisten] build index on: ZionsBank.summary properties: { v: 1, key: { totalVolume: -1 }, name: "totalVolume_-1", ns: "ZionsBank.summary", background: true }
 2020-07-04T17:17:07.448+0000 I INDEX    [initandlisten] building index using bulk method; build may temporarily use up to 50 megabytes of RAM
 2020-07-04T17:17:07.456+0000 I INDEX    [initandlisten] build index on: ZionsBank.summary properties: { v: 1, key: { ts: -1 }, name: "ts_-1", ns: "ZionsBank.summary", background: true }
 2020-07-04T17:17:07.456+0000 I INDEX    [initandlisten] building index using bulk method; build may temporarily use up to 50 megabytes of RAM
 2020-07-04T17:17:08.673+0000 I -        [initandlisten] Invariant failure rs.get() src/mongo/db/catalog/database.cpp 195
 2020-07-04T17:17:08.673+0000 I -        [initandlisten] ***aborting after invariant() failure
 2020-07-04T17:17:08.717+0000 F -        [initandlisten] Got signal: 6 (Aborted).

Restart Logs

 2020-07-04T17:39:14.476+0000 I CONTROL  [main] ***** SERVER RESTARTED *****
 2020-07-04T17:39:14.480+0000 I CONTROL  [initandlisten] MongoDB starting : pid=20485 port=27017 dbpath=/home/db324 64-bit host=ip-*--**-*
 2020-07-04T17:39:14.480+0000 I CONTROL  [initandlisten] db version v3.4.24
 2020-07-04T17:39:14.480+0000 I CONTROL  [initandlisten] allocator: tcmalloc
 2020-07-04T17:39:14.480+0000 I CONTROL  [initandlisten] modules: none
 2020-07-04T17:39:14.480+0000 I CONTROL  [initandlisten] build environment:
 2020-07-04T17:39:14.480+0000 I CONTROL  [initandlisten]     distarch: x86_64
 2020-07-04T17:39:14.480+0000 I CONTROL  [initandlisten]     target_arch: x86_64
 2020-07-04T17:39:14.480+0000 I CONTROL  [initandlisten] options: { config: "/etc/mongod.conf", net: { bindIp: "-.-.-.-", port: 27017 }, replication: { oplogSizeMB: 10240, replSetName: "rs1" }, storage: { dbPath: "/home/db324", directoryPerDB: true, engine: "wiredTiger", journal: { enabled: true }, wiredTiger: { engineConfig: { cacheSizeGB: 108.0 } } }, systemLog: { destination: "file", logAppend: true, path: "/var/log/mongodb/mongod.log" } }
 2020-07-04T17:39:14.480+0000 W -        [initandlisten] Detected unclean shutdown - /home/db324/mongod.lock is not empty.
 2020-07-04T17:39:14.499+0000 W STORAGE  [initandlisten] Recovering data from the last clean checkpoint.
 2020-07-04T17:39:14.499+0000 I STORAGE  [initandlisten]
 2020-07-04T17:39:14.499+0000 I STORAGE  [initandlisten] ** WARNING: The configured WiredTiger cache size is more than 80% of available RAM.
 2020-07-04T17:39:14.499+0000 I STORAGE  [initandlisten] wiredtiger_open config: create,cache_size=110592M,session_max=20000,eviction=(threads_min=4,threads_max=4),config_base=false,statistics=(fast),log=(enabled=true,archive=true,path=journal,compressor=snappy),file_manager=(close_idle_time=100000),checkpoint=(wait=60,log_size=2GB),statistics_log=(wait=0),verbose=(recovery_progress),
 2020-07-04T17:39:14.667+0000 I STORAGE  [initandlisten] WiredTiger message [1593884354:667272][20485:0x7fdbcb287580], txn-recover: Main recovery loop: starting at 73368/128
 2020-07-04T17:39:14.667+0000 I STORAGE  [initandlisten] WiredTiger message [1593884354:667951][20485:0x7fdbcb287580], txn-recover: Recovering log 73368 through 73369
 2020-07-04T17:39:14.733+0000 I STORAGE  [initandlisten] WiredTiger message [1593884354:733044][20485:0x7fdbcb287580], txn-recover: Recovering log 73369 through 73369
 2020-07-04T17:39:15.164+0000 E STORAGE  [initandlisten] WiredTiger error (-31802) [1593884355:164908][20485:0x7fdbcb287580], file:ZionsBank/collection-56-3854974571131417844.wt, WT_SESSION.open_cursor: /home/db324/ZionsBank/collection-56-3854974571131417844.wt: handle-read: pread: failed to read 4096 bytes at offset 28672: WT_ERROR: non-specific WiredTiger error
 2020-07-04T17:39:15.164+0000 I -        [initandlisten] Invariant failure: ret resulted in status UnknownError: -31802: WT_ERROR: non-specific WiredTiger error at src/mongo/db/storage/wiredtiger/wiredtiger_session_cache.cpp 113
 2020-07-04T17:39:15.165+0000 I -        [initandlisten] ***aborting after invariant() failure


Is there a way to repair/recover the last part of the database? Or
is there a way to ignore the broken database? Or
is it possible to carve out the full 2.4TB of data, excluding the database that triggers the error, and create a new MongoDB instance from it?
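For the third option, one possible approach (a hypothetical sketch; the port, scratch paths, and the listDatabases one-liner are assumptions, and it only works if the remaining databases are still readable) is to start a throwaway mongod on a copy of the data files and dump everything except the damaged database:

```shell
# Assumed names/paths; the original dbpath and the broken database are taken
# from the logs in this ticket.
SRC=/home/db324          # original dbpath
WORK=/salvage/db324      # scratch copy to experiment on
BROKEN_DB=ZionsBank      # database holding the unreadable collection file

# 1. Work on a copy, never the original files:
#    cp -a "$SRC" "$WORK"
# 2. Start a temporary standalone mongod on the copy:
#    mongod --dbpath "$WORK" --port 27018 --fork --logpath "$WORK/salvage.log"
# 3. Dump every database except the broken one:
#    for db in $(mongo --quiet --port 27018 --eval \
#        'db.adminCommand({listDatabases:1}).databases.forEach(function(d){print(d.name)})'); do
#      [ "$db" = "$BROKEN_DB" ] && continue
#      mongodump --port 27018 --db "$db" --out /salvage/dump
#    done
# 4. mongorestore /salvage/dump into a fresh instance.
echo "skipping: $BROKEN_DB"
```

Whether step 3 succeeds depends on the extent of the corruption; if the standalone mongod aborts while touching the broken files, the dump has to be restricted further.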


I would greatly appreciate any help.
Thanks in advance.




 Comments   
Comment by Dmitry Agranat [ 02/Aug/20 ]

Hi sirpa.vivek@gmail.com,

We haven’t heard back from you for some time, so I’m going to mark this ticket as resolved. If this is still an issue for you, please provide additional information and we will reopen the ticket.

Regards,
Dima

Comment by Dmitry Agranat [ 26/Jul/20 ]

Hi sirpa.vivek@gmail.com,

We still need additional information to diagnose the problem. If this is still an issue for you, would you please upload:

  • Full log from the original failure and full log from the restart attempt. I've created a secure portal for you.
  • The $dbpath/diagnostic.data directory (the contents are described here)
  • Copies of the wiredTiger.wt and wiredTiger.turtle files and we can attempt a metadata-only repair effort using internal tools.

Thanks,
Dima

Comment by Dmitry Agranat [ 16/Jul/20 ]

Hi sirpa.vivek@gmail.com,

As MongoDB 3.4 has reached EOL, we can try to recover the data as a one-time exception. As of today, the latest MongoDB version is 4.2.8.

Is this node a part of the replica set?

Please attach the following information:

  • Full log from the original failure and full log from the restart attempt. I've created a secure portal for you.
  • The $dbpath/diagnostic.data directory (the contents are described here)
  • Copies of the wiredTiger.wt and wiredTiger.turtle files and we can attempt a metadata-only repair effort using internal tools.

Keep in mind that this repair effort may not be successful, and that diagnosing corruption issues requires significant information and effort.
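The requested artifacts could be packaged along these lines (a sketch with assumed output paths; the dbpath and log path are taken from the ticket's logs, and the WiredTiger metadata filenames are the standard `WiredTiger.wt`/`WiredTiger.turtle`):

```shell
# Assumed staging location for the upload; adjust as needed.
DBPATH=/home/db324
OUT=/tmp/server-49318-upload
mkdir -p "$OUT"

# Collect the items requested above (run with mongod stopped):
# cp -a "$DBPATH/diagnostic.data" "$OUT/"
# cp "$DBPATH/WiredTiger.wt" "$DBPATH/WiredTiger.turtle" "$OUT/"
# cp /var/log/mongodb/mongod.log "$OUT/"

# Package everything for the secure portal:
tar -czf "$OUT.tgz" -C "$(dirname "$OUT")" "$(basename "$OUT")"
```

Copying the files cold (with the server stopped) avoids uploading metadata that is mid-checkpoint.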

Thanks,
Dima

Generated at Thu Feb 08 05:19:31 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.