[SERVER-31690] WiredTiger.wt read checksum error Created: 24/Oct/17  Updated: 16/Nov/17  Resolved: 25/Oct/17

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Question Priority: Major - P3
Reporter: naveen Tyagi Assignee: Mark Agarunov
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: HTML File WiredTiger, File WiredTiger.turtle, File WiredTiger.wt, File WiredTigerLAS.wt, File repair-SERVER-31690.tar.gz
Participants:

 Description   

My server crashed. The admins had not used dump commands to make backups; they copied the database files directly, and I cannot repair the database from those files.
Is there a way to recover the data, even partially?



 Comments   
Comment by Mark Agarunov [ 25/Oct/17 ]

Hello naveen.tyagi,

Thank you for the additional information. Unfortunately, this error indicates that there was corruption outside of MongoDB. In this situation, my best recommendation would be to resync the affected node or restore from a backup if possible.

I do not see anything in the provided information to indicate a bug in the MongoDB server. For MongoDB-related support discussion, please post on the mongodb-user group or Stack Overflow with the mongodb tag. A question like this, which involves more discussion, would be best posted on the mongodb-user group.

Thanks,
Mark

Comment by naveen Tyagi [ 25/Oct/17 ]

Hi Mark,

Have you seen the posted thread/issue? Please let me know if you need more details. Please look into this ASAP.

Thanks,
Naveen

Comment by naveen Tyagi [ 24/Oct/17 ]

Hi Mark,

Please find the replies to your questions below.

  • What kind of underlying storage mechanism are you using? Are the storage devices attached locally or over the network? Are the disks SSDs or HDDs? What kind of RAID and/or volume management system are you using?

Answer : Locally attached HDDs.

  • Has the database always been running this version of MongoDB? If not please describe the upgrade/downgrade cycles the database has been through.

Answer : We have always run the same version (3.2.17).

  • Have you manipulated (copied or moved) the underlying database files? If so, was mongod running?

Answer : Yes, we have copied the data files from the crashed server.

  • Have you ever restored this instance from backups?

Answer : No

  • What method do you use to create backups?

Answer : We copied the data files from the server.
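
Because the backup here was a plain file copy, one useful check is whether the copied files still match the originals byte for byte. The following is a minimal sketch, assuming the original data directory on the crashed server is still readable and that mongod is stopped while the files are hashed; the function names and directory arguments are hypothetical and not part of any MongoDB tool.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large .wt files are not loaded into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_copies(source_dir, copy_dir):
    """Return (relative path, reason) pairs for files missing from the copy or differing from the source."""
    source_dir, copy_dir = Path(source_dir), Path(copy_dir)
    problems = []
    for src in sorted(p for p in source_dir.rglob("*") if p.is_file()):
        rel = src.relative_to(source_dir)
        dst = copy_dir / rel
        if not dst.is_file():
            problems.append((rel.as_posix(), "missing"))
        elif sha256_of(src) != sha256_of(dst):
            problems.append((rel.as_posix(), "differs"))
    return problems
```

An empty result only says that the copy is faithful to the source; it does not say that the source files themselves are intact. If mongod was running during the copy, differences are expected and do not by themselves locate the corruption.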

Comment by naveen Tyagi [ 24/Oct/17 ]

Hi Mark,

Thank you for your quick response. We ran the following command and got the error below after a while.

Command : C:\Program Files\MongoDB\Server\3.2\bin>mongod --dbpath D:/neirbi/home/mongo --repair

Output :

2017-10-24T22:09:32.817+0530 I INDEX    [initandlisten] build index on: neribi_d
b.all_user_events properties: { v: 1, unique: true, key: { parent_id: 1, event_t
ype: 1, user_id: 1, latitude: 1, longitude: 1 }, name: "parent_id_event_type_use
r_id_latitude_longitude", ns: "neribi_db.all_user_events", background: 1 }
2017-10-24T22:09:32.828+0530 I INDEX    [initandlisten]          building index
using bulk method; build may temporarily use up to 50 megabytes of RAM
2017-10-24T22:11:37.057+0530 I -        [initandlisten] Invariant failure rs.get
() src\mongo\db\catalog\database.cpp 190
2017-10-24T22:11:37.062+0530 I -        [initandlisten]
 
***aborting after invariant() failure
 
 
2017-10-24T22:11:49.722+0530 I CONTROL  [initandlisten] mongod.exe    ...\src\mo
ngo\util\stacktrace_windows.cpp(174)                               mongo::printS
tackTrace+0x43
2017-10-24T22:11:49.731+0530 I CONTROL  [initandlisten] mongod.exe    ...\src\mo
ngo\util\signal_handlers_synchronous.cpp(182)                      mongo::`anony
mous namespace'::printSignalAndBacktrace+0x73
2017-10-24T22:11:49.742+0530 I CONTROL  [initandlisten] mongod.exe    ...\src\mo
ngo\util\signal_handlers_synchronous.cpp(238)                      mongo::`anony
mous namespace'::abruptQuit+0x74
2017-10-24T22:11:49.753+0530 I CONTROL  [initandlisten] mongod.exe    f:\dd\vcto
ols\crt\crtw32\misc\winsig.c(587)                                  raise+0x1e9
2017-10-24T22:11:49.760+0530 I CONTROL  [initandlisten] mongod.exe    f:\dd\vcto
ols\crt\crtw32\misc\abort.c(82)                                    abort+0x18
2017-10-24T22:11:49.767+0530 I CONTROL  [initandlisten] mongod.exe    ...\src\mo
ngo\util\assert_util.cpp(154)                                      mongo::invari
antFailed+0x185
2017-10-24T22:11:49.778+0530 I CONTROL  [initandlisten] mongod.exe    ...\src\mo
ngo\db\catalog\database.cpp(190)                                   mongo::Databa
se::_getOrCreateCollectionInstance+0xbf
2017-10-24T22:11:49.789+0530 I CONTROL  [initandlisten] mongod.exe    ...\src\mo
ngo\db\catalog\database.cpp(214)                                   mongo::Databa
se::Database+0x25b
2017-10-24T22:11:49.799+0530 I CONTROL  [initandlisten] mongod.exe    c:\program
 files (x86)\microsoft visual studio 12.0\vc\include\memory(1639)  std::make_uni
que<mongo::Database,mongo::OperationContext * __ptr64 & __ptr64,mongo::StringDat
a const & __ptr64,mongo::DatabaseCatalogEntry * __ptr64 & __ptr64>+0x5e
2017-10-24T22:11:49.814+0530 I CONTROL  [initandlisten] mongod.exe    ...\src\mo
ngo\db\catalog\database_holder.cpp(155)                            mongo::Databa
seHolder::openDb+0x282
2017-10-24T22:11:49.825+0530 I CONTROL  [initandlisten] mongod.exe    ...\src\mo
ngo\db\repair_database.cpp(214)                                    <lambda_59734
447cc71eafee343a46c6b85bdda>::operator()+0x73
2017-10-24T22:11:49.835+0530 I CONTROL  [initandlisten] mongod.exe    ...\src\mo
ngo\util\scopeguard.h(99)                                          mongo::ScopeG
uardImplBase::SafeExecute<mongo::ScopeGuardImpl0<<lambda_59734447cc71eafee343a46
c6b85bdda> > >+0x1b
2017-10-24T22:11:49.849+0530 I CONTROL  [initandlisten] mongod.exe    ...\src\mo
ngo\db\repair_database.cpp(246)                                    mongo::repair
Database+0x5d9
2017-10-24T22:11:49.860+0530 I CONTROL  [initandlisten] mongod.exe    ...\src\mo
ngo\db\db.cpp(409)                                                 mongo::`anony
mous namespace'::repairDatabasesAndCheckVersion+0x2f6
2017-10-24T22:11:49.871+0530 I CONTROL  [initandlisten] mongod.exe    ...\src\mo
ngo\db\db.cpp(684)                                                 mongo::`anony
mous namespace'::_initAndListen+0x108c
2017-10-24T22:11:49.882+0530 I CONTROL  [initandlisten] mongod.exe    ...\src\mo
ngo\db\db.cpp(798)                                                 mongo::`anony
mous namespace'::initAndListen+0x27
2017-10-24T22:11:49.893+0530 I CONTROL  [initandlisten] mongod.exe    ...\src\mo
ngo\db\db.cpp(1033)                                                mongoDbMain+0
x216
2017-10-24T22:11:49.903+0530 I CONTROL  [initandlisten] mongod.exe    ...\src\mo
ngo\db\db.cpp(839)                                                 wmain+0x35
2017-10-24T22:11:49.909+0530 I CONTROL  [initandlisten] mongod.exe    f:\dd\vcto
ols\crt\crtw32\startup\crt0.c(255)                                 __tmainCRTSta
rtup+0x144
2017-10-24T22:11:49.919+0530 I CONTROL  [initandlisten] kernel32.dll
                                                                   BaseThreadIni
tThunk+0xd
2017-10-24T22:11:49.929+0530 F -        [initandlisten] Got signal: 22 (SIGABRT)
.
2017-10-24T22:11:49.937+0530 I CONTROL  [initandlisten] failed to open minidump
file C:\Program Files\MongoDB\Server\3.2017-10-24T16-41-49.mdmp : errno:5 Access
 is denied.
 
C:\Program Files\MongoDB\Server\3.2\bin>

Can you please look into this ASAP? This is a bit urgent.

Thanks

Comment by Mark Agarunov [ 24/Oct/17 ]

Hello naveen.tyagi,

Thank you for the additional information. To clarify, are you seeing this error after replacing the WiredTiger.wt and WiredTiger.turtle files with those provided and running mongod with --repair? If so, please provide the complete logs from mongod so that we can further investigate. Additionally, if this issue persists, please provide the following information:

  1. What kind of underlying storage mechanism are you using? Are the storage devices attached locally or over the network? Are the disks SSDs or HDDs? What kind of RAID and/or volume management system are you using?
  2. Would you please check the integrity of your disks?
  3. Has the database always been running this version of MongoDB? If not please describe the upgrade/downgrade cycles the database has been through.
  4. Have you manipulated (copied or moved) the underlying database files? If so, was mongod running?
  5. Have you ever restored this instance from backups?
  6. What method do you use to create backups?
  7. When was the underlying filesystem last checked and is it currently marked clean?

Thanks,
Mark

Comment by naveen Tyagi [ 24/Oct/17 ]

Hi Mark,
I am getting the response below in cmd.

C:\Program Files\MongoDB\Server\3.2\bin>mongod --dbpath D:/neirbi/home/mongo
2017-10-24T21:37:10.600+0530 I CONTROL [main] Hotfix KB2731284 or later update
is not installed, will zero-out data files
2017-10-24T21:37:10.620+0530 I CONTROL [initandlisten] MongoDB starting : pid=5
508 port=27017 dbpath=D:/neirbi/home/mongo 64-bit host=GIPL059
2017-10-24T21:37:10.621+0530 I CONTROL [initandlisten] targetMinOS: Windows 7/W
indows Server 2008 R2
2017-10-24T21:37:10.623+0530 I CONTROL [initandlisten] db version v3.2.17
2017-10-24T21:37:10.626+0530 I CONTROL [initandlisten] git version: 186656d7957
4f7dfe0831a7e7821292ab380f667
2017-10-24T21:37:10.633+0530 I CONTROL [initandlisten] allocator: tcmalloc
2017-10-24T21:37:10.636+0530 I CONTROL [initandlisten] modules: none
2017-10-24T21:37:10.641+0530 I CONTROL [initandlisten] build environment:
2017-10-24T21:37:10.644+0530 I CONTROL [initandlisten] distmod: 2008plus
2017-10-24T21:37:10.649+0530 I CONTROL [initandlisten] distarch: x86_64
2017-10-24T21:37:10.652+0530 I CONTROL [initandlisten] target_arch: x86_64
2017-10-24T21:37:10.655+0530 I CONTROL [initandlisten] options: { storage: { dbPath: "D:/neirbi/home/mongo" } }
2017-10-24T21:37:10.667+0530 I - [initandlisten] Detected data files in D
:/neirbi/home/mongo created by the 'wiredTiger' storage engine, so setting the a
ctive storage engine to 'wiredTiger'.
2017-10-24T21:37:10.679+0530 I STORAGE [initandlisten] wiredtiger_open config:
create,cache_size=2G,session_max=20000,eviction=(threads_min=4,threads_max=4),co
nfig_base=false,statistics=(fast),log=(enabled=true,archive=true,path=journal,co
mpressor=snappy),file_manager=(close_idle_time=100000),checkpoint=(wait=60,log_s
ize=2GB),statistics_log=(wait=0),
2017-10-24T21:37:47.236+0530 E STORAGE [initandlisten] WiredTiger (-31802) [150
8861267:235568][5508:1996308192], file:collection-44--2006982009494140444.wt, WT
_SESSION.open_cursor: unable to read root page from file:collection-44--20069820
09494140444.wt: WT_ERROR: non-specific WiredTiger error
2017-10-24T21:37:47.247+0530 I - [initandlisten] Invariant failure: ret r
esulted in status UnknownError: -31802: WT_ERROR: non-specific WiredTiger error
at src\mongo\db\storage\wiredtiger\wiredtiger_session_cache.cpp 79
2017-10-24T21:37:48.486+0530 I CONTROL [initandlisten] mongod.exe ...\src\mo
ngo\util\stacktrace_windows.cpp(174) mongo::printStackTr
ace+0x43
2017-10-24T21:37:48.495+0530 I CONTROL [initandlisten] mongod.exe ...\src\mo
ngo\util\log.cpp(136) mongo::logContext+0
xa8
2017-10-24T21:37:48.506+0530 I CONTROL [initandlisten] mongod.exe ...\src\mo
ngo\util\assert_util.cpp(164) mongo::invariantOKF
ailed+0x14c
2017-10-24T21:37:48.517+0530 I CONTROL [initandlisten] mongod.exe ...\src\mo
ngo\db\storage\wiredtiger\wiredtiger_session_cache.cpp(79) mongo::WiredTigerSe
ssion::getCursor+0xdb
2017-10-24T21:37:48.529+0530 I CONTROL [initandlisten] mongod.exe ...\src\mo
ngo\db\storage\wiredtiger\wiredtiger_recovery_unit.cpp(269) mongo::WiredTigerCu
rsor::WiredTigerCursor+0x9a
2017-10-24T21:37:48.540+0530 I CONTROL [initandlisten] mongod.exe ...\src\mo
ngo\db\storage\wiredtiger\wiredtiger_record_store.cpp(441) mongo::WiredTigerRe
cordStore::Cursor::Cursor+0x9b
2017-10-24T21:37:48.550+0530 I CONTROL [initandlisten] mongod.exe ...\src\mo
ngo\db\storage\wiredtiger\wiredtiger_record_store.cpp(807) mongo::WiredTigerRe
cordStore::WiredTigerRecordStore+0x455
2017-10-24T21:37:48.561+0530 I CONTROL [initandlisten] mongod.exe ...\src\mo
ngo\db\storage\wiredtiger\wiredtiger_kv_engine.cpp(476) mongo::WiredTigerKV
Engine::getRecordStore+0x258
2017-10-24T21:37:48.572+0530 I CONTROL [initandlisten] mongod.exe ...\src\mo
ngo\db\storage\kv\kv_database_catalog_entry.cpp(270) mongo::KVDatabaseCa
talogEntry::initCollection+0x1df
2017-10-24T21:37:48.583+0530 I CONTROL [initandlisten] mongod.exe ...\src\mo
ngo\db\storage\kv\kv_storage_engine.cpp(122) mongo::KVStorageEng
ine::KVStorageEngine+0x901
2017-10-24T21:37:48.594+0530 I CONTROL [initandlisten] mongod.exe ...\src\mo
ngo\db\storage\wiredtiger\wiredtiger_init.cpp(96) mongo::`anonymous n
amespace'::WiredTigerFactory::create+0x281
2017-10-24T21:37:48.605+0530 I CONTROL [initandlisten] mongod.exe ...\src\mo
ngo\db\service_context_d.cpp(148) mongo::ServiceConte
xtMongoD::initializeGlobalStorageEngine+0x88a
2017-10-24T21:37:48.615+0530 I CONTROL [initandlisten] mongod.exe ...\src\mo
ngo\db\db.cpp(603) mongo::`anonymous n
amespace'::_initAndListen+0x5f3
2017-10-24T21:37:48.627+0530 I CONTROL [initandlisten] mongod.exe ...\src\mo
ngo\db\db.cpp(798) mongo::`anonymous n
amespace'::initAndListen+0x27
2017-10-24T21:37:48.638+0530 I CONTROL [initandlisten] mongod.exe ...\src\mo
ngo\db\db.cpp(1033) mongoDbMain+0x216
2017-10-24T21:37:48.645+0530 I CONTROL [initandlisten] mongod.exe ...\src\mo
ngo\db\db.cpp(839) wmain+0x35
2017-10-24T21:37:48.652+0530 I CONTROL [initandlisten] mongod.exe f:\dd\vcto
ols\crt\crtw32\startup\crt0.c(255) __tmainCRTStartup+0
x144
2017-10-24T21:37:48.662+0530 I CONTROL [initandlisten] kernel32.dll
BaseThreadInitThunk
+0xd
2017-10-24T21:37:48.673+0530 I CONTROL [initandlisten]
2017-10-24T21:37:48.677+0530 I - [initandlisten]

***aborting after invariant() failure
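
The console output above is hard-wrapped, so file names and messages are split across lines. A minimal sketch of how the records could be rejoined and the affected WiredTiger file pulled out, assuming every record starts with a timestamp in the format shown in the log; the function names are hypothetical helpers, not part of MongoDB.

```python
import re

# Assumption: each log record begins with a timestamp such as
# 2017-10-24T21:37:47.236+0530 followed by a space.
RECORD_START = re.compile(r"^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3}[+-]\d{4} ")


def rejoin_records(lines):
    """Merge hard-wrapped continuation lines back into one string per record."""
    records = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line.strip():
            continue  # skip blank lines between records
        if RECORD_START.match(line) or not records:
            records.append(line)
        else:
            # Anything else is treated as a continuation of the previous record.
            # The console broke lines mid-token, so join without a separator.
            records[-1] += line
    return records


def wiredtiger_error_files(records):
    """Return the distinct .wt files named in error-severity STORAGE records."""
    files = []
    for record in records:
        if " E STORAGE " in record:
            files.extend(re.findall(r"file:([\w.-]+\.wt)", record))
    return list(dict.fromkeys(files))  # de-duplicate, keep first-seen order
```

Joining without a separator restores tokens that were cut in the middle, but it also drops a space where the wrap happened to fall on a word boundary, so the rejoined text is best used for searching rather than quoted verbatim.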

Comment by naveen Tyagi [ 24/Oct/17 ]

Hi Mark,

Please look into the problem.

Regards,
Naveen

Comment by Mark Agarunov [ 24/Oct/17 ]

Hello naveen.tyagi,

Thank you for the report. I've attached a repair attempt of the files you've provided. Would you please extract these files and replace them in your $dbpath and let us know if it resolves the issue? If you are still seeing errors after replacing these files, please provide the complete logs from mongod so that we can further investigate. Additionally, if this issue persists, please provide the following information:

  1. What kind of underlying storage mechanism are you using? Are the storage devices attached locally or over the network? Are the disks SSDs or HDDs? What kind of RAID and/or volume management system are you using?
  2. Would you please check the integrity of your disks?
  3. Has the database always been running this version of MongoDB? If not please describe the upgrade/downgrade cycles the database has been through.
  4. Have you manipulated (copied or moved) the underlying database files? If so, was mongod running?
  5. Have you ever restored this instance from backups?
  6. What method do you use to create backups?
  7. When was the underlying filesystem last checked and is it currently marked clean?

Thanks,
Mark

Comment by naveen Tyagi [ 24/Oct/17 ]

Can you please attend to this ASAP? It is a bit urgent for me.
