[SERVER-21986] Failed to repair mongo after system crash Created: 22/Dec/15  Updated: 10/Feb/16  Resolved: 10/Feb/16

Status: Closed
Project: Core Server
Component/s: WiredTiger
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Timur Mamezhanov [X] Assignee: Kelsey Schubert
Resolution: Incomplete Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: File WiredTiger.turtle     File WiredTiger.turtle.new     File WiredTiger.wt     File WiredTiger.wt.new     File sizeStorer.wt    
Issue Links:
Duplicate
duplicates SERVER-18316 Database with WT engine fails to reco... Closed
Operating System: ALL
Participants:

 Description   

The system was crushed, i could only return to virtual snapchot and get DB files

but when i insert it and start mong with --repair key , i got read checksum error

c:\Program Files\MongoDB\bin>.\mongod --dbpath "c:\Program Files\MongoDB\Data" --repair --storageEngine wiredTiger
2015-12-22T14:50:54.029+0400 I STORAGE  [initandlisten] wiredtiger_open config: create,cache_size=3G,session_max=20000,e
viction=(threads_max=4),statistics=(fast),checkpoint=(wait=60,log_size=2GB),statistics_log=(wait=0),
2015-12-22T14:50:54.036+0400 E STORAGE  [initandlisten] WiredTiger (0) [1450781454:36877][8020:140713109689168], file:Wi
redTiger.wt, connection: read checksum error [24576B @ 421888, 996688094 != 1475298810]
2015-12-22T14:50:54.040+0400 E STORAGE  [initandlisten] WiredTiger (0) [1450781454:39879][8020:140713109689168], file:Wi
redTiger.wt, connection: WiredTiger.wt: encountered an illegal file format or internal value
2015-12-22T14:50:54.041+0400 E STORAGE  [initandlisten] WiredTiger (-31804) [1450781454:41865][8020:140713109689168], fi
le:WiredTiger.wt, connection: the process must exit and restart: WT_PANIC: WiredTiger library panic
2015-12-22T14:50:54.042+0400 I -        [initandlisten] Fatal Assertion 28558
2015-12-22T14:50:55.215+0400 I CONTROL  [initandlisten] mongod.exe    ...\src\mongo\util\stacktrace_win.cpp(175)
                 mongo::printStackTrace+0x43
2015-12-22T14:50:55.217+0400 I CONTROL  [initandlisten] mongod.exe    ...\src\mongo\util\log.cpp(135)
                 mongo::logContext+0x97
2015-12-22T14:50:55.219+0400 I CONTROL  [initandlisten] mongod.exe    ...\src\mongo\util\assert_util.cpp(165)
                 mongo::fassertFailed+0x80
2015-12-22T14:50:55.221+0400 I CONTROL  [initandlisten] mongod.exe    ...\src\mongo\db\storage\wiredtiger\wiredtiger_uti
l.cpp(297)       mongo::`anonymous namespace'::mdb_handle_error+0xb5
2015-12-22T14:50:55.223+0400 I CONTROL  [initandlisten] mongod.exe    ...\src\third_party\wiredtiger\src\support\err.c(2
91)              __wt_eventv+0x3fc
2015-12-22T14:50:55.225+0400 I CONTROL  [initandlisten] mongod.exe    ...\src\third_party\wiredtiger\src\support\err.c(3
16)              __wt_err+0x32
2015-12-22T14:50:55.226+0400 I CONTROL  [initandlisten] mongod.exe    ...\src\third_party\wiredtiger\src\support\err.c(5
14)              __wt_illegal_value+0x5e
2015-12-22T14:50:55.228+0400 I CONTROL  [initandlisten] mongod.exe    ...\src\third_party\wiredtiger\src\block\block_rea
d.c(214)         __wt_block_read_off+0x177
2015-12-22T14:50:55.230+0400 I CONTROL  [initandlisten] mongod.exe    ...\src\third_party\wiredtiger\src\block\block_rea
d.c(47)          __wt_bm_preload+0xe3
2015-12-22T14:50:55.232+0400 I CONTROL  [initandlisten] mongod.exe    ...\src\third_party\wiredtiger\src\btree\bt_handle
.c(540)          __btree_preload+0x9f
2015-12-22T14:50:55.233+0400 I CONTROL  [initandlisten] mongod.exe    ...\src\third_party\wiredtiger\src\btree\bt_handle
.c(110)          __wt_btree_open+0x22f
2015-12-22T14:50:55.235+0400 I CONTROL  [initandlisten] mongod.exe    ...\src\third_party\wiredtiger\src\conn\conn_dhand
le.c(454)        __conn_btree_open+0x7b
2015-12-22T14:50:55.237+0400 I CONTROL  [initandlisten] mongod.exe    ...\src\third_party\wiredtiger\src\conn\conn_dhand
le.c(506)        __wt_conn_btree_get+0xcf
2015-12-22T14:50:55.239+0400 I CONTROL  [initandlisten] mongod.exe    ...\src\third_party\wiredtiger\src\session\session
_dhandle.c(408)  __wt_session_get_btree+0x3e2
2015-12-22T14:50:55.240+0400 I CONTROL  [initandlisten] mongod.exe    ...\src\third_party\wiredtiger\src\meta\meta_table
.c(43)           __wt_metadata_open+0x35
2015-12-22T14:50:55.242+0400 I CONTROL  [initandlisten] mongod.exe    ...\src\third_party\wiredtiger\src\conn\conn_api.c
(1775)           wiredtiger_open+0x6ba
2015-12-22T14:50:55.244+0400 I CONTROL  [initandlisten] mongod.exe    ...\src\mongo\db\storage\wiredtiger\wiredtiger_kv_
engine.cpp(117)  mongo::WiredTigerKVEngine::WiredTigerKVEngine+0x4e3
2015-12-22T14:50:55.246+0400 I CONTROL  [initandlisten] mongod.exe    ...\src\mongo\db\storage\wiredtiger\wiredtiger_ini
t.cpp(66)        mongo::`anonymous namespace'::WiredTigerFactory::create+0x9f
2015-12-22T14:50:55.248+0400 I CONTROL  [initandlisten] mongod.exe    ...\src\mongo\db\global_environment_d.cpp(107)
                 mongo::GlobalEnvironmentMongoD::setGlobalStorageEngine+0x40f
2015-12-22T14:50:55.249+0400 I CONTROL  [initandlisten] mongod.exe    ...\src\mongo\db\db.cpp(489)
                 mongo::_initAndListen+0x488
2015-12-22T14:50:55.251+0400 I CONTROL  [initandlisten] mongod.exe    ...\src\mongo\db\db.cpp(651)
                 mongo::initAndListen+0x27
2015-12-22T14:50:55.253+0400 I CONTROL  [initandlisten] mongod.exe    ...\src\mongo\db\db.cpp(896)
                 mongoDbMain+0x206
2015-12-22T14:50:55.254+0400 I CONTROL  [initandlisten] mongod.exe    ...\src\mongo\db\db.cpp(694)
                 wmain+0x35
2015-12-22T14:50:55.256+0400 I CONTROL  [initandlisten] mongod.exe    f:\dd\vctools\crt\crtw32\dllstuff\crtexe.c(623)
                 __tmainCRTStartup+0x10f
2015-12-22T14:50:55.258+0400 I CONTROL  [initandlisten] KERNEL32.DLL
                 BaseThreadInitThunk+0xd
2015-12-22T14:50:55.259+0400 I CONTROL  [initandlisten]
2015-12-22T14:50:55.260+0400 I -        [initandlisten]
 
***aborting after fassert() failure

As i see, it is possible to recover data (some fixed WiredTiger.wt and sizeStorer.wt help to repair)
How can i correct these files?



 Comments   
Comment by Kelsey Schubert [ 29/Dec/15 ]

Hi Unit22,

Are you able resync from another replicaset member?

So we can continue to investigate this:

  1. Can you provide more information about the read error regarding journal/WiredTigerLog.0000000067?
  2. Can you upload a tar of the original database that showed these problems? I have created a link to upload these files here. Please include the journal directory in the tar as well.

Thank you,
Thomas

Comment by Timur Mamezhanov [X] [ 28/Dec/15 ]

Hello Thomas, after i've made specifeid actions (upgrade to 3.0.8, copy corrupted Data directory, insert 2 renamed files), simple service start failed with an error 100
run with --repair:

journal/WiredTigerLog.0000000067 read error

i've tried to do something and decided to delete data/journal directory
after that, run with --repair started, fixed some objects and failed to continue:

2015-12-28T08:51:17.769+0400 I CONTROL  [initandlisten] targetMinOS: Windows 7/Windows Server 2008 R2
2015-12-28T08:51:17.770+0400 I CONTROL  [initandlisten] db version v3.0.8
2015-12-28T08:51:17.772+0400 I CONTROL  [initandlisten] git version: 83d8cc25e00e42856924d84e220fbe4a839e605d
2015-12-28T08:51:17.773+0400 I CONTROL  [initandlisten] build info: windows sys.getwindowsversion(major=6, minor=1, buil
d=7601, platform=2, service_pack='Service Pack 1') BOOST_LIB_VERSION=1_49
2015-12-28T08:51:17.775+0400 I CONTROL  [initandlisten] allocator: tcmalloc
2015-12-28T08:51:17.776+0400 I CONTROL  [initandlisten] options: { repair: true, storage: { dbPath: "c:\Program Files\Mo
ngoDB\Data", engine: "wiredTiger" } }
2015-12-28T08:51:17.779+0400 W -        [initandlisten] Detected unclean shutdown - c:\Program Files\MongoDB\Data\mongod
.lock is not empty.
2015-12-28T08:51:17.780+0400 W STORAGE  [initandlisten] Recovering data from the last clean checkpoint.
2015-12-28T08:51:17.781+0400 I STORAGE  [initandlisten] wiredtiger_open config: create,cache_size=3G,session_max=20000,e
viction=(threads_max=4),statistics=(fast),log=(enabled=true,archive=true,path=journal,compressor=snappy),file_manager=(c
lose_idle_time=100000),checkpoint=(wait=60,log_size=2GB),statistics_log=(wait=0),,log=(enabled=false),
2015-12-28T08:51:17.810+0400 I STORAGE  [initandlisten] Repairing size cache
2015-12-28T08:51:17.812+0400 E STORAGE  [initandlisten] WiredTiger (0) [1451278277:812814][6788:140713109689168], file:s
izeStorer.wt, session.verify: read checksum error for 4096B block at offset 40960: block header checksum of 3712394300 d
oesn't match expected checksum of 4207151295
2015-12-28T08:51:17.816+0400 E STORAGE  [initandlisten] WiredTiger (0) [1451278277:816813][6788:140713109689168], file:s
izeStorer.wt, session.verify: checkpoint ranges never verified: 1
2015-12-28T08:51:17.817+0400 E STORAGE  [initandlisten] WiredTiger (0) [1451278277:817814][6788:140713109689168], file:s
izeStorer.wt, session.verify: file ranges never verified: 1
2015-12-28T08:51:17.818+0400 I STORAGE  [initandlisten] WiredTiger progress session.verify 0
2015-12-28T08:51:17.819+0400 I STORAGE  [initandlisten] Verify failed on uri table:sizeStorer. Running a salvage operati
on.
2015-12-28T08:51:17.822+0400 I STORAGE  [initandlisten] WiredTiger progress session.salvage 8
2015-12-28T08:51:17.825+0400 I STORAGE  [initandlisten] Repairing catalog metadata
2015-12-28T08:51:17.828+0400 I STORAGE  [initandlisten] WiredTiger progress session.verify 4
2015-12-28T08:51:17.829+0400 I STORAGE  [initandlisten] Verify succeeded on uri table:_mdb_catalog. Not salvaging.
2015-12-28T08:51:17.987+0400 I STORAGE  [initandlisten] repairDatabase local
2015-12-28T08:51:17.988+0400 I STORAGE  [initandlisten] Repairing collection local.startup_log
2015-12-28T08:51:17.991+0400 I STORAGE  [initandlisten] WiredTiger progress session.verify 2
2015-12-28T08:51:17.991+0400 I STORAGE  [initandlisten] Verify succeeded on uri table:collection-0-1064962034925078596.
Not salvaging.
2015-12-28T08:51:18.683+0400 I INDEX    [initandlisten] build index on: local.startup_log properties: { v: 1, key: { _id
: 1 }, name: "_id_", ns: "local.startup_log" }
2015-12-28T08:51:18.685+0400 I INDEX    [initandlisten]          building index using bulk method
2015-12-28T08:51:18.745+0400 I STORAGE  [initandlisten] repairDatabase smartcat
2015-12-28T08:51:18.746+0400 I STORAGE  [initandlisten] Repairing collection smartcat.AccountNotificationSettings
2015-12-28T08:51:18.748+0400 I STORAGE  [initandlisten] WiredTiger progress session.verify 2
2015-12-28T08:51:18.749+0400 I STORAGE  [initandlisten] Verify succeeded on uri table:collection-156-1064962034925078596
. Not salvaging.
2015-12-28T08:51:18.873+0400 I INDEX    [initandlisten] build index on: smartcat.AccountNotificationSettings properties:
 { v: 1, key: { _id: 1 }, name: "_id_", ns: "smartcat.AccountNotificationSettings" }
2015-12-28T08:51:18.875+0400 I INDEX    [initandlisten]          building index using bulk method
2015-12-28T08:51:18.912+0400 I INDEX    [initandlisten] build index on: smartcat.AccountNotificationSettings properties:
 { v: 1, key: { accountId: 1.0, notificationType: 1.0 }, name: "IX_AccountId_NotificationType", ns: "smartcat.AccountNot
ificationSettings" }
2015-12-28T08:51:18.915+0400 I INDEX    [initandlisten]          building index using bulk method
2015-12-28T08:51:19.227+0400 I STORAGE  [initandlisten] Repairing collection smartcat.AsyncTasks.AddNewLanguage
2015-12-28T08:51:19.229+0400 I STORAGE  [initandlisten] WiredTiger progress session.verify 0
2015-12-28T08:51:19.230+0400 I STORAGE  [initandlisten] Verify succeeded on uri table:collection-390-1064962034925078596
. Not salvaging.
2015-12-28T08:51:19.615+0400 I INDEX    [initandlisten] build index on: smartcat.AsyncTasks.AddNewLanguage properties: {
 v: 1, key: { _id: 1 }, name: "_id_", ns: "smartcat.AsyncTasks.AddNewLanguage" }
2015-12-28T08:51:19.617+0400 I INDEX    [initandlisten]          building index using bulk method
2015-12-28T08:51:20.011+0400 I INDEX    [initandlisten] build index on: smartcat.AsyncTasks.AddNewLanguage properties: {
 v: 1, key: { state: 1, upTime: 1 }, name: "IX_State_UpTime", ns: "smartcat.AsyncTasks.AddNewLanguage" }
2015-12-28T08:51:20.013+0400 I INDEX    [initandlisten]          building index using bulk method
2015-12-28T08:51:20.295+0400 I STORAGE  [initandlisten] Repairing collection smartcat.AsyncTasks.CompleteDocumentWorkflo
wStage
2015-12-28T08:51:20.298+0400 E STORAGE  [initandlisten] WiredTiger (0) [1451278280:298214][6788:140713109689168], file:c
ollection-130-1064962034925078596.wt, session.verify: read checksum error for 4096B block at offset 28672: block header
checksum of 789578147 doesn't match expected checksum of 1538608562
2015-12-28T08:51:20.300+0400 E STORAGE  [initandlisten] WiredTiger (0) [1451278280:300212][6788:140713109689168], file:c
ollection-130-1064962034925078596.wt, session.verify: checkpoint ranges never verified: 2
2015-12-28T08:51:20.302+0400 E STORAGE  [initandlisten] WiredTiger (0) [1451278280:302212][6788:140713109689168], file:c
ollection-130-1064962034925078596.wt, session.verify: file ranges never verified: 1
2015-12-28T08:51:20.303+0400 I STORAGE  [initandlisten] WiredTiger progress session.verify 0
2015-12-28T08:51:20.304+0400 I STORAGE  [initandlisten] Verify failed on uri table:collection-130-1064962034925078596. R
unning a salvage operation.
2015-12-28T08:51:20.307+0400 I STORAGE  [initandlisten] WiredTiger progress session.salvage 9
2015-12-28T08:51:20.356+0400 I INDEX    [initandlisten] build index on: smartcat.AsyncTasks.CompleteDocumentWorkflowStag
e properties: { v: 1, key: { _id: 1 }, name: "_id_", ns: "smartcat.AsyncTasks.CompleteDocumentWorkflowStage" }
2015-12-28T08:51:20.359+0400 I INDEX    [initandlisten]          building index using bulk method
2015-12-28T08:51:20.436+0400 I INDEX    [initandlisten] build index on: smartcat.AsyncTasks.CompleteDocumentWorkflowStag
e properties: { v: 1, key: { state: 1.0, upTime: 1.0 }, name: "IX_State_UpTime", ns: "smartcat.AsyncTasks.CompleteDocume
ntWorkflowStage" }
2015-12-28T08:51:20.438+0400 I INDEX    [initandlisten]          building index using bulk method
2015-12-28T08:51:20.695+0400 I INDEX    [initandlisten] build index on: smartcat.AsyncTasks.CompleteDocumentWorkflowStag
e properties: { v: 1, unique: true, key: { documentId: 1.0, languageId: 1.0, stageNumber: 1.0 }, name: "IX_DocumentId_La
nguageId_StageNumber", ns: "smartcat.AsyncTasks.CompleteDocumentWorkflowStage" }
2015-12-28T08:51:20.697+0400 I INDEX    [initandlisten]          building index using bulk method
2015-12-28T08:51:21.451+0400 I STORAGE  [initandlisten] Repairing collection smartcat.AsyncTasks.DisassembleSourceFiles
2015-12-28T08:51:21.452+0400 I STORAGE  [initandlisten] Verify failed on uri table:collection-88-1064962034925078596. Ru
nning a salvage operation.
2015-12-28T08:51:21.456+0400 I -        [initandlisten] Invariant failure rs.get() src\mongo\db\catalog\database.cpp 176
 
2015-12-28T08:51:22.668+0400 I CONTROL  [initandlisten] mongod.exe    ...\src\mongo\util\stacktrace_win.cpp(175)
 mongo::printStackTrace+0x43
2015-12-28T08:51:22.670+0400 I CONTROL  [initandlisten] mongod.exe    ...\src\mongo\util\log.cpp(134)
 mongo::logContext+0x97
2015-12-28T08:51:22.672+0400 I CONTROL  [initandlisten] mongod.exe    ...\src\mongo\util\assert_util.cpp(146)
 mongo::invariantFailed+0xf0
2015-12-28T08:51:22.673+0400 I CONTROL  [initandlisten] mongod.exe    ...\src\mongo\db\catalog\database.cpp(176)
 mongo::Database::_getOrCreateCollectionInstance+0xa4
2015-12-28T08:51:22.675+0400 I CONTROL  [initandlisten] mongod.exe    ...\src\mongo\db\catalog\database.cpp(200)
 mongo::Database::Database+0x22a
2015-12-28T08:51:22.677+0400 I CONTROL  [initandlisten] mongod.exe    ...\src\mongo\db\catalog\database_holder.cpp(135)
 mongo::DatabaseHolder::openDb+0x2c0
2015-12-28T08:51:22.679+0400 I CONTROL  [initandlisten] mongod.exe    ...\src\mongo\db\repair_database.cpp(245)
 mongo::repairDatabase+0x66f
2015-12-28T08:51:22.680+0400 I CONTROL  [initandlisten] mongod.exe    ...\src\mongo\db\db.cpp(380)
 mongo::repairDatabasesAndCheckVersion+0x25f
2015-12-28T08:51:22.682+0400 I CONTROL  [initandlisten] mongod.exe    ...\src\mongo\db\db.cpp(570)
 mongo::_initAndListen+0xc23
2015-12-28T08:51:22.684+0400 I CONTROL  [initandlisten] mongod.exe    ...\src\mongo\db\db.cpp(669)
 mongo::initAndListen+0x27
2015-12-28T08:51:22.685+0400 I CONTROL  [initandlisten] mongod.exe    ...\src\mongo\db\db.cpp(907)
 mongoDbMain+0x209
2015-12-28T08:51:22.687+0400 I CONTROL  [initandlisten] mongod.exe    ...\src\mongo\db\db.cpp(708)
 wmain+0x35
2015-12-28T08:51:22.688+0400 I CONTROL  [initandlisten] mongod.exe    f:\dd\vctools\crt\crtw32\dllstuff\crtexe.c(623)
 __tmainCRTStartup+0x10f
2015-12-28T08:51:22.690+0400 I CONTROL  [initandlisten] KERNEL32.DLL

Comment by Kelsey Schubert [ 24/Dec/15 ]

Hi Unit22,

This is a duplicate of SERVER-18316, please upgrade to 3.0.8.

I've uploaded the two files after the repair attempt with the .new suffix. Can you please put them in your dbpath without the suffix and with the right permissions, and try to start your mongod instance after upgrading?

Kind regards,
Thomas

Comment by Timur Mamezhanov [X] [ 22/Dec/15 ]

1. As I understand, the version is 3.0.3 (at least on last working snapshot)
2. The OS on virual server (Server 2012) is accidentally refused to boot with blue screen, no tries to bring it up were succesful

Comment by Kelsey Schubert [ 22/Dec/15 ]

Hi Unit22,

Thank you for the report. To continue to investigate can you please answer the following questions:

  1. Can you please specify which version of MongoDB you are using?
  2. Can you provide more details about the cause system crash?

Thank you,
Thomas

Generated at Thu Feb 08 03:59:03 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.