[SERVER-16804] mongod --repair fails because verify() returns EBUSY under WiredTiger Created: 12/Jan/15  Updated: 14/Apr/15  Resolved: 27/Mar/15

Status: Closed
Project: Core Server
Component/s: Storage, WiredTiger
Affects Version/s: 2.8.0-rc4
Fix Version/s: 3.0.2, 3.1.1

Type: Bug Priority: Major - P3
Reporter: Bruce Lucas (Inactive) Assignee: Michael Cahill (Inactive)
Resolution: Done Votes: 0
Labels: wiredtiger
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: File db.tgz    
Issue Links:
Duplicate
is duplicated by SERVER-16869 WT verify can fail with EBUSY Closed
Related
is related to SERVER-16457 WT verify and salvage operations fail... Closed
is related to SERVER-17767 Remove the code that ignores EBUSY re... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Completed:
Participants:

 Description   

A code comment indicates that ignoring EBUSY from verify was due to occasional failures tracked in SERVER-16457. However, that ticket has been closed as fixed as of rc4, and this failure occurs every time with the attached db. The attached db is corrupted: 4 KB at offset 0x001ba000, which is a leaf node, has been overwritten with 0s.
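For anyone who wants to produce a similarly damaged file on a scratch copy, a minimal sketch of overwriting one 4 KB block with zeros is below. The helper name `zero_block` and the scratch file are hypothetical, and this is not necessarily how the attached db was produced.

```python
import os
import tempfile


def zero_block(path, offset, length=4096):
    """Overwrite `length` bytes starting at `offset` with zero bytes, in place."""
    with open(path, "r+b") as f:
        f.seek(offset)
        f.write(b"\x00" * length)


# Demonstrate on a throwaway file, never on a live data file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"\xff" * 3 * 4096)  # three 4 KB blocks of non-zero data
    scratch = tmp.name

zero_block(scratch, 4096)  # zero out the second block only

with open(scratch, "rb") as f:
    data = f.read()
os.remove(scratch)
```

For the case in this ticket the call would use offset 0x001ba000 on a copy of the collection file, with mongod stopped.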

2015-01-12T07:59:45.956-0500 I CONTROL  [initandlisten] MongoDB starting : pid=7608 port=27017 dbpath=/Users/bdlucas/db/db/r0 64-bit host=reboot.local
2015-01-12T07:59:45.956-0500 I CONTROL  [initandlisten] 
2015-01-12T07:59:45.956-0500 I CONTROL  [initandlisten] ** WARNING: soft rlimits too low. Number of files is 256, should be at least 1000
2015-01-12T07:59:45.956-0500 I CONTROL  [initandlisten] db version v2.8.0-rc4
2015-01-12T07:59:45.956-0500 I CONTROL  [initandlisten] git version: 3ad571742911f04b307f0071979425511c4f2570
2015-01-12T07:59:45.956-0500 I CONTROL  [initandlisten] build info: Darwin mci-osx108-7.build.10gen.cc 12.5.0 Darwin Kernel Version 12.5.0: Sun Sep 29 13:33:47 PDT 2013; root:xnu-2050.48.12~1/RELEASE_X86_64 x86_64 BOOST_LIB_VERSION=1_49
2015-01-12T07:59:45.966-0500 I CONTROL  [initandlisten] allocator: system
2015-01-12T07:59:45.966-0500 I CONTROL  [initandlisten] options: { repair: true, storage: { dbPath: "/Users/bdlucas/db/db/r0", engine: "wiredtiger" } }
2015-01-12T07:59:45.966-0500 I STORAGE  [initandlisten] wiredtiger_open config: create,cache_size=8G,session_max=20000,extensions=[local=(entry=index_collator_extension)],statistics=(fast),checkpoint=(wait=60,log_size=2GB),statistics_log=(wait=0),
2015-01-12T07:59:46.743-0500 I STORAGE  [initandlisten] Repairing size cache
2015-01-12T07:59:47.085-0500 I STORAGE  [initandlisten] WiredTiger progress session.verify 2
2015-01-12T07:59:47.085-0500 I STORAGE  [initandlisten] Verify succeeded on uri table:sizeStorer. Not salvaging.
2015-01-12T07:59:47.085-0500 I STORAGE  [initandlisten] Repairing catalog metadata
2015-01-12T07:59:47.086-0500 I STORAGE  [initandlisten] WiredTiger progress session.verify 2
2015-01-12T07:59:47.086-0500 I STORAGE  [initandlisten] Verify succeeded on uri table:_mdb_catalog. Not salvaging.
2015-01-12T07:59:47.094-0500 I STORAGE  [initandlisten] repairDatabase local
2015-01-12T07:59:47.094-0500 I STORAGE  [initandlisten] Repairing collection local.startup_log
2015-01-12T07:59:47.094-0500 E STORAGE  [initandlisten] Verify on table:collection-0--7301267925547912616 failed with EBUSY. Assuming no salvage is needed.
2015-01-12T07:59:47.101-0500 I INDEX    [initandlisten] build index on: local.startup_log properties: { v: 1, key: { _id: 1 }, name: "_id_", ns: "local.startup_log" }
2015-01-12T07:59:47.101-0500 I INDEX    [initandlisten] 	 building index using bulk method
2015-01-12T07:59:47.195-0500 I STORAGE  [initandlisten] repairDatabase test
2015-01-12T07:59:47.195-0500 I STORAGE  [initandlisten] Repairing collection test.c
2015-01-12T07:59:47.195-0500 E STORAGE  [initandlisten] Verify on table:collection-2--7301267925547912616 failed with EBUSY. Assuming no salvage is needed.
2015-01-12T07:59:47.199-0500 I INDEX    [initandlisten] build index on: test.c properties: { v: 1, key: { _id: 1 }, name: "_id_", ns: "test.c" }
2015-01-12T07:59:47.199-0500 I INDEX    [initandlisten] 	 building index using bulk method
2015-01-12T07:59:47.203-0500 I INDEX    [initandlisten] build index on: test.c properties: { v: 1, key: { i: undefined }, name: "i_undefined", ns: "test.c" }
2015-01-12T07:59:47.203-0500 I INDEX    [initandlisten] 	 building index using bulk method
2015-01-12T07:59:47.204-0500 E STORAGE  [initandlisten] WiredTiger (0) [1421067587:204531][7608:0x7fff7db4c310], file:collection-2--7301267925547912616.wt, cursor.next: read checksum error [4096B @ 1810432, 164825738 != 0]
2015-01-12T07:59:47.204-0500 E STORAGE  [initandlisten] WiredTiger (0) [1421067587:204587][7608:0x7fff7db4c310], file:collection-2--7301267925547912616.wt, cursor.next: collection-2--7301267925547912616.wt: encountered an illegal file format or internal value
2015-01-12T07:59:47.204-0500 E STORAGE  [initandlisten] WiredTiger (-31804) [1421067587:204630][7608:0x7fff7db4c310], file:collection-2--7301267925547912616.wt, cursor.next: the process must exit and restart: WT_PANIC: WiredTiger library panic
2015-01-12T07:59:47.204-0500 I -        [initandlisten] Fatal Assertion 28558
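
The offset in the "read checksum error" line appears to be the same block that the description says was zeroed. A quick check of the arithmetic, using only the two numbers from this ticket (the variable names are illustrative):

```python
described_offset = 0x001BA000  # offset given in the description
logged_offset = 1810432        # from "read checksum error [4096B @ 1810432, ...]"
block_size = 4096              # size reported in the same log line

same_block = described_offset == logged_offset
block_aligned = described_offset % block_size == 0
block_index = described_offset // block_size

print(same_block, block_aligned, block_index)
```

So the cursor.next failure during the index build hits exactly the corrupted 4 KB region that verify skipped.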



 Comments   
Comment by Michael Cahill (Inactive) [ 27/Mar/15 ]

Note for 3.0.2 / 3.1.1: the original issue is resolved. The remaining work in this ticket is to remove workaround code.

Comment by David Hows [ 27/Mar/15 ]

I've reverted the changes in MongoDB around EBUSY and merged in WT develop.

Confirmed that repair functions as expected.

Comment by Michael Cahill (Inactive) [ 18/Mar/15 ]

Reopening: once the fix from WT is merged into MongoDB, we should revert the change that ignores EBUSY returns from verify in the integration layer so that salvage is run when required.
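
A rough sketch of the decision this refers to, assuming a simplified model in which verify returns 0 on success and an errno-style code on failure. The function name `needs_salvage` and the `ignore_ebusy` flag are hypothetical and are not the actual integration-layer code. Treating EBUSY like any other failure once the workaround is removed is also an assumption.

```python
import errno


def needs_salvage(verify_rc, ignore_ebusy):
    """Return True if salvage should run for a table, given verify's result.

    verify_rc    -- 0 on success, otherwise an errno-style error code
    ignore_ebusy -- True models the workaround, False models the reverted code
    """
    if verify_rc == 0:
        return False  # "Verify succeeded ... Not salvaging."
    if verify_rc == errno.EBUSY and ignore_ebusy:
        return False  # "failed with EBUSY. Assuming no salvage is needed."
    return True       # any other verify failure leads to salvage


# With the workaround, a table whose verify returns EBUSY is never salvaged.
with_workaround = needs_salvage(errno.EBUSY, ignore_ebusy=True)
# Assumed behaviour after the revert: the failure is no longer skipped.
after_revert = needs_salvage(errno.EBUSY, ignore_ebusy=False)
```

In this model the workaround is what lets the corrupted test.c table through to the index build, where the checksum error then triggers the panic.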

Comment by Michael Cahill (Inactive) [ 18/Mar/15 ]

This was a case of https://github.com/wiredtiger/wiredtiger/issues/1404 (fixed today in WT develop).
