[SERVER-30544] fsynclock results in corrupt disk snapshot Created: 07/Aug/17  Updated: 09/Oct/17  Resolved: 15/Sep/17

Status: Closed
Project: Core Server
Component/s: WiredTiger
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Dharshan Rangegowda Assignee: Mark Agarunov
Resolution: Cannot Reproduce Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: Zip Archive fail.zip     Zip Archive success.zip    
Operating System: ALL
Participants:

 Description   

I am using server 3.2.12/3.4.3. fsyncLock() commands appear to corrupt the on-disk data, so a snapshot taken while the lock is held does not work. The data files exist but WiredTiger ignores them, so it might be an issue with the on-disk catalog.

I am using the latest version of the MongoDB Java driver, so this is unrelated to SERVER-28876 and JAVA-2501.

I am able to reproduce the issue reasonably consistently (it doesn't happen every time). Here are my steps:
1. Start with sample data
2. db.fsyncLock()
3. Snapshot the disk
4. db.fsyncUnlock()

The lock and unlock steps appear to work, and both are traced in the log.
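
For anyone trying to reproduce this, the lock / snapshot / unlock sequence above can be sketched roughly as follows. This is only an illustration, not the reporter's actual tooling: `admin_db` is assumed to be an object with a `command()` method (for example pymongo's `MongoClient().admin`), and `take_snapshot` is a hypothetical callable standing in for whatever takes the disk snapshot. The `fsync` and `fsyncUnlock` server commands are what the shell helpers in steps 2 and 4 issue.

```python
from contextlib import contextmanager


@contextmanager
def fsync_locked(admin_db):
    """Hold the fsync lock for the duration of the with-block.

    admin_db is assumed to expose command(), as pymongo's
    MongoClient().admin does.
    """
    # Flush pending writes and block further writes (step 2).
    admin_db.command("fsync", lock=True)
    try:
        yield
    finally:
        # Always release the lock, even if the snapshot fails (step 4).
        admin_db.command("fsyncUnlock")


def snapshot_with_lock(admin_db, take_snapshot):
    """Run take_snapshot() (hypothetical callable, step 3) under the lock."""
    with fsync_locked(admin_db):
        return take_snapshot()
```

The `try`/`finally` is there so that a failed snapshot does not leave the server locked; it does not by itself say anything about whether the snapshot is consistent.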

I have attached two files:
1. success.zip - successful snapshot
2. fail.zip - failed snapshot

The collection "test.test" does not show up in fail.zip. The underlying file for this collection is collection-18--3756087474573173486.wt, which is present in both snapshots.

Using the wt command line tool directly confirms that WiredTiger does not see this table even though the file exists:
[root@SG-az34rs1-1097 wiredtiger-2.7.0]# ./wt -v -h ..<path> -C "extensions=[./ext/compressors/snappy/.libs/libwiredtiger_snappy.so]" list
colgroup:_mdb_catalog
colgroup:collection-0--2721376547481716549
colgroup:collection-2--2721376547481716549
colgroup:index-1--2721376547481716549
colgroup:index-3--2721376547481716549
colgroup:sizeStorer
file:_mdb_catalog.wt
file:collection-0--2721376547481716549.wt
file:collection-2--2721376547481716549.wt
file:index-1--2721376547481716549.wt
file:index-3--2721376547481716549.wt
file:sizeStorer.wt
table:_mdb_catalog
table:collection-0--2721376547481716549
table:collection-2--2721376547481716549
table:index-1--2721376547481716549
table:index-3--2721376547481716549
table:sizeStorer
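
The comparison described above (a .wt file exists on disk but is missing from the `wt list` output) could be automated with something like the sketch below. It is a rough illustration only: the function name is made up, and skipping files whose names start with "WiredTiger" is an assumption that those are WiredTiger's own internal files and are not expected in the listing.

```python
def unlisted_data_files(wt_list_output, filenames):
    """Return .wt files that are on disk but absent from `wt list` output.

    wt_list_output: text printed by `wt list` (lines such as "file:x.wt").
    filenames: names of the files found in the snapshot's data directory.
    """
    # Collect every name that appears as a "file:" entry in the listing.
    listed = {
        line.strip()[len("file:"):]
        for line in wt_list_output.splitlines()
        if line.strip().startswith("file:")
    }
    return sorted(
        name
        for name in filenames
        if name.endswith(".wt")
        # Assumption: WiredTiger's internal files are not listed, so skip them.
        and not name.startswith("WiredTiger")
        and name not in listed
    )
```

Here `filenames` could come from `os.listdir()` on the snapshot's data directory. Run against the failed snapshot, a check like this would be expected to flag collection-18--3756087474573173486.wt.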



 Comments   
Comment by Mark Agarunov [ 15/Sep/17 ]

Hello dharshanr@scalegrid.net,

Thank you for the detailed report. Unfortunately I have not been able to reproduce this behavior following the steps you outlined. If more information comes to light, please let me know and we can reopen this ticket.

Thanks,
Mark

Generated at Thu Feb 08 04:24:12 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.