  1. Core Server
  2. SERVER-30544

fsynclock results in corrupt disk snapshot

    • Type: Bug
    • Resolution: Cannot Reproduce
    • Priority: Major - P3
    • None
    • Affects Version/s: None
    • Component/s: WiredTiger
    • Labels: None
    • ALL

      I am using server 3.2.12/3.4.3. fsyncLock() commands appear to corrupt the on-disk data, such that a snapshot taken while the lock is held does not work. The data files exist, but WiredTiger ignores them, so it might be an issue with the on-disk catalog.

      I am using the latest version of the MongoDB Java driver, so this is unrelated to SERVER-28876 and JAVA-2501.

      I am able to reproduce the issue reasonably consistently (it doesn't happen every time). Here are my steps:
      1. Start with a sample data set
      2. db.fsyncLock()
      3. Snapshot the disk
      4. db.fsyncUnlock()
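
The lock, snapshot, unlock sequence above can be sketched as a small Python helper. This is only an illustration of the ordering, not the reporter's actual tooling: the function name `snapshot_while_locked` and its three callables are hypothetical, and the disk snapshot step in particular is left to whatever the storage provider offers.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def snapshot_while_locked(
    lock: Callable[[], object],
    snapshot: Callable[[], T],
    unlock: Callable[[], object],
) -> T:
    """Run snapshot() between lock() and unlock().

    All three callables are placeholders (assumptions for this sketch):
    lock    -> the equivalent of db.fsyncLock()
    snapshot-> whatever takes the disk snapshot
    unlock  -> the equivalent of db.fsyncUnlock()
    """
    lock()  # flush pending writes and block new ones
    try:
        return snapshot()  # take the disk snapshot while writes are blocked
    finally:
        unlock()  # always release the lock, even if the snapshot fails
```

With pymongo, `lock` could plausibly be `lambda: client.admin.command("fsync", lock=True)` and `unlock` could be `lambda: client.admin.command("fsyncUnlock")`, where `client` is an assumed `MongoClient` connected to the node being snapshotted. The try/finally keeps a failed snapshot from leaving the server locked.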

      The lock and unlock steps appear to work and are traced in the log.

      I have attached two files:
      1. success.zip - successful snapshot
      2. fail.zip - failed snapshot

      The collection "test.test" does not show up in fail.zip. The underlying file for this collection is collection-18--3756087474573173486.wt, which is present in both snapshots.

      Using the wt command-line tool directly confirms that WiredTiger does not see this table even though the file exists:
      [root@SG-az34rs1-1097 wiredtiger-2.7.0]# ./wt -v -h ..<path> -C "extensions=[./ext/compressors/snappy/.libs/libwiredtiger_snappy.so]" list
      colgroup:_mdb_catalog
      colgroup:collection-0--2721376547481716549
      colgroup:collection-2--2721376547481716549
      colgroup:index-1--2721376547481716549
      colgroup:index-3--2721376547481716549
      colgroup:sizeStorer
      file:_mdb_catalog.wt
      file:collection-0--2721376547481716549.wt
      file:collection-2--2721376547481716549.wt
      file:index-1--2721376547481716549.wt
      file:index-3--2721376547481716549.wt
      file:sizeStorer.wt
      table:_mdb_catalog
      table:collection-0--2721376547481716549
      table:collection-2--2721376547481716549
      table:index-1--2721376547481716549
      table:index-3--2721376547481716549
      table:sizeStorer
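
The comparison described above (a data file present on disk but missing from the `wt list` output) can be automated with a short check. This is a minimal sketch under stated assumptions: the function name `unlisted_data_files` is hypothetical, it only looks at `file:` entries, and it skips files whose names start with `WiredTiger` on the assumption that those are WiredTiger's own metadata files rather than collection or index data.

```python
from typing import Iterable, List


def unlisted_data_files(wt_list_output: str, filenames: Iterable[str]) -> List[str]:
    """Return .wt files found on disk that `wt list` does not report.

    wt_list_output: text printed by `wt ... list`
    filenames:      names found in the snapshot's data directory
    """
    # Collect names from lines such as "file:sizeStorer.wt".
    known = set()
    for line in wt_list_output.splitlines():
        entry = line.strip()
        if entry.startswith("file:"):
            known.add(entry[len("file:"):])

    # Assumption: files named WiredTiger* are internal metadata, so skip them.
    return sorted(
        name
        for name in filenames
        if name.endswith(".wt")
        and not name.startswith("WiredTiger")
        and name not in known
    )
```

Calling it as `unlisted_data_files(output, os.listdir(dbpath))`, where `output` is the captured listing and `dbpath` is the extracted snapshot directory (both assumed names), would be expected to flag collection-18--3756087474573173486.wt for the failed snapshot, given that the file exists there but does not appear in the listing.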

        1. fail.zip
          3.06 MB
        2. success.zip
          1.88 MB

            Assignee:
            mark.agarunov Mark Agarunov
            Reporter:
            dharshanr@scalegrid.net Dharshan Rangegowda
            Votes:
            0
            Watchers:
            12
