WiredTiger / WT-2014

Compact causes updates to the turtle file

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major - P3
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: WT2.7.0
    • Labels:
      None
    • # Replies:
      9
    • Last comment by Customer:
      true

      Description

      I've noticed that running compact can cause an update to be propagated to the turtle file. This isn't correct: compact performs checkpoints of an explicit target, and we only expect the turtle file to be updated for full checkpoints.

      I encountered this issue due to a recovery problem. Recovery was getting a starting LSN from the turtle file that corresponded to a file_sync log record, instead of a checkpoint log record. That LSN is being written into the turtle file when compact is run for a single btree.

          Activity

          alexander.gorrod Alexander Gorrod added a comment -

          My naive first change:

          --- a/src/meta/meta_table.c
          +++ b/src/meta/meta_table.c
          @@ -132,8 +132,9 @@ __wt_metadata_update(
                      key, value, WT_META_TRACKING(session) ? "true" : "false",
                      __metadata_turtle(key) ? "" : "not "));
           
          -       if (__metadata_turtle(key))
          +       /* Only update the turtle file when doing full checkpoints. */
          +       if (__metadata_turtle(key) && session->txn.full_ckpt)
                          return (__wt_turtle_update(session, key, value));
           
                  if (WT_META_TRACKING(session))
                          WT_RET(__wt_meta_track_update(session, key));
          

          This introduces another problem:

          [1437463319:190940][13467:0x7ffff55a8700], file:WiredTiger.wt, WT_SESSION.checkpoint: live.alloc: merge range 4096-8192 overlaps with existing range 4096-8192: Invalid argument
          

          It appears as though something requires the turtle file update to happen. I've verified that the only turtle file update being skipped is the one triggered by the compact call.

          alexander.gorrod Alexander Gorrod added a comment -

          To reproduce the behavior I'm running a modified wtperf with the following configuration file:

          conn_config="cache_size=1G,log=(enabled=true),checkpoint=(wait=10)"
          table_config="type=file"
          icount=5000000
          report_interval=5
          checkpoint_interval=5
          checkpoint_threads=1
          run_time=120
          populate_threads=1
          threads=((count=16,inserts=1,reads=1))
          

          The modification to wtperf makes the checkpoint worker call compact instead of checkpoint:

          --- a/bench/wtperf/wtperf.c
          +++ b/bench/wtperf/wtperf.c
          @@ -1211,7 +1211,7 @@ checkpoint_worker(void *arg)
                                  goto err;
                          }
                          cfg->ckpt = 1;
          -               if ((ret = session->checkpoint(session, NULL)) != 0) {
          +               if ((ret = session->compact(session, "table:test", NULL)) != 0) {
                                  lprintf(cfg, ret, 0, "Checkpoint failed.");
                                  goto err;
                          }
          

          alexander.gorrod Alexander Gorrod added a comment -

          The better "naive" change is:

          --- a/src/txn/txn_ckpt.c
          +++ b/src/txn/txn_ckpt.c
          @@ -1068,8 +1068,9 @@ fake:     /*
                      !F_ISSET(&session->txn, WT_TXN_RUNNING)))
                          WT_ERR(__wt_checkpoint_sync(session, NULL));
           
          -       WT_ERR(__wt_meta_ckptlist_set(
          -           session, dhandle->name, ckptbase, &ckptlsn));
          +       if (session->txn.full_ckpt)
          +               WT_ERR(__wt_meta_ckptlist_set(
          +                   session, dhandle->name, ckptbase, &ckptlsn));
           
                  /*
                   * If we wrote a checkpoint (rather than faking one), pages may be
          

          It produces the same failure as mentioned above, in full:

          [1437464374:955201][24467:0x7ffff55a8700], file:WiredTiger.wt, WT_SESSION.checkpoint: read checksum error [4096B @ 20480, 1653346384 != 1305140450]
          [1437464374:955251][24467:0x7ffff55a8700], file:WiredTiger.wt, WT_SESSION.checkpoint: WiredTiger.wt: encountered an illegal file format or internal value
          [1437464374:955264][24467:0x7ffff55a8700], file:WiredTiger.wt, WT_SESSION.checkpoint: aborting WiredTiger library
          

          I'll chase this to the ground tomorrow.

          sue.loverso Sue LoVerso added a comment -

          Alexander Gorrod, there were a number of changes in mid-to-late May relating to WT-1936 and WT-1944. That might be a place to start.

          alexander.gorrod Alexander Gorrod added a comment -

          Thanks for the pointer, Sue LoVerso. It turns out this problem has been present forever. The changes in https://github.com/wiredtiger/wiredtiger/pull/2077 have fixed the problems I was reproducing.

          xgen-internal-githook Githook User added a comment -

          Author:

          {u'username': u'agorrod', u'name': u'Alex Gorrod', u'email': u'alexg@wiredtiger.com'}

          Message: WT-2014 Don't update the turtle file for per-file checkpoints.

          If logging is enabled, updating the turtle file alters where recovery
          starts from, which we only want when doing full checkpoints. Uncovered
          during testing with compact - which does targeted checkpoints.
          Branch: develop
          https://github.com/wiredtiger/wiredtiger/commit/2783a63be3167046d4875a6cdaa391a2aa7902c8

          xgen-internal-githook Githook User added a comment -

          Author:

          {u'username': u'agorrod', u'name': u'Alex Gorrod', u'email': u'alexg@wiredtiger.com'}

          Message: Ensure the metadata is flushed on shutdown.

          Otherwise the work we do to write checkpoints in all open files could cause
          the metadata to be out of sync if there is a system crash at the end of
          connection close.

          Found by inspection whilst investigating WT-2014
          Branch: develop
          https://github.com/wiredtiger/wiredtiger/commit/3e89b2828341bf67ce9dc5df6a24d33e95d54118

          xgen-internal-githook Githook User added a comment -

          Author:

          {u'username': u'michaelcahill', u'name': u'Michael Cahill', u'email': u'michael.cahill@mongodb.com'}

          Message: Merge pull request #2077 from wiredtiger/checkpoint-file-fixes

          WT-2014 Checkpoint file fixes
          Branch: develop
          https://github.com/wiredtiger/wiredtiger/commit/c8633e610943865fa81fcd1e8242c6c9a80435c4

          xgen-internal-githook Githook User added a comment -

          Author:

          {u'username': u'michaelcahill', u'name': u'Michael Cahill', u'email': u'michael.cahill@mongodb.com'}

          Message: Merge pull request #2077 from wiredtiger/checkpoint-file-fixes

          WT-2014 Checkpoint file fixes
          (cherry picked from commit c8633e610943865fa81fcd1e8242c6c9a80435c4)
          Branch: mongodb-3.0
          https://github.com/wiredtiger/wiredtiger/commit/f5147367a5a8a109047eb01b26f6154fb10f9596

