WiredTiger / WT-4058

Make slot switch quicker when io is slow

Details

    • Improvement
    • Status: Closed
    • Major - P3
    • Resolution: Fixed
    • WT2.9.2, WT2.9.3, WT3.0.0
    • 3.6.6, 4.0.0-rc0, WT3.1.0
    • None
    • Storage Non-NYC 2018-05-07

    Description

      Hi,

      I have been investigating a MongoDB performance issue for several weeks, using WiredTiger 2.9.2. I noticed that when I/O is slow (for example, when a cgroup limit is imposed on block IOPS), log slot joins fail at a much higher rate. The cause is that some write threads wait too long on the slot switch spin lock. That lock is held because a forced slot switch is running the log release process, which calls __log_fs_write(). Because the I/O is slow, the lock is held for a very long time.
      I wonder whether we could release the log slot lock before __wt_write() and reacquire it afterwards: slots do not overlap, so concurrent pwrite calls to a log file should not be a problem.
      My patch is below. I look forward to your reply. Thanks!

      --- a/src/third_party/wiredtiger/src/log/log.c
      +++ b/src/third_party/wiredtiger/src/log/log.c
      @@ -64,7 +64,12 @@ static int
       __log_fs_write(WT_SESSION_IMPL *session,
           WT_LOGSLOT *slot, wt_off_t offset, size_t len, const void *buf)
       {
      + WT_CONNECTION_IMPL *conn;
        WT_DECL_RET;
      + WT_LOG *log;
      +
      + conn = S2C(session);
      + log = conn->log;
       
        /*
         * If we're writing into a new log file, we have to wait for all
      @@ -75,9 +80,14 @@ __log_fs_write(WT_SESSION_IMPL *session,
          WT_RET(__log_wait_for_earlier_slot(session, slot));
          WT_RET(__wt_log_force_sync(session, &slot->slot_release_lsn));
        }
      + if (F_ISSET(session, WT_SESSION_LOCKED_SLOT))
      +   __wt_spin_unlock(session, &log->log_slot_lock);
        if ((ret = __wt_write(session, slot->slot_fh, offset, len, buf)) != 0)
          WT_PANIC_MSG(session, ret,
              "%s: fatal log failure", slot->slot_fh->name);
      + if (F_ISSET(session, WT_SESSION_LOCKED_SLOT))
      +   __wt_spin_lock(session, &log->log_slot_lock);
      +
        return (ret);
       }
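
      For readers who want to try the idea outside WiredTiger, here is a minimal standalone sketch of the same pattern: drop a lock around a slow positional write and reacquire it before returning. Everything in it is an assumption made for illustration. `demo_log`, `demo_slow_write` and `demo_slot_write` are hypothetical names, a pthread mutex stands in for the `log_slot_lock` spin lock, and a `memcpy` into a buffer stands in for the write to the log file. It is not WiredTiger code and not necessarily the change that resolved this ticket.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the log state; this is not WiredTiger's WT_LOG. */
struct demo_log {
    pthread_mutex_t slot_lock; /* stands in for log_slot_lock */
    char file[64];             /* stands in for the log file contents */
};

/*
 * Stand-in for a slow positional write. Each caller is assumed to own a
 * range that does not overlap any other caller's range.
 */
static int
demo_slow_write(struct demo_log *log, size_t offset, const void *buf, size_t len)
{
    if (offset > sizeof(log->file) || len > sizeof(log->file) - offset)
        return (-1);
    memcpy(log->file + offset, buf, len);
    return (0);
}

/*
 * Write one slot's buffer. If the caller holds slot_lock (lock_held is true),
 * release it for the duration of the I/O and reacquire it before returning,
 * so the caller sees the same lock state on exit as on entry.
 */
static int
demo_slot_write(struct demo_log *log, bool lock_held, size_t offset,
    const void *buf, size_t len)
{
    int ret;

    assert(log != NULL && buf != NULL);

    if (lock_held)
        pthread_mutex_unlock(&log->slot_lock);
    ret = demo_slow_write(log, offset, buf, len);
    if (lock_held)
        pthread_mutex_lock(&log->slot_lock);

    return (ret);
}
```

      One caveat with this pattern in general: any state the caller read under the lock before the write may have changed by the time the lock is reacquired, so the caller has to revalidate it rather than assume it is still current.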
      

            People

              sue.loverso@mongodb.com Susan LoVerso
              zhcn381 CenZheng