  1. WiredTiger
  2. WT-4058

Make slot switch quicker when io is slow

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Fixed
    • Affects Version/s: WT2.9.2, WT2.9.3, WT3.0.0
    • Fix Version/s: 3.6.6, 4.0.0-rc0, WT3.1.0
    • Component/s: None
    • Labels:
    • Sprint:
      Storage Non-NYC 2018-05-07

      Description

      Hi,

      I have been investigating a MongoDB performance issue for several weeks, using WiredTiger 2.9.2. I noticed that when I/O is slow (for example, when a cgroup limit is imposed on block IOPS), log slot joins fail at a much higher rate. The cause is that some write threads wait too long on the slot switch spin lock. The lock is held because a forced slot switch is running the log release process, which calls __log_fs_write(). Because the I/O is slow, the lock is held for a very long time.
      I wonder whether we can release the log slot lock before __wt_write() and reacquire it afterwards, since slots do not overlap and concurrent pwrite calls to a log file should not be a problem.
      My patch is below. I look forward to your reply. Thanks!

      --- a/src/third_party/wiredtiger/src/log/log.c
      +++ b/src/third_party/wiredtiger/src/log/log.c
      @@ -64,7 +64,12 @@ static int
       __log_fs_write(WT_SESSION_IMPL *session,
           WT_LOGSLOT *slot, wt_off_t offset, size_t len, const void *buf)
       {
      + WT_CONNECTION_IMPL *conn;
        WT_DECL_RET;
      + WT_LOG *log;
      +
      + conn = S2C(session);
      + log = conn->log;
       
        /*
         * If we're writing into a new log file, we have to wait for all
      @@ -75,9 +80,14 @@ __log_fs_write(WT_SESSION_IMPL *session,
          WT_RET(__log_wait_for_earlier_slot(session, slot));
          WT_RET(__wt_log_force_sync(session, &slot->slot_release_lsn));
        }
      + if (F_ISSET(session, WT_SESSION_LOCKED_SLOT))
      +   __wt_spin_unlock(session, &log->log_slot_lock);
        if ((ret = __wt_write(session, slot->slot_fh, offset, len, buf)) != 0)
          WT_PANIC_MSG(session, ret,
              "%s: fatal log failure", slot->slot_fh->name);
      + if (F_ISSET(session, WT_SESSION_LOCKED_SLOT))
      +   __wt_spin_lock(session, &log->log_slot_lock);
      +
        return (ret);
       }
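
      For readers without the WiredTiger tree at hand, the idea in the patch can be sketched in isolation. The following is a minimal illustration only, not WiredTiger code: struct demo_log, demo_slot_write() and demo_run() are hypothetical names, a POSIX mutex stands in for log_slot_lock, and a boolean argument stands in for the WT_SESSION_LOCKED_SLOT session flag. It shows the lock being dropped around a positional write and retaken before returning, so the caller's locking state is unchanged.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in for the log handle: one lock plus a file. */
struct demo_log {
    pthread_mutex_t slot_lock; /* plays the role of log_slot_lock */
    int fd;                    /* open log file */
};

/*
 * demo_slot_write --
 *     Write a buffer at the given offset. If the caller holds the slot
 *     lock, release it for the duration of the I/O and reacquire it
 *     before returning. Returns 0 on success, -1 on write failure.
 */
static int
demo_slot_write(struct demo_log *log, bool caller_holds_lock,
    off_t offset, const void *buf, size_t len)
{
    const char *p = buf;
    int ret = 0;

    if (caller_holds_lock)
        pthread_mutex_unlock(&log->slot_lock);

    /* The slow part: other threads can take slot_lock while this runs. */
    while (len > 0) {
        ssize_t n = pwrite(log->fd, p, len, offset);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            ret = -1;
            break;
        }
        p += n;
        offset += n;
        len -= (size_t)n;
    }

    if (caller_holds_lock)
        pthread_mutex_lock(&log->slot_lock);
    return (ret);
}

/*
 * demo_run --
 *     Exercise demo_slot_write() on a temporary file while holding the
 *     lock, then check the lock is held again and the data landed.
 */
static int
demo_run(void)
{
    char path[] = "/tmp/slot_demo_XXXXXX";
    struct demo_log log;
    char out[64], in[64];
    int ret;

    if ((log.fd = mkstemp(path)) < 0)
        return (-1);
    (void)unlink(path);
    pthread_mutex_init(&log.slot_lock, NULL);
    memset(out, 'A', sizeof(out));

    pthread_mutex_lock(&log.slot_lock);
    ret = demo_slot_write(&log, true, 128, out, sizeof(out));
    /* A default mutex that is already locked makes trylock return EBUSY. */
    if (ret == 0 && pthread_mutex_trylock(&log.slot_lock) != EBUSY)
        ret = -1;
    pthread_mutex_unlock(&log.slot_lock);

    if (ret == 0 &&
        (pread(log.fd, in, sizeof(in), 128) != (ssize_t)sizeof(in) ||
        memcmp(in, out, sizeof(in)) != 0))
        ret = -1;

    pthread_mutex_destroy(&log.slot_lock);
    (void)close(log.fd);
    return (ret);
}
```

      One consequence of this pattern is that anything read under the lock before the write may have changed by the time the lock is retaken, so a caller relying on such state would need to re-check it after the call returns.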
      

              People

              Assignee:
              sue.loverso Susan LoVerso
              Reporter:
              zhcn381 CenZheng