SERVER-51041: Throttle starting transactions for secondary reads


    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Fixed
    • Affects Version/s: 4.2.9, 4.4.1, 4.7.0
    • Fix Version/s: 4.2.10, 4.8.0, 4.4.2
    • Component/s: None
    • Labels:
    • Backwards Compatibility:
      Fully Compatible
    • Operating System:
      ALL
    • Backport Requested:
      v4.4, v4.2
    • Sprint:
      Execution Team 2020-10-05
    • Case:
    • Linked BF Score:
      0

      Description

      This performance regression affects readConcern "local" and "available" reads on secondary nodes.

      SERVER-46721 removed a mutex around a critical section that effectively synchronized every external secondary reader that reads at lastApplied. I deemed this mutex unnecessary, but removing it pushed a synchronization problem down to a lower level.

      For high volumes of short-lived secondary reads, it appears that the WT reader-writer lock protecting the global read timestamp queue does not handle heavy contention as well as the mutex it replaced.

      The problem I see is that the WT read timestamp queue leaves old entries from inactive transactions in place. New readers (holding the write lock on the read timestamp queue) are responsible for cleaning up those old entries, even when the queue has accumulated hundreds of thousands of inactive entries. This blocks out other readers, which spin-wait for a moment and then start context switching wildly. Once the queue shrinks, thousands of new read requests come in and the problem repeats itself. This leads to very unpredictable latencies and poor CPU utilization.
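      To make the shape of the problem concrete, here is a minimal standalone C++ sketch of that pattern. This is not MongoDB or WiredTiger code: std::shared_mutex stands in for WT's reader-writer lock, a std::list stands in for the global read timestamp queue, and all of the names (TxnEntry, beginRead, endRead) are hypothetical. Each new reader takes the exclusive lock, sweeps out entries left behind by finished transactions, and appends its own entry; a finished reader only marks its entry inactive.

      // Standalone sketch of the cleanup-under-write-lock pattern described above.
      // Not MongoDB/WiredTiger code; all names are hypothetical.
      #include <atomic>
      #include <chrono>
      #include <iostream>
      #include <list>
      #include <memory>
      #include <mutex>
      #include <shared_mutex>
      #include <thread>
      #include <vector>

      struct TxnEntry {
          std::atomic<bool> active{true};  // cleared on commit; the entry itself is not removed
      };

      std::shared_mutex queueLock;  // stand-in for WT's read timestamp queue rwlock
      std::list<std::shared_ptr<TxnEntry>> readTimestampQueue;

      std::shared_ptr<TxnEntry> beginRead() {
          // New readers take the write lock: they append their entry and are also
          // responsible for sweeping out entries left behind by finished transactions,
          // however many of those have piled up.
          std::unique_lock<std::shared_mutex> lk(queueLock);
          for (auto it = readTimestampQueue.begin(); it != readTimestampQueue.end();) {
              if ((*it)->active)
                  ++it;
              else
                  it = readTimestampQueue.erase(it);
          }
          auto entry = std::make_shared<TxnEntry>();
          readTimestampQueue.push_back(entry);
          return entry;
      }

      void endRead(const std::shared_ptr<TxnEntry>& entry) {
          // Inactive entries linger until some later reader cleans them up.
          entry->active = false;
      }

      int main() {
          std::vector<std::thread> workers;
          std::atomic<long long> totalReads{0};
          for (int t = 0; t < 16; ++t) {
              workers.emplace_back([&] {
                  auto deadline = std::chrono::steady_clock::now() + std::chrono::seconds(2);
                  while (std::chrono::steady_clock::now() < deadline) {
                      auto e = beginRead();  // short-lived read, like a point query on a secondary
                      endRead(e);
                      ++totalReads;
                  }
              });
          }
          for (auto& w : workers)
              w.join();
          std::cout << "total short reads: " << totalReads << "\n";
      }

      The sketch only illustrates which thread does the cleanup and under which lock; it does not try to reproduce WiredTiger's actual queue behavior or the latencies observed here.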

      I was able to fix the performance problem by re-introducing a mutex around the area where we start transactions for secondary reads (at lastApplied):

      diff --git a/src/mongo/db/storage/wiredtiger/wiredtiger_recovery_unit.cpp b/src/mongo/db/storage/wiredtiger/wiredtiger_recovery_unit.cpp
      index 3f07c244c5..bcdef7f70c 100644
      --- a/src/mongo/db/storage/wiredtiger/wiredtiger_recovery_unit.cpp
      +++ b/src/mongo/db/storage/wiredtiger/wiredtiger_recovery_unit.cpp
      @@ -590,6 +590,7 @@ Timestamp WiredTigerRecoveryUnit::_beginTransactionAtAllDurableTimestamp(WT_SESS
           return readTimestamp;
       }
       
      +Mutex _lastAppliedTxnMutex = MONGO_MAKE_LATCH("_lastAppliedTxnMutex");
       Timestamp WiredTigerRecoveryUnit::_beginTransactionAtLastAppliedTimestamp(WT_SESSION* session) {
           auto lastApplied = _sessionCache->snapshotManager().getLastApplied();
           if (!lastApplied) {
      @@ -609,6 +610,8 @@ Timestamp WiredTigerRecoveryUnit::_beginTransactionAtLastAppliedTimestamp(WT_SES
               return Timestamp();
           }
       
      +
      +    stdx::lock_guard<Latch> lock(_lastAppliedTxnMutex);
           WiredTigerBeginTxnBlock txnOpen(session,
                                           _prepareConflictBehavior,
                                           _roundUpPreparedTimestamps,
      
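      Continuing the standalone sketch above (again hypothetical, not the actual patch), the throttled variant corresponding to this diff wraps the begin path in a single mutex, mirroring what _lastAppliedTxnMutex does:

      // Hypothetical throttled variant, reusing beginRead() and the other definitions
      // from the earlier sketch. A single mutex serializes the begin-transaction step,
      // analogous to _lastAppliedTxnMutex in the diff above.
      std::mutex beginThrottle;

      std::shared_ptr<TxnEntry> beginReadThrottled() {
          // Only one reader at a time contends for the queue's write lock, avoiding
          // the spin/context-switch storm described in the ticket.
          std::lock_guard<std::mutex> throttle(beginThrottle);
          return beginRead();
      }

      The trade-off is that lastApplied reads once again serialize briefly while starting their transactions, which is the synchronization SERVER-46721 originally removed; here it is reintroduced deliberately as a throttle.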

              People

              Assignee:
              louis.williams Louis Williams
              Reporter:
              louis.williams Louis Williams
              Votes:
              0
              Watchers:
              19
