This performance regression affects readConcern "local" and "available" reads on secondary nodes.
SERVER-46721 removed a mutex around a critical section that effectively synchronized every external secondary reader that reads at lastApplied. I deemed this mutex unnecessary, but removing it pushed a synchronization problem down to a lower level.
For high volumes of short-lived secondary reads, it appears that the WT reader-writer lock on the global read timestamp queue does not handle heavy contention as well as the mutex it replaced.
The problem I see is that the WT read timestamp queue leaves around old entries from inactive transactions. New readers (which must hold the write lock on the read timestamp queue) are responsible for cleaning up those old entries, even when the queue has accumulated hundreds of thousands of inactive ones. While one reader performs this cleanup, all other readers are blocked: they spin-wait briefly, then start context switching wildly. Once the queue shrinks, thousands of new read requests come in and the problem repeats itself. This leads to very unpredictable latencies and poor CPU utilization.
I was able to fix the performance problem by re-introducing a mutex around the area where we start transactions for secondary reads (at lastApplied):
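A rough sketch of the shape of that fix, with hypothetical names (the real server types and call sites differ; `Session`, `beginTransactionAtTimestamp`, and `lastAppliedReadMutex` are all illustrative). The idea is simply that a single process-wide mutex serializes the begin-transaction path for secondary reads at lastApplied, so only one reader at a time touches the underlying WT read timestamp queue:

```cpp
#include <cstdint>
#include <mutex>

// Minimal stand-in for a storage-engine session; the real types and names
// in the server are different. This only illustrates the locking shape.
struct Session {
    uint64_t readTimestamp = 0;
    void beginTransactionAtTimestamp(uint64_t ts) { readTimestamp = ts; }
};

// Process-wide mutex re-introduced around the "open a transaction at
// lastApplied" path, restoring the coarse synchronization that the
// SERVER-46721 change removed.
std::mutex lastAppliedReadMutex;

void beginSecondaryReadTransaction(Session& session, uint64_t lastApplied) {
    std::lock_guard<std::mutex> guard(lastAppliedReadMutex);
    session.beginTransactionAtTimestamp(lastApplied);
}
```

Serializing at this level keeps contention on one well-behaved mutex instead of funneling a thundering herd into the timestamp queue's reader-writer lock.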