Type: Improvement
Resolution: Fixed
Priority: Major - P3
Fix Version/s: WT10.0.0, 4.9.0, 4.4.5
Affects Version/s: None
Component/s: None
Labels:
- KP44

Sprint: Storage - Ra 2021-02-22, Storage - Ra 2021-03-08
Story Points: 8
Case:
Backport Requested: v4.2

For high volumes of short-lived, timestamped reads, the global read timestamp queue does not handle excessive contention well.

From ~~SERVER-51041~~:

The WT read timestamp queue leaves old entries from inactive transactions in place. New readers, which hold the write lock on the read timestamp queue, are responsible for cleaning up those old entries, even when the queue has hundreds of thousands of inactive entries. This blocks out other readers, which spin-wait for a moment and then start context switching wildly. Once the queue shrinks down, thousands of new read requests come in and the problem repeats itself. This leads to very unpredictable latencies and poor CPU utilization.

I would like us to investigate improvements to this data structure and the concurrency control around it to support high rates of read transactions. This does not necessarily mean a large number of concurrent transactions. Even with the MongoDB ticketing mechanism, the read timestamp queue can grow massively for large numbers of short-lived reads.

My suggestion:

1. Use a different data structure with sub-linear lookup time, such as an ordered map. Read timestamps are rarely random or uniform. They cluster, with many transactions reading at the same points in time. With secondary reads (lastApplied) and majority reads, thousands of transactions will all read at very similar or exactly the same timestamp.

New reads increment a counter for their read timestamp, and when they roll back, they decrement the counter for that timestamp. If they are the last active reader at that timestamp, they remove the entry entirely.
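
To make the counting idea concrete, here is a minimal sketch in C++ using `std::map` as the ordered map. It is only an illustration of the suggestion above, not WiredTiger's data structure or API: the class name `ReadTimestampCounts` and its methods `acquire`, `release`, `oldest`, and `distinct_timestamps` are hypothetical, and a plain `std::mutex` stands in for whatever concurrency control would actually be chosen.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <mutex>

// Hypothetical sketch: all names here are invented for illustration and
// do not correspond to WiredTiger code.
class ReadTimestampCounts {
public:
    // Register a reader at `ts`. Readers at the same timestamp share one
    // map entry, so the map size tracks distinct timestamps, not readers.
    void acquire(std::uint64_t ts) {
        std::lock_guard<std::mutex> guard(mutex_);
        ++counts_[ts];
    }

    // Unregister a reader at `ts`. The last active reader at a timestamp
    // erases the entry, so no inactive entries are left for others to clean.
    void release(std::uint64_t ts) {
        std::lock_guard<std::mutex> guard(mutex_);
        auto it = counts_.find(ts);
        assert(it != counts_.end() && it->second > 0);
        if (--it->second == 0)
            counts_.erase(it);
    }

    // Store the oldest timestamp with an active reader in `out` and return
    // true, or return false if there are no active readers.
    bool oldest(std::uint64_t &out) const {
        std::lock_guard<std::mutex> guard(mutex_);
        if (counts_.empty())
            return false;
        out = counts_.begin()->first; // std::map iterates in key order
        return true;
    }

    // Number of distinct timestamps currently being read.
    std::size_t distinct_timestamps() const {
        std::lock_guard<std::mutex> guard(mutex_);
        return counts_.size();
    }

private:
    mutable std::mutex mutex_;
    std::map<std::uint64_t, std::uint64_t> counts_; // timestamp -> readers
};
```

Under these assumptions each operation holds the lock for O(log n) work in the number of distinct timestamps, rather than for a scan whose length depends on how many inactive entries have accumulated.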

~~2. Don't use a spinlock for this data structure~~

Just removing the spinlock would be the simplest way to alleviate the performance problems, but I still think the queue is problematic because it means readers can hold a mutex for an unbounded period of time.

Attachments:
- durable-timestamp-queue-lock-contention.png (395 kB, Jan 29 2021 05:34:37 AM UTC)
- read timestamp patch build.png (94 kB, Feb 16 2021 11:06:31 AM UTC)
- Screen Shot 2020-09-28 at 12.47.41 PM.png (387 kB, Sep 28 2020 04:48:04 PM UTC)

is related to:
- SERVER-51041 Throttle starting transactions for secondary reads (Closed)
- SERVER-55030 Remove mutexes that serialize secondary and majority read operations (Closed)
- WT-7281 Add metric to record total sessions scanned (Closed)

related to:
- SERVER-55024 Complete TODO listed in WT-6709 (Closed)

Assignee: Haribabu Kommi
Reporter: Louis Williams
Votes: 0
Watchers: 23

Created: Sep 18 2020 04:02:20 PM UTC
Updated: Oct 29 2023 04:43:00 PM UTC
Resolved: Mar 08 2021 12:29:51 AM UTC
