Type: Improvement
Resolution: Duplicate
Priority: Major - P3
Fix Version/s: None
Affects Version/s: None
Component/s: None
Labels: None
Sprint: None
Story Points: None

For high volumes of short-lived reads and writes, ~~SERVER-55030~~ showed there's some overhead due to the custom "spin lock" used during snapshot creation: https://github.com/wiredtiger/wiredtiger/blob/322951cb18905cdea2ae3004906c8e8e4e27462a/src/txn/txn.c#L262-L285

It contends with the transaction ID allocation here: https://github.com/wiredtiger/wiredtiger/blob/ca27d1c1f1c616bf016d0e3854a59b91a5dec908/src/include/txn_inline.h#L1224-L1229

Throughput is around 4% lower for the 50read50update YCSB workload using secondary reads, with 32 threads on a 16 CPU cluster, than when all snapshot creations are serialized with an explicit mutex.

My understanding is that if an allocating thread gets scheduled out, it could take a long time for it to resume execution because all threads creating a snapshot will be spinning on that loop and consuming all available CPUs.

My suggestion:

Instead of relying on WT_PAUSE, add an explicit backoff strategy that schedules out the blocked threads so that the allocating threads can make progress. ~~SERVER-55030~~ showed that a simple version of this strategy removes the regression for the affected workload.
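To make the suggestion concrete, here is a minimal sketch of a spin-then-yield wait. It is not the WiredTiger code and not the SERVER-55030 patch: `txn_slot`, `wait_for_allocation`, and `SPIN_LIMIT` are hypothetical names, and POSIX `sched_yield` stands in for whatever yield primitive the real fix would use.

```c
#include <assert.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical tuning knob: busy-wait iterations before giving up the CPU. */
#define SPIN_LIMIT 100

/* Hypothetical stand-in for a per-session slot whose ID is being allocated. */
typedef struct {
    atomic_bool is_allocating; /* true while the owner is mid-allocation */
    atomic_uint_fast64_t id;   /* the published transaction ID */
} txn_slot;

/*
 * wait_for_allocation --
 *     Wait until the slot's owner has finished allocating, then return the
 *     published ID. Spin for a bounded number of iterations, then yield so
 *     that a descheduled allocating thread can get a CPU. If yields is not
 *     NULL, it is incremented once per yield (useful for measurement).
 */
static uint64_t
wait_for_allocation(txn_slot *slot, unsigned *yields)
{
    unsigned spins = 0;

    assert(slot != NULL);
    while (atomic_load_explicit(&slot->is_allocating, memory_order_acquire)) {
        if (++spins < SPIN_LIMIT)
            continue; /* Cheap busy-wait: allocation is normally very short. */
        sched_yield(); /* Back off: let the allocating thread run. */
        if (yields != NULL)
            ++*yields;
        spins = 0;
    }
    return ((uint64_t)atomic_load_explicit(&slot->id, memory_order_acquire));
}
```

The bounded spin keeps the common, uncontended case cheap. The yield only kicks in when the wait is long enough to suggest the allocating thread has been scheduled out.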

Another alternative could be to create the snapshot on a single thread and share it with all concurrent snapshot creations.
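A rough sketch of that alternative, using a mutex and condition variable so that one thread builds while the others sleep and then reuse the result. Everything here is hypothetical (`shared_snapshot`, `get_snapshot`, `example_build`), the snapshot is reduced to a single integer, and it assumes a snapshot is acceptable to a caller as long as its build started after the caller arrived.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical shared state; the "snapshot" is reduced to one integer. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t done;
    bool building;     /* a thread is currently building a snapshot */
    uint64_t started;  /* number of builds started so far */
    uint64_t finished; /* number of builds finished so far */
    uint64_t snapshot; /* result of the most recently finished build */
} shared_snapshot;

#define SHARED_SNAPSHOT_INIT \
    {PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, false, 0, 0, 0}

/* Hypothetical builder: stands in for walking the global transaction state. */
static uint64_t example_state = 0;
static uint64_t
example_build(void)
{
    return (++example_state);
}

/*
 * get_snapshot --
 *     Return a snapshot whose build started after this call began. At most
 *     one thread builds at a time; concurrent callers sleep and share it.
 */
static uint64_t
get_snapshot(shared_snapshot *s, uint64_t (*build)(void))
{
    uint64_t need, result;

    assert(s != NULL && build != NULL);
    pthread_mutex_lock(&s->lock);
    need = s->started + 1; /* First build that starts after our arrival. */
    while (s->finished < need) {
        if (s->building) {
            pthread_cond_wait(&s->done, &s->lock); /* Sleep, don't spin. */
            continue;
        }
        s->building = true;
        ++s->started;
        pthread_mutex_unlock(&s->lock);
        result = build(); /* Only the building thread pays this cost. */
        pthread_mutex_lock(&s->lock);
        s->snapshot = result;
        s->finished = s->started;
        s->building = false;
        pthread_cond_broadcast(&s->done);
    }
    result = s->snapshot;
    pthread_mutex_unlock(&s->lock);
    return (result);
}
```

Callers that arrive while a build is already in progress cannot reuse it, since it may have started before they arrived. They wait for it to finish, one of them starts the next build, and the rest share that result.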

Issue Links:

duplicates WT-9074 Reconsider spinlocks in transaction ID allocation path (Open)

Assignee: [DO NOT USE] Backlog - Storage Engines Team
Reporter: Daniel Gomez Ferro
Votes: 0
Watchers: 4

Created: Jan 11 2022 12:34:48 PM UTC
Updated: Apr 05 2022 11:03:57 AM UTC
Resolved: Apr 05 2022 11:03:57 AM UTC
