Type: Task
Resolution: Unresolved
Priority: Major - P3
Fix Version/s: None
Affects Version/s: None
Component/s: None
Labels:
None

Total Hours with Assigned Team:
21,543.215
Sprint:
StorEng - Refinement Pipeline
Story Points:
None

mathias@mongodb.com pointed out that we could use a feature of Intel's x86 CPUs to optimize our crc32 implementation:

On x86 (at least on Intel) the story is a bit different. There, the crc32c instruction has a 3-cycle latency, but three of them can execute in parallel as long as they are independent, so the standard pattern is to process each chunk as three parallel streams and then do a merge operation. See ~~WT-2121~~ for some discussion. It may be worth reopening that and trying again since you are already set up to test it. You can find a BSD-or-GPL licensed implementation written by Intel engineers linked to from that ticket, but to save you a hop, here you go.

This ticket is to explore the feasibility of this option and measure how much performance we could gain from it.
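
As a rough illustration of the three-stream pattern described in the comment, here is a minimal portable sketch. It is not the Intel implementation referenced in the ticket and not WiredTiger's code: the function names (`crc32c_step`, `crc32c_shift`, `crc32c_serial`, `crc32c_3way`) are hypothetical, the per-byte step is a plain bitwise CRC32C standing in for the hardware crc32c instruction, and the merge is done naively by pushing zero bytes through the register. A real implementation would presumably replace that merge with precomputed constants or carry-less multiplication.

```c
#include <stddef.h>
#include <stdint.h>

/* Reflected CRC32C (Castagnoli) polynomial. */
#define CRC32C_POLY 0x82F63B78u

/* One byte of CRC32C; a software stand-in for the hardware instruction. */
static uint32_t crc32c_step(uint32_t state, uint8_t byte)
{
    state ^= byte;
    for (int k = 0; k < 8; k++)
        state = (state >> 1) ^ (CRC32C_POLY & (0u - (state & 1u)));
    return state;
}

/*
 * Advance a CRC register as if nbytes zero bytes had been appended.
 * Naive O(nbytes) version, used here only to keep the merge readable.
 */
static uint32_t crc32c_shift(uint32_t state, size_t nbytes)
{
    for (size_t i = 0; i < nbytes; i++)
        state = crc32c_step(state, 0);
    return state;
}

/* Reference: a single dependent chain over the whole buffer. */
static uint32_t crc32c_serial(const uint8_t *p, size_t n)
{
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < n; i++)
        crc = crc32c_step(crc, p[i]);
    return ~crc;
}

/* Three independent streams over equal-sized thirds, then a merge. */
static uint32_t crc32c_3way(const uint8_t *p, size_t n)
{
    size_t len = n / 3;
    const uint8_t *a = p, *b = p + len, *c = p + 2 * len;

    /* Only the first stream carries the initial value; the others start at 0. */
    uint32_t ca = 0xFFFFFFFFu, cb = 0, cc = 0;

    /* The three updates per iteration do not depend on each other. */
    for (size_t i = 0; i < len; i++) {
        ca = crc32c_step(ca, a[i]);
        cb = crc32c_step(cb, b[i]);
        cc = crc32c_step(cc, c[i]);
    }

    /* Merge: shift the earlier result past the later stream, then XOR. */
    uint32_t crc = crc32c_shift(ca, len) ^ cb;
    crc = crc32c_shift(crc, len) ^ cc;

    /* Bytes left over when n is not a multiple of 3. */
    for (size_t i = 3 * len; i < n; i++)
        crc = crc32c_step(crc, p[i]);
    return ~crc;
}
```

The merge works because the CRC register update is linear over GF(2): continuing from state `ca` over the bytes of the second stream gives the same result as shifting `ca` past that many zero bytes and XORing in the second stream's CRC computed from a zero start. In this software form the split brings no speedup; the point of the layout is that, with the hardware instruction, the three updates in the loop body can overlap in the pipeline.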

related to: WT-12011 Speed up crc32c on arm64 (Closed)

Assignee: [DO NOT USE] Backlog - Storage Engines Team
Reporter: Chenhao Qu
Votes: 0
Watchers: 6

Created: Jan 17 2024 01:29:51 AM UTC
Updated: Jan 22 2024 12:59:31 AM UTC