Loading...

XML

Word

Printable

JSON

Type: New Feature
Resolution: Fixed
Priority: Major - P3
Fix Version/s: 8.0.0-rc0
Affects Version/s: None
Component/s: None
Labels:
None

Assigned Teams:

Query Optimization
Backwards Compatibility:
Fully Compatible
Sprint:
QO 2024-03-04, QO 2024-03-18, QO 2024-04-01
Confidence Status:
None
Work Order:
3
CAR Domain/s:
None

Aha! Reference:
None
Tracking Level:
None
Risk Status:
None
Exec Notes:
None
Goal Name(s):
None
Goal Link:
None

Deduplication of record ids in index scans can use unbounded memory in the query subsystem. While we consider longer term solutions to this problem, we should consider whether we can gain some immediate benefit from choosing a different data structure for deduplication.

From a related slack thread on using roaring bitmaps:

I read the first half or so of this paper, and it looks encouraging. Briefly, the space of 32-bit integers is divided into 2^16 groups of consecutive integers. If more than 1/4 of the integers in any given group are “present”, then the bitmap will represent membership of the set in that group more efficiently than a hashtable. Otherwise, roaring uses a sorted array of integers to represent that group, which is about as much storage as a hash table. A further optimization involves not having to represent groups at all when they are empty or full (I think?) and a further optimization lets you represent long runs of empty or full using run-length encoding.

So, if I read this right, roaring bitmaps would save us space if, in the cases where record id sets get very large, it’s common for the record ids present in the set to be densely packed as indicated above. Otherwise, the roaring set will be (approximately) no worse than an idealized hash table for RAM consumption.

is related to

SERVER-24375 Deduping in OR, SORT_MERGE, and IXSCAN (multikey case) uses unbounded memory

Backlog

split to

SERVER-87976 Use roaring bitmaps for RecordId deduplication in IndexScan stage

Closed

Assignee:: Alexander Ignatyev
Reporter:: Matt Broadstone
Participants:: Alexander Ignatyev, Githook User, Matt Broadstone
Votes:: 1 Vote for this issue
Watchers:: 22 Start watching this issue

Created:: Feb 27 2024 03:40:44 PM UTC
Updated:: Nov 12 2024 05:03:53 PM UTC
Resolved:: Mar 27 2024 10:22:42 PM UTC
Confidence Status Last Update:: 29/Feb/24 9:47 AM

Details

Description

Attachments

Issue Links

Forms

Activity

People

Dates