Investigate performance of push down of low-selectivity per-path filters


    • Type: Task
    • Resolution: Done
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: None
    • Backwards Compatibility: Fully Compatible

      Zig-zag search naturally favors high-selectivity filters. However, if all filters have low selectivity, the performance of zig-zag search might be worse than iterating over the filtered columns in parallel. We should investigate how bad it becomes and whether we need to take any mitigating actions (e.g., replacing short-distance seeks with iteration, either inside the stage or at the storage level).
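
      To make the trade-off concrete, here is a minimal standalone sketch contrasting the two strategies over sorted record-id streams. The Cursor interface and all names are hypothetical, for illustration only; this is not the actual column_scan or storage API:

      #include <algorithm>
      #include <cstdint>
      #include <iostream>
      #include <limits>
      #include <vector>

      // Hypothetical cursor over one filtered column's sorted RecordIds.
      struct Cursor {
          const std::vector<int64_t>* ids;
          size_t pos = 0;
          bool exhausted() const { return pos >= ids->size(); }
          int64_t current() const { return (*ids)[pos]; }
          void next() { ++pos; }  // cheap: advance one entry
          // Per the experiments described below, roughly 2x the cost of next().
          void seekGE(int64_t target) {
              pos = std::lower_bound(ids->begin() + pos, ids->end(), target) - ids->begin();
          }
      };

      // Zig-zag intersection: repeatedly seek the lagging cursor up to the
      // current leader. Assumes at least two cursors. When every filter lets
      // most records through, the seeks degenerate into many short hops.
      std::vector<int64_t> zigzagIntersect(std::vector<Cursor> cs) {
          std::vector<int64_t> out;
          for (auto& c : cs) if (c.exhausted()) return out;
          size_t i = 0, matched = 1;
          int64_t target = cs[0].current();
          while (true) {
              i = (i + 1) % cs.size();
              cs[i].seekGE(target);
              if (cs[i].exhausted()) return out;
              if (cs[i].current() == target) {
                  if (++matched == cs.size()) {  // all cursors agree: emit a match
                      out.push_back(target);
                      cs[i].next();
                      if (cs[i].exhausted()) return out;
                      target = cs[i].current();
                      matched = 1;
                  }
              } else {  // overshot: restart agreement from the new leader
                  target = cs[i].current();
                  matched = 1;
              }
          }
      }

      // Alternative: iterate the filtered columns in parallel using only next().
      std::vector<int64_t> parallelIntersect(std::vector<Cursor> cs) {
          std::vector<int64_t> out;
          while (true) {
              int64_t mn = std::numeric_limits<int64_t>::max();
              int64_t mx = std::numeric_limits<int64_t>::min();
              for (auto& c : cs) {
                  if (c.exhausted()) return out;
                  mn = std::min(mn, c.current());
                  mx = std::max(mx, c.current());
              }
              if (mn == mx) {  // all cursors on the same RecordId
                  out.push_back(mn);
                  for (auto& c : cs) c.next();
              } else {         // catch the laggards up with nexts, not seeks
                  for (auto& c : cs)
                      while (!c.exhausted() && c.current() < mx) c.next();
              }
          }
      }

      int main() {
          std::vector<int64_t> a{1, 3, 5, 7}, b{3, 4, 5, 8};
          std::vector<Cursor> cols{{&a}, {&b}};
          for (int64_t id : zigzagIntersect(cols)) std::cout << id << ' ';   // 3 5
          std::cout << '\n';
          for (int64_t id : parallelIntersect(cols)) std::cout << id << ' '; // 3 5
          std::cout << '\n';
      }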

      From my experiments on small datasets, the cost of a "seek" is about twice the cost of a "next", which means the filters have to have really low selectivity to benefit from replacing seeks with nexts (on a collection with 10^5 docs, a query with 7 always-true filters and one filter that let through 30% of the records performed about the same with seeks as with nexts). Seeks would likely be relatively more expensive on indexes that don't fit into memory, but we'd still have to use heuristics in the column_scan stage to pick between the two approaches, such as the distance between the record ids, and I doubt such heuristics can be made reliable enough to be useful.
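
      If we did pursue a mitigation, it would presumably look something like the hypothetical helper below (reusing the Cursor sketch above): replace a seek with repeated next() calls whenever the estimated distance to the target is small. The threshold and the gap estimate are exactly the kind of heuristic this paragraph doubts can be made reliable:

      // Hypothetical: with seek ~2x the cost of next, seeking only pays off
      // when the expected number of skipped entries exceeds the threshold.
      // Estimating that distance from RecordId arithmetic assumes densely
      // packed ids, which real collections don't guarantee -- hence the
      // doubt about reliability.
      void advanceTo(Cursor& c, int64_t target, int64_t seekCostInNexts = 2) {
          int64_t estimatedGap = target - c.current();  // crude distance estimate
          if (estimatedGap > seekCostInNexts) {
              c.seekGE(target);
          } else {
              while (!c.exhausted() && c.current() < target) c.next();
          }
      }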

            Assignee:
            Irina Yatsenko (Inactive)
            Reporter:
            Irina Yatsenko (Inactive)
            Votes:
            0
            Watchers:
            2

              Created:
              Updated:
              Resolved: