Type: Improvement
Resolution: Unresolved
Priority: Major - P3
Fix Version/s: None
Affects Version/s: None
Component/s: Querying
Labels: None
Assigned Teams: Query Execution
Confidence Status: None
Work Order: 3
CAR Domain/s: None
Aha! Reference: None
Tracking Level: None
Risk Status: None
Exec Notes: None
Goal Name(s): None
Goal Link: None

The CollatorInterface provides two mechanisms for comparing strings in a collation-aware fashion: compare() and getComparisonKey(). The former takes two strings and returns the result of the comparison. The latter takes one string and returns an array of bytes such that a memcmp against another string's comparison key yields the same result as calling compare() on the two original strings.
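To make the relationship between the two mechanisms concrete, here is a minimal sketch. ToyCollator is a hypothetical stand-in that orders ASCII strings case-insensitively. It is not the real CollatorInterface, which wraps ICU, and its signatures are assumptions chosen for illustration. The helper functions compareKeys() and agrees() are likewise invented for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for a collator: ASCII case-insensitive ordering.
// The real CollatorInterface delegates to ICU; nothing here is its actual API.
class ToyCollator {
public:
    // Direct comparison: returns -1, 0, or 1.
    int compare(const std::string& a, const std::string& b) const {
        const std::size_t n = std::min(a.size(), b.size());
        for (std::size_t i = 0; i < n; ++i) {
            const int ca = fold(a[i]);
            const int cb = fold(b[i]);
            if (ca != cb) return ca < cb ? -1 : 1;
        }
        if (a.size() == b.size()) return 0;
        return a.size() < b.size() ? -1 : 1;
    }

    // Comparison key: bytes whose memcmp order matches compare().
    std::string getComparisonKey(const std::string& s) const {
        std::string key;
        key.reserve(s.size());
        for (char c : s) key.push_back(static_cast<char>(fold(c)));
        return key;
    }

private:
    static int fold(char c) {
        return std::tolower(static_cast<unsigned char>(c));
    }
};

// Byte-wise comparison of two keys, with length as the tiebreaker.
int compareKeys(const std::string& a, const std::string& b) {
    const std::size_t n = std::min(a.size(), b.size());
    const int r = std::memcmp(a.data(), b.data(), n);
    if (r != 0) return r < 0 ? -1 : 1;
    if (a.size() == b.size()) return 0;
    return a.size() < b.size() ? -1 : 1;
}

// True when the key-based result matches the direct comparison.
bool agrees(const ToyCollator& c, const std::string& a, const std::string& b) {
    const int direct = c.compare(a, b);
    const int viaKeys = compareKeys(c.getComparisonKey(a), c.getComparisonKey(b));
    assert(direct >= -1 && direct <= 1);
    return direct == viaKeys;
}
```

Calling agrees(collator, "apple", "Banana") exercises both paths on the same pair. In this toy the two paths cost about the same, whereas with ICU the key generation is the expensive step.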

The ICU documentation (see the "Sortkeys vs Comparison" section) notes that generating an ICU comparison key is many times more expensive than doing a direct comparison. Profiles captured from mongod's integration with ICU confirm this. However, a memcmp of two comparison keys is also cheaper than a call to compare(). This means that comparison keys should be used when you expect to compare a string repeatedly (say, hundreds of times), whereas direct comparison should be used in other cases. For example, we generate and store comparison keys in indexes, since we want to be able to repeatedly make cheap comparisons against these keys.
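The trade-off can be written as a simple break-even calculation. The sketch below assumes a simplified model in which one string is compared n times and only that string's key has to be generated. A key pays off once keyGeneration + n * keyCompare < n * directCompare. The struct, the function name, and any cost values passed in are hypothetical, not measurements from mongod or ICU.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical per-operation costs in arbitrary units (not measured values).
struct CollationCosts {
    double keyGeneration;  // one getComparisonKey() call
    double directCompare;  // one compare() call
    double keyCompare;     // one memcmp between two existing keys
};

// Smallest number of comparisons n for which
//   keyGeneration + n * keyCompare < n * directCompare
// holds, i.e. generating the key once is cheaper overall.
long breakEvenComparisons(const CollationCosts& c) {
    // The model only makes sense if memcmp is cheaper than compare().
    assert(c.directCompare > c.keyCompare);
    const double savingPerComparison = c.directCompare - c.keyCompare;
    return static_cast<long>(std::floor(c.keyGeneration / savingPerComparison)) + 1;
}
```

For example, with made-up costs of 1000 for key generation, 10 for compare(), and 1 for memcmp, the key starts to pay off at 112 comparisons, which is in line with the "hundreds of times" rule of thumb above.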

When performing an in-memory SORT, we currently generate comparison keys for all of the strings to be sorted, and then sort them with memcmp. My experiments show that, especially for small in-memory sorts, it is faster to sort via direct comparison. It would probably take a very large in-memory sort to cross the threshold at which the cost of the repeated calls to CollatorInterface::compare() for the average element exceeds the cost of generating the comparison key for that element.
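The two sorting strategies can be compared side by side with a small sketch that counts operations instead of timing them. As an assumption for illustration, it uses a toy ASCII case-insensitive collation (toyCompare, toyKey) rather than ICU, and std::sort rather than the server's SORT stage. All names here are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstring>
#include <string>
#include <utility>
#include <vector>

// Toy collation (assumption): ASCII case-insensitive, not ICU.
static unsigned char fold(char c) {
    return static_cast<unsigned char>(std::tolower(static_cast<unsigned char>(c)));
}

// Stand-in for a direct collation-aware comparison.
int toyCompare(const std::string& a, const std::string& b) {
    const std::size_t n = std::min(a.size(), b.size());
    for (std::size_t i = 0; i < n; ++i) {
        if (fold(a[i]) != fold(b[i])) return fold(a[i]) < fold(b[i]) ? -1 : 1;
    }
    return a.size() == b.size() ? 0 : (a.size() < b.size() ? -1 : 1);
}

// Stand-in for comparison key generation.
std::string toyKey(const std::string& s) {
    std::string key(s.size(), '\0');
    std::transform(s.begin(), s.end(), key.begin(),
                   [](char c) { return static_cast<char>(fold(c)); });
    return key;
}

// memcmp-based ordering of two keys, shorter key first on a shared prefix.
bool keyLess(const std::string& a, const std::string& b) {
    const int r = std::memcmp(a.data(), b.data(), std::min(a.size(), b.size()));
    return r != 0 ? r < 0 : a.size() < b.size();
}

struct SortStats {
    std::size_t directCompares = 0;
    std::size_t keysGenerated = 0;
    std::size_t keyCompares = 0;
};

// Strategy proposed in the ticket: no keys, one direct comparison per step.
SortStats sortByDirectCompare(std::vector<std::string>& values) {
    SortStats stats;
    std::sort(values.begin(), values.end(),
              [&stats](const std::string& a, const std::string& b) {
                  ++stats.directCompares;
                  return toyCompare(a, b) < 0;
              });
    return stats;
}

// Current strategy: one key per element up front, then memcmp only.
SortStats sortByComparisonKey(std::vector<std::string>& values) {
    using Keyed = std::pair<std::string, std::string>;  // (key, original)
    SortStats stats;
    std::vector<Keyed> keyed;
    keyed.reserve(values.size());
    for (std::string& v : values) {
        std::string key = toyKey(v);
        ++stats.keysGenerated;
        keyed.emplace_back(std::move(key), std::move(v));
    }
    std::sort(keyed.begin(), keyed.end(), [&stats](const Keyed& a, const Keyed& b) {
        ++stats.keyCompares;
        return keyLess(a.first, b.first);
    });
    assert(keyed.size() == values.size());
    for (std::size_t i = 0; i < keyed.size(); ++i) values[i] = std::move(keyed[i].second);
    return stats;
}
```

Both functions produce the same ordering for inputs that are distinct under the collation. The returned counts show where each strategy spends its work: the key-based sort always pays for one key per element, while the direct sort pays only for its comparisons, which is the cheaper side for small inputs when key generation is expensive.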

related to: SERVER-26129 Investigate perf overhead of collation (Closed)

Assignee: [DO NOT USE] Backlog - Query Execution
Reporter: David Storch
Participants: [DO NOT USE] Backlog - Query Execution, David Storch
Votes: 0
Watchers: 6

Created: Oct 13 2016 06:04:27 PM UTC
Updated: Dec 06 2022 04:14:09 AM UTC
