[SERVER-32297] Aggregations that merge on mongos do not respect the collation Created: 13/Dec/17 Updated: 29/Jan/18 Resolved: 28/Dec/17 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Aggregation Framework |
| Affects Version/s: | 3.6.0 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Kyle Suarez | Assignee: | Kyle Suarez |
| Resolution: | Duplicate | Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: |
|
| Backwards Compatibility: | Fully Compatible |
| Operating System: | ALL |
| Backport Requested: | v3.6 |
| Steps To Reproduce: | First, apply this patch to run with multiple shards:
and then run
|
| Sprint: | Query 2017-12-18, Query 2018-01-01 |
| Participants: | |
| Description |
|
In |
| Comments |
| Comment by Githook User [ 02/Jan/18 ] |
|
Author: {'name': 'Kyle Suarez', 'username': 'ksuarz', 'email': 'kyle.suarez@mongodb.com'} Message: (cherry picked from commit 79352e71b697cb8c126510095bba7fd816128701) |
| Comment by David Storch [ 28/Dec/17 ] |
|
The fix for this bug was committed under |
| Comment by Kyle Suarez [ 20/Dec/17 ] |
|
Alright, the problem is not in the AsyncResultsMerger but rather in DocumentSourceSort. If the sort requires a merge, DocumentSourceSort is responsible for serializing the sort key as a metadata field. Internally, it extracts sort keys in one of two ways: a "fast path" for when there are no arrays along the path, and a "slow path" that uses the SortKeyGenerator. Unfortunately, only the SortKeyGenerator is collation-aware: if we take the fast path, we generate sort keys whose strings have not been transformed into their ICU comparison keys, so the merge compares raw strings and ignores the collation. I'm going to run microbenchmarks on a patch that always uses the slow SortKeyGenerator path when the collation is non-simple. If there's a noticeable performance regression, we can try something else. |
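
To illustrate the failure mode described in the comment above, here is a minimal Python sketch. It is not the server's C++ code. It assumes a case-insensitive collation and uses `str.casefold()` as a stand-in for an ICU comparison key. The names `fast_path_key`, `slow_path_key`, `extract_sort_key`, and `merge_shards` are hypothetical and exist only for this example. Each "shard" returns documents already sorted under the collation, and the merger compares only the sort key attached to each document.

```python
import heapq

def comparison_key(s):
    # Assumption: stand-in for an ICU comparison key under a
    # case-insensitive collation. Real ICU keys are opaque byte strings.
    return s.casefold()

def fast_path_key(doc):
    # Hypothetical "fast path": returns the raw string, ignoring collation.
    return doc["name"]

def slow_path_key(doc):
    # Hypothetical "slow path": returns the collation-transformed key.
    return comparison_key(doc["name"])

def extract_sort_key(doc, collation_is_simple):
    # Sketch of the proposed patch: take the slow path whenever the
    # collation is non-simple (array handling is omitted here).
    return fast_path_key(doc) if collation_is_simple else slow_path_key(doc)

def merge_shards(shards, key_fn):
    # The merger sees only the attached sort keys, never the collation.
    return [d["name"] for d in heapq.merge(*shards, key=key_fn)]

# Each shard's results are correctly sorted under the collation.
shard_a = [{"name": "apple"}, {"name": "Cherry"}]
shard_b = [{"name": "Banana"}, {"name": "date"}]

wrong = merge_shards([shard_a, shard_b], fast_path_key)
right = merge_shards(
    [shard_a, shard_b],
    lambda d: extract_sort_key(d, collation_is_simple=False),
)

print(wrong)  # ['Banana', 'apple', 'Cherry', 'date']
print(right)  # ['apple', 'Banana', 'Cherry', 'date']
```

With raw-string keys, uppercase letters compare lower than lowercase ones, so the merged stream comes out in the wrong order even though each shard sorted its own results correctly.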