[SERVER-51120] Find queries with SORT_MERGE incorrectly sort the results when the collation is specified Created: 24/Sep/20 Updated: 29/Oct/23 Resolved: 02/Oct/20 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Querying |
| Affects Version/s: | None |
| Fix Version/s: | 4.9.0, 4.4.2, 4.2.11, 4.0.21, 3.6.21 |
| Type: | Bug | Priority: | Critical - P2 |
| Reporter: | Mindaugas Malinauskas | Assignee: | Mindaugas Malinauskas |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | KP42, KP44 | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Backwards Compatibility: | Fully Compatible | ||||||||||||
| Operating System: | ALL | ||||||||||||
| Backport Requested: | v4.4, v4.2, v4.0, v3.6 |
| Sprint: | Query 2020-10-05, Query 2020-10-19 | ||||||||||||
| Description |
|
Issue Status as of Sep 30, 2020

ISSUE DESCRIPTION AND IMPACT
Certain queries involving a collated index and a sort on the collated fields can return incorrectly sorted results. This impacts $or and $in operations where each clause is indexed in a way that does not require a blocking sort. The set of results is accurate, but the documents are not ordered correctly. The bug is that the SORT_MERGE query plan stage incorrectly interprets the collated index keys.

DIAGNOSIS AND AFFECTED VERSIONS
All versions are affected. You can confirm whether a collation query is affected by using explain(). If a SORT_MERGE stage is in the query plan, and at least one child of the SORT_MERGE stage is an IXSCAN without a FETCH, then the query can return incorrectly sorted results.

REMEDIATION AND WORKAROUNDS
This issue will be corrected in the 4.4.2, 4.2.11, 4.0.21, and 3.6.21 versions of MongoDB. Until these versions are released, the main workaround is to sort results on the client side.

Original description
Given a compound index with a non-simple collation, a multi-point query on the prefix fields combined with a sort on a prefix of the suffix fields produces incorrectly sorted results when the query specifies a collation that matches the collation of the index and a sort merge (SORT_MERGE) is selected in the query execution plan. For example, if a collection has a compound index with a non-simple collation on fields a, b, c, d, then a find command that specifies a matching collation, has a multi-point query on fields a and b, and sorts on field c will not produce correctly sorted results. |
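The explain() check described under DIAGNOSIS can be automated. The following is a minimal sketch, not an official tool: it assumes the explain output has already been loaded as a Python dict in the unsharded shape (`queryPlanner.winningPlan`, with child stages under `inputStage` or `inputStages`). The function names `iter_stages` and `may_be_affected` are hypothetical, and the sample plans are hand-written for illustration rather than captured from a server.

```python
from typing import Any, Dict, Iterator


def iter_stages(stage: Dict[str, Any]) -> Iterator[Dict[str, Any]]:
    """Yield a plan stage and all of its descendants, depth first."""
    yield stage
    children = list(stage.get("inputStages", []))
    if "inputStage" in stage:
        children.append(stage["inputStage"])
    for child in children:
        yield from iter_stages(child)


def may_be_affected(explain_output: Dict[str, Any]) -> bool:
    """Return True if some SORT_MERGE stage has a direct IXSCAN child.

    A direct IXSCAN child means the index scan feeds SORT_MERGE without
    a FETCH in between, which is the condition named in the ticket.
    Assumption: unsharded explain shape; sharded output is not handled.
    """
    plan = explain_output.get("queryPlanner", {}).get("winningPlan", {})
    for stage in iter_stages(plan):
        if stage.get("stage") != "SORT_MERGE":
            continue
        children = stage.get("inputStages", [])
        if any(child.get("stage") == "IXSCAN" for child in children):
            return True
    return False


# Hand-written sample plans (illustrative only).
risky_plan = {
    "queryPlanner": {
        "winningPlan": {
            "stage": "FETCH",
            "inputStage": {
                "stage": "SORT_MERGE",
                "inputStages": [{"stage": "IXSCAN"}, {"stage": "IXSCAN"}],
            },
        }
    }
}
safe_plan = {
    "queryPlanner": {
        "winningPlan": {
            "stage": "SORT_MERGE",
            "inputStages": [
                {"stage": "FETCH", "inputStage": {"stage": "IXSCAN"}},
                {"stage": "FETCH", "inputStage": {"stage": "IXSCAN"}},
            ],
        }
    }
}
```

A result of True only indicates that the plan matches the shape described above; the ticket says such a query "can" return incorrectly sorted results, so the actual ordering should still be verified.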
| Comments |
| Comment by Mindaugas Malinauskas [ 30/Oct/20 ] |
|
Replaced "merge sort" with "sort merge". |
| Comment by Mindaugas Malinauskas [ 05/Oct/20 ] |
|
Corrected fix version. |
| Comment by Githook User [ 05/Oct/20 ] |
|
Author: Mindaugas Malinauskas (mindaugas.malinauskas@mongodb.com)
Message: (cherry picked from commit eee7fb8f2c6da144e9d4c3df7887a5ec167f3a6f) |
| Comment by Githook User [ 04/Oct/20 ] |
|
Author: Mindaugas Malinauskas (mindaugas.malinauskas@mongodb.com)
Message: (cherry picked from commit eee7fb8f2c6da144e9d4c3df7887a5ec167f3a6f) |
| Comment by Githook User [ 04/Oct/20 ] |
|
Author: Mindaugas Malinauskas (mindaugas.malinauskas@mongodb.com)
Message: (cherry picked from commit eee7fb8f2c6da144e9d4c3df7887a5ec167f3a6f) |
| Comment by Githook User [ 04/Oct/20 ] |
|
Author: Mindaugas Malinauskas (mindaugas.malinauskas@mongodb.com)
Message: (cherry picked from commit eee7fb8f2c6da144e9d4c3df7887a5ec167f3a6f) |
| Comment by Githook User [ 02/Oct/20 ] |
|
Author: Mindaugas Malinauskas (mindaugas.malinauskas@mongodb.com)
Message: |
| Comment by Eric Sedor [ 30/Sep/20 ] |
|
The following enumerates the exact conditions under which find() exhibits the issue:
The root cause of the issue is that, under the above conditions, the IXSCAN stages return index field values that have already been converted to collation keys, and those values are sorted correctly. The SORT_MERGE stage then converts these collation keys a second time, resulting in an incorrect sort. |
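The double-conversion effect can be illustrated outside the server. The sketch below is a toy model, not MongoDB code: `toy_collation_key` is a hypothetical stand-in for a real collation (it maps a to z, b to y, and so on, so the "collated" order is reverse alphabetical), and `heapq.merge` stands in for the SORT_MERGE stage. Each input list plays the role of one IXSCAN branch that already emits collation keys in sorted order.

```python
import heapq
import string
from typing import List

# Toy "collation": a<->z, b<->y, ... (assumption, purely illustrative).
_TABLE = str.maketrans(string.ascii_lowercase, string.ascii_lowercase[::-1])


def toy_collation_key(value: str) -> str:
    """Map a string to its toy collation key."""
    return value.translate(_TABLE)


def merge_keys_once(branches: List[List[str]]) -> List[str]:
    """Correct merge: the inputs are already keys, so compare them as-is."""
    return list(heapq.merge(*branches))


def merge_keys_twice(branches: List[List[str]]) -> List[str]:
    """Buggy merge: converts the already-converted keys a second time."""
    return list(heapq.merge(*branches, key=toy_collation_key))


# Two branches, each holding collation keys in correctly sorted order.
# Original values: branch 1 = c, a; branch 2 = d, b.
branch_1 = [toy_collation_key(v) for v in ["c", "a"]]  # ["x", "z"]
branch_2 = [toy_collation_key(v) for v in ["d", "b"]]  # ["w", "y"]

correct = merge_keys_once([branch_1, branch_2])
buggy = merge_keys_twice([branch_1, branch_2])

print(correct)  # ['w', 'x', 'y', 'z']
print(buggy)    # ['x', 'z', 'w', 'y']
```

Both merges return the same set of keys, which matches the ticket's observation that the result set is accurate while only the order is wrong. In this toy model the second conversion makes each branch look unsorted to the merge, so the interleaving is no longer valid.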