[SERVER-81390] HashAggStage fails to respect the collation when spilling to disk Created: 22/Sep/23 Updated: 10/Nov/23 Resolved: 27/Oct/23 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | 7.0.2, 6.0.11, 7.1.0 |
| Fix Version/s: | 7.1.1, 7.2.0-rc0, 6.0.12, 7.0.4 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Justin Seyster | Assignee: | Foteini Alvanaki |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | query-director-triage |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: | |
| Assigned Teams: | Query Execution |
| Backwards Compatibility: | Fully Compatible |
| Operating System: | ALL |
| Backport Requested: | v7.1, v7.0, v6.0 |
| Sprint: | QE 2023-10-16, QE 2023-10-30 |
| Participants: | |
| Linked BF Score: | 20 |
| Description |
|
The HashAgg spill algorithm converts each group key to a KeyString but does not provide the conversion operation with a function to normalize the key according to the collator. As a result, keys that would be considered equal in the in-memory hash table are considered distinct in the spilled data. This updated jstest exercises the problem. The test fails normally but succeeds if the pipeline is forced to run in the Classic engine.
|
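The failure mode described above can be illustrated with a small, self-contained sketch. This is not the server's code: `comparison_key` is a hypothetical stand-in for a collator (here, a simple case-insensitive comparison), `encode_key` is a hypothetical stand-in for the KeyString conversion, and the "disk" is just a Python list. The sketch assumes the in-memory hash table compares keys through the collator, while the spill path encodes the raw key unless it is handed a normalization function.

```python
from collections import defaultdict


def comparison_key(value: str) -> str:
    """Hypothetical collator stand-in: case-insensitive comparison key."""
    return value.casefold()


def encode_key(value: str, normalize=None) -> bytes:
    """Hypothetical KeyString stand-in: encode a group key for spilling.

    If no normalize function is supplied, the raw key is encoded as-is.
    """
    if normalize is not None:
        value = normalize(value)
    return value.encode("utf-8")


def group_counts(values, max_groups_in_memory, collation_aware_spill):
    """Count occurrences per group, spilling when the table grows too large."""
    table = {}    # comparison key -> [first-seen original key, count]
    spilled = []  # (encoded key, count) records standing in for disk
    normalize = comparison_key if collation_aware_spill else None

    def spill():
        # Write every in-memory group to "disk" and empty the table.
        for original, count in table.values():
            spilled.append((encode_key(original, normalize), count))
        table.clear()

    for value in values:
        # In-memory lookup is collation-aware: "abc" and "ABC" share an entry.
        entry = table.setdefault(comparison_key(value), [value, 0])
        entry[1] += 1
        if len(table) > max_groups_in_memory:
            spill()
    spill()

    # Merge spilled records by their encoded key.
    merged = defaultdict(int)
    for encoded, count in spilled:
        merged[encoded] += count
    return dict(merged)
```

With the input `["abc", "ABC", "xyz", "Abc", "XYZ"]` and a limit of one in-memory group, calling the function with `collation_aware_spill=False` yields four groups, because `abc` and `Abc` (and `xyz` and `XYZ`) are encoded differently once spilled. With `collation_aware_spill=True`, or with a limit large enough that nothing spills before the end, the result is the expected two groups.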
| Comments |
| Comment by Githook User [ 27/Oct/23 ] |
|
Author: {'name': 'Rui Liu', 'email': 'lriuui0x0@gmail.com', 'username': 'lriuui0x0'} Message:
(cherry picked from commit a19074b842b752ee0a61810e0b8f6d79c5aa80c1)
(cherry picked from commit d0811e844e8566dc276fcd73fceabec71c0e2717) |
| Comment by Githook User [ 27/Oct/23 ] |
|
Author: {'name': 'Foteini Alvanaki', 'email': 'foteini.alvanaki@mongodb.com', 'username': ''} Message: |
| Comment by Githook User [ 19/Oct/23 ] |
|
Author: {'name': 'Foteini Alvanaki', 'email': 'foteini.alvanaki@mongodb.com', 'username': ''} Message: |
| Comment by Foteini Alvanaki [ 12/Oct/23 ] |
|
I confirmed that v6.0, v7.0, and v7.1 are all affected by this bug. |
| Comment by Justin Seyster [ 26/Sep/23 ] |
|
ana.meza@mongodb.com, yes, there is a potential for incorrect output for any pipeline with a collation and a $group operation that both executes in SBE and needs to "spill" to disk. I haven't verified, but I believe the bug goes back to v5.2. |