[SERVER-59505] Time-series query on mixed, nested measurements can miss some events Created: 23/Aug/21 Updated: 29/Oct/23 Resolved: 02/Nov/21 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | 5.2.0, 5.0.4, 5.1.0-rc3 |
| Type: | Bug | Priority: | Blocker - P1 |
| Reporter: | David Percy | Assignee: | Sam Mercier |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | query-director-triage |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: | |
| Backwards Compatibility: | Fully Compatible |
| Operating System: | ALL |
| Backport Requested: | v5.1, v5.0 |
| Sprint: | QO 2021-09-06, QO 2021-10-04, QO 2021-10-18, QO 2021-11-01, QO 2021-11-15 |
| Participants: | |
| Linked BF Score: | 120 |
| Description |
|
Normally when a measurement field 'x' contains an array, the 'control.min.x' field on the bucket contains the min for each position in the array.
But if 'x' also contains non-arrays, then 'control.min.x' or 'control.max.x' will be a non-array:
The predicate pushdowns don't account for this, so "multikey" queries can incorrectly exclude this bucket:
A similar thing can happen if 'x' is a mixture of objects and non-objects:
This happens because although 'control.max.x' is the max of 'x', 'control.max.x.y' is not the max of 'x.y'. ('control.max.x.y' is 'missing', but 'missing' < ISODate.) |
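
The array case can be reproduced outside the server with a small model. The following Python sketch is an illustration only, not server code: it assumes a two-type ordering in which numbers sort before arrays (loosely following BSON comparison order), and the names `sort_key`, `matches_lt`, and the sample `events` are hypothetical.

```python
def sort_key(value):
    """Order numbers before arrays, loosely mimicking BSON type ordering."""
    if isinstance(value, list):
        return (1, value)
    return (0, [value])


def matches_lt(value, bound):
    """Multikey-style `$lt`: an array matches if any element is below the bound."""
    if isinstance(value, list):
        return any(element < bound for element in value)
    return value < bound


# Hypothetical measurements of 'x' stored in one bucket: a scalar and an array.
events = [5, [1, 20]]

# With mixed types, the bucket-level minimum is the scalar, not a per-position array.
control_min_x = min(events, key=sort_key)

# Naive pushdown of `x < 3` to `control.min.x < 3`.
bucket_kept = matches_lt(control_min_x, 3)

# Ground truth: does any event in the bucket satisfy `x < 3`?
event_matches = any(matches_lt(event, 3) for event in events)

print(control_min_x, bucket_kept, event_matches)  # prints: 5 False True
```

In this model the bucket is discarded even though the event `[1, 20]` satisfies `x < 3` through its first element, which is the kind of missed event the description reports.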
| Comments |
| Comment by Githook User [ 02/Nov/21 ] |
|
Author: {'name': 'samontea', 'email': 'merciers.merciers@gmail.com', 'username': 'samontea'} Message: |
| Comment by Githook User [ 02/Nov/21 ] |
|
Author: {'name': 'samontea', 'email': 'merciers.merciers@gmail.com', 'username': 'samontea'} Message: |
| Comment by Githook User [ 02/Nov/21 ] |
|
Author: {'name': 'samontea', 'email': 'merciers.merciers@gmail.com', 'username': 'samontea'} Message: |
| Comment by Githook User [ 29/Oct/21 ] |
|
Author: {'name': 'samontea', 'email': 'merciers.merciers@gmail.com', 'username': 'samontea'} Message: |
| Comment by Githook User [ 27/Oct/21 ] |
|
Author: {'name': 'samontea', 'email': 'merciers.merciers@gmail.com', 'username': 'samontea'} Message: |
| Comment by David Percy [ 23/Aug/21 ] |
|
Ideally the bucket format would include more information in the 'control' fields, but given the current format we need to fix the query rewrites. Instead of converting 'x < 10' to 'control.min.x < 10', we can generate something like 'control.min.x < 10 or (control.min.x < any array value < control.max.x)'. But we also need to consider:
|
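
One possible reading of the widened rewrite suggested in the comment above can be sketched in Python. This is a model under stated assumptions, not the fix that was committed: it uses a simplified two-type ordering (numbers before arrays), and `type_rank`, `scalar_lt`, and `bucket_may_match_lt` are hypothetical names. The idea is to keep the bucket if the original pushdown passes, or if an array value could sort between `control.min.x` and `control.max.x`.

```python
ARRAY_RANK = 1  # simplified type rank: numbers (0) sort before arrays (1)


def type_rank(value):
    """Return the simplified type rank of a value."""
    return ARRAY_RANK if isinstance(value, list) else 0


def scalar_lt(value, bound):
    """Plain comparison that only succeeds for non-array values."""
    return not isinstance(value, list) and value < bound


def bucket_may_match_lt(control_min, control_max, bound):
    """Conservative bucket-level test for the event predicate `x < bound`."""
    # Original pushdown: control.min.x < bound.
    if scalar_lt(control_min, bound):
        return True
    # Widening: if an array can sort between min and max, the bucket may hold
    # an array with an element below the bound, so it must not be discarded.
    return type_rank(control_min) <= ARRAY_RANK <= type_rank(control_max)


# Mixed bucket (events 5 and [1, 20]): kept, because an array may be present.
mixed_kept = bucket_may_match_lt(5, [1, 20], 3)

# Scalar-only bucket (events 5 and 7): still safely discarded for `x < 3`.
scalar_kept = bucket_may_match_lt(5, 7, 3)
```

The widened test trades selectivity for correctness: any bucket whose min/max range can contain an array is always unpacked, while scalar-only buckets keep the original pruning behavior.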