Type: Improvement
Resolution: Unresolved
Priority: Major - P3
Fix Version/s: None
Affects Version/s: None
Component/s: None
Labels:
- qi-quick-win-candidate
- qi-timeseries

Assigned Teams:

Query Integration
Confidence Status:
None
Work Order:
3
CAR Domain/s:
None

Aha! Reference:
None
Tracking Level:
None
Risk Status:
None
Exec Notes:
None
Goal Name(s):
None
Goal Link:
None

Came from HELP-71156. Full repro in comments. $unionWith correctly uses an index when run against a standard (non-time-series) collection:

db.waterMeasurements.createIndex(
    { "metadata.systemId": 1, "metadata.dataType": 1, "metadata.index": 1, "timestamp": 1 });

// pipeline
{
    "$unionWith": {
        "coll": "waterMeasurements",
        "pipeline": [
            {
                "$match": {
                    "metadata.systemId": 5578,
                    "metadata.dataType": "DHW_TANK_TEMPERATURE",
...

// explain("executionStats") for the normal collection.
{
    "$unionWith": {
        "coll": "waterMeasurements",
        "pipeline": [
            {
                "$cursor": {
                    "queryPlanner": {
                        "namespace": "test.waterMeasurements",
                        ...
                        "winningPlan": {
                            ...
                                "inputStage": {
                                    "stage": "FETCH",
                                    "inputStage": {
                                        "stage": "IXSCAN",
                                        "keyPattern": {
                                            "metadata.systemId": 1,
                                            "metadata.dataType": 1,
                                            "metadata.index": 1,
                                            "timestamp": 1
...

However, the same $unionWith query against a time-series collection performs a COLLSCAN:

db.waterMeasurementsTS.createIndex(
    { "metadata.systemId": 1, "metadata.dataType": 1, "metadata.index": 1, "timestamp": 1 });

// explain("executionStats") for a time-series collection.
{
    "$unionWith": {
        "coll": "system.buckets.waterMeasurementsTS",
        "pipeline": [
            {
                "$cursor": {
                    "queryPlanner": {
                        "namespace": "test.system.buckets.waterMeasurementsTS",
                        ...
                        "winningPlan": {
                                ...
                                "inputStage": {
                                    "stage": "COLLSCAN",

The COLLSCAN should not be necessary, because the index we created is available to the subpipeline. Alternatively, the automatically created compound index on the metaField and timeField could have been used to skip ineligible buckets.

We should change this to use an existing index when inside a $unionWith subpipeline to skip unnecessary processing.
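
To make the comparison above easy to repeat, here is a minimal sketch of a helper that lists the plan stages reported inside each $unionWith stage of an aggregate explain. It is an illustration, not part of the original repro: `collectStages`, `unionWithStages`, and `sampleExplain` are hypothetical names, the sample document is a trimmed, made-up stand-in for the excerpts above, and the sketch assumes the explain output exposes a top-level `stages` array.

```javascript
// Recursively collect every "stage" name found anywhere in an explain fragment.
function collectStages(node, found = []) {
  if (Array.isArray(node)) {
    for (const item of node) collectStages(item, found);
  } else if (node !== null && typeof node === "object") {
    if (typeof node.stage === "string") found.push(node.stage);
    for (const value of Object.values(node)) collectStages(value, found);
  }
  return found;
}

// For each $unionWith stage, report the target collection and its plan stages.
// Assumption: the aggregate explain has a top-level "stages" array.
function unionWithStages(explainOutput) {
  const stages = explainOutput.stages || [];
  return stages
    .filter((s) => s.$unionWith !== undefined)
    .map((s) => ({
      coll: s.$unionWith.coll,
      stages: collectStages(s.$unionWith.pipeline),
    }));
}

// Hypothetical, heavily trimmed explain output (not copied from the ticket).
const sampleExplain = {
  stages: [
    { $cursor: { queryPlanner: { winningPlan: { stage: "IXSCAN" } } } },
    {
      $unionWith: {
        coll: "system.buckets.waterMeasurementsTS",
        pipeline: [
          {
            $cursor: {
              queryPlanner: {
                winningPlan: { inputStage: { stage: "COLLSCAN" } },
              },
            },
          },
        ],
      },
    },
  ],
};

const report = unionWithStages(sampleExplain);
console.log(JSON.stringify(report));
// prints [{"coll":"system.buckets.waterMeasurementsTS","stages":["COLLSCAN"]}]
```

In mongosh, the same helper could be given the result of running the pipeline with explain("executionStats"); a "COLLSCAN" entry for the system.buckets namespace would indicate the behavior described here.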

Assignee: Unassigned
Reporter: Erin Liang
Participants: Erin Liang
Votes: 0
Watchers: 4

Created: Feb 20 2025 01:50:07 PM UTC
Updated: Mar 27 2025 06:36:59 PM UTC
