[SERVER-62459] Timeseries collection with collation doesn't set bounds on Collscan when looking up by time field Created: 10/Jan/22  Updated: 27/Oct/23  Resolved: 12/Jan/22

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Daniel Gomez Ferro Assignee: Ian Boros
Resolution: Works as Designed Votes: 0
Labels: time-series
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Operating System: ALL
Steps To Reproduce:

db.weather.drop();      
db.createCollection(
    "weather",
    {
       timeseries: {
          timeField: "timestamp",
          metaField: "metadata",
          granularity: "seconds"
       }, 
       collation: {locale: "en", strength: 2}
    }
);
db.weather.find({
   "timestamp": ISODate("2021-05-18T01:00:00.000Z")
}).explain()

Sprint: Execution Team 2022-01-24
Participants:

 Description   

While working on SERVER-61260 I’ve noticed lookups by the time field on timeseries collections that have a collation don’t set bounds for the COLLSCAN, example:

db.weather.drop();      
db.createCollection(
    "weather",
    {
       timeseries: {
          timeField: "timestamp",
          metaField: "metadata",
          granularity: "seconds"
       }, 
       collation: {locale: "en", strength: 2}}
    }
);
db.weather.find({
   "timestamp": ISODate("2021-05-18T01:00:00.000Z")
}).explain()
{
        "explainVersion" : "2",
        "queryPlanner" : {
                "namespace" : "test.weather",
                "indexFilterSet" : false,
                "parsedQuery" : {
                        "timestamp" : {
                                "$eq" : ISODate("2021-05-18T01:00:00Z")
                        }
                },
                "queryHash" : "911AD713",
                "planCacheKey" : "911AD713",
                "maxIndexedOrSolutionsReached" : false,
                "maxIndexedAndSolutionsReached" : false,
                "maxScansToExplodeReached" : false,
                "winningPlan" : {
                        "queryPlan" : {
                                "stage" : "COLLSCAN",
                                "planNodeId" : 1,
                                "filter" : {
                                        "timestamp" : {
                                                "$eq" : ISODate("2021-05-18T01:00:00Z")
                                        }
                                },
                                "direction" : "forward"
                        },
                        "slotBasedPlan" : {
                                "slots" : "$$RESULT=s5 $$RID=s6 env: { s3 = Timestamp(1641804498, 4) (CLUSTER_TIME), s1 = TimeZoneDatabase(Asia/Kuching...Australia/Perth) (timeZoneDB), s2 = Nothing (SEARCH_META), s4 = 1641804499092 (NOW) }",
                                "stages" : "[1] filter {fillEmpty (s9, false)} \n[1] traverse s9 s8 s7 [s5, s6] {s9 || s8} {s9} \nfrom \n    [1] project [s7 = getField (s5, \"timestamp\")] \n    [1] scan s5 s6 none none none none [] @\"cf1687e2-56d2-4971-b19e-ec89d2391f73\" true false \nin \n    [1] project [s8 = fillEmpty (s7 == 1621299600000, false)] \n    [1] limit 1 \n    [1] coscan \n"
                        }
                },
                "rejectedPlans" : [ ]
        },
        "command" : {
                "find" : "weather",
                "filter" : {
                        "timestamp" : ISODate("2021-05-18T01:00:00Z")
                },
                "$db" : "test"
        },
        "serverInfo" : {
                "host" : "ip-10-122-10-220",
                "port" : 27017,
                "version" : "5.3.0-alpha-48-g06692bd",
                "gitVersion" : "06692bdbfa912848a884c6440caab9cfb693e93c"
        },
        "serverParameters" : {
                "internalQueryFacetBufferSizeBytes" : 104857600,
                "internalQueryFacetMaxOutputDocSizeBytes" : 104857600,
                "internalLookupStageIntermediateDocumentMaxSizeBytes" : 104857600,
                "internalDocumentSourceGroupMaxMemoryBytes" : 104857600,
                "internalQueryMaxBlockingSortMemoryUsageBytes" : 104857600,
                "internalQueryProhibitBlockingMergeOnMongoS" : 0,
                "internalQueryMaxAddToSetBytes" : 104857600,
                "internalDocumentSourceSetWindowFieldsMaxMemoryBytes" : 104857600
        },
        "ok" : 1,
        "$clusterTime" : {
                "clusterTime" : Timestamp(1641804498, 4),
                "signature" : {
                        "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
                        "keyId" : NumberLong(0)
                }
        },
        "operationTime" : Timestamp(1641804498, 4)
}

If there's no collation the plan sets min/max bounds for the Collscan:

{
        "explainVersion" : "1",
        "stages" : [
                {
                        "$cursor" : {
                                "queryPlanner" : {
                                        "namespace" : "test.system.buckets.weather",
                                        "indexFilterSet" : false,
                                        "parsedQuery" : {
                                                "$and" : [
                                                        {
                                                                "_id" : {
                                                                        "$lte" : ObjectId("60a31190ffffffffffffffff")
                                                                }
                                                        },
                                                        {
                                                                "_id" : {
                                                                        "$gte" : ObjectId("60a303800000000000000000")
                                                                }
                                                        },
                                                        {
                                                                "control.max.timestamp" : {
                                                                        "$_internalExprGte" : ISODate("2021-05-18T01:00:00Z")
                                                                }
                                                        },
                                                        {
                                                                "control.min.timestamp" : {
                                                                        "$_internalExprGte" : ISODate("2021-05-18T00:00:00Z")
                                                                }
                                                        },
                                                        {
                                                                "control.max.timestamp" : {
                                                                        "$_internalExprLte" : ISODate("2021-05-18T02:00:00Z")
                                                                }
                                                        },
                                                        {
                                                                "control.min.timestamp" : {
                                                                        "$_internalExprLte" : ISODate("2021-05-18T01:00:00Z")
                                                                }
                                                        }
                                                ]
                                        },
                                        "queryHash" : "D69A8284",
                                        "planCacheKey" : "D69A8284",
                                        "maxIndexedOrSolutionsReached" : false,
                                        "maxIndexedAndSolutionsReached" : false,
                                        "maxScansToExplodeReached" : false,
                                        "winningPlan" : {
                                                "stage" : "COLLSCAN",
                                                "filter" : {
                                                        "$and" : [
                                                                {
                                                                        "_id" : {
                                                                                "$lte" : ObjectId("60a31190ffffffffffffffff")
                                                                        }
                                                                },
                                                                {
                                                                        "_id" : {
                                                                                "$gte" : ObjectId("60a303800000000000000000")
                                                                        }
                                                                },
                                                                {
                                                                        "control.max.timestamp" : {
                                                                                "$_internalExprGte" : ISODate("2021-05-18T01:00:00Z")
                                                                        }
                                                                },
                                                                {
                                                                        "control.min.timestamp" : {
                                                                                "$_internalExprGte" : ISODate("2021-05-18T00:00:00Z")
                                                                        }
                                                                },
                                                                {
                                                                        "control.max.timestamp" : {
                                                                                "$_internalExprLte" : ISODate("2021-05-18T02:00:00Z")
                                                                        }
                                                                },
                                                                {
                                                                        "control.min.timestamp" : {
                                                                                "$_internalExprLte" : ISODate("2021-05-18T01:00:00Z")
                                                                        }
                                                                }
                                                        ]
                                                },
                                                "direction" : "forward",
                                                "minRecord" : ObjectId("60a303800000000000000000"),
                                                "maxRecord" : ObjectId("60a31190ffffffffffffffff")
                                        },
                                        "rejectedPlans" : [ ]
                                }
                        }
                },
                {
                        "$_internalUnpackBucket" : {
                                "exclude" : [ ],
                                "timeField" : "timestamp",
                                "metaField" : "metadata",
                                "bucketMaxSpanSeconds" : 3600,
                                "assumeNoMixedSchemaData" : true
                        }
                },
                {
                        "$match" : {
                                "timestamp" : {
                                        "$eq" : ISODate("2021-05-18T01:00:00Z")
                                }
                        }
                }
        ],
        "serverInfo" : {
                "host" : "ip-10-122-10-220",
                "port" : 27017,
                "version" : "5.3.0-alpha-48-g06692bd",
                "gitVersion" : "06692bdbfa912848a884c6440caab9cfb693e93c"
        },
        "serverParameters" : {
                "internalQueryFacetBufferSizeBytes" : 104857600,
                "internalQueryFacetMaxOutputDocSizeBytes" : 104857600,
                "internalLookupStageIntermediateDocumentMaxSizeBytes" : 104857600,
                "internalDocumentSourceGroupMaxMemoryBytes" : 104857600,
                "internalQueryMaxBlockingSortMemoryUsageBytes" : 104857600,
                "internalQueryProhibitBlockingMergeOnMongoS" : 0,
                "internalQueryMaxAddToSetBytes" : 104857600,
                "internalDocumentSourceSetWindowFieldsMaxMemoryBytes" : 104857600
        },
        "command" : {
                "aggregate" : "system.buckets.weather",
                "pipeline" : [
                        {
                                "$_internalUnpackBucket" : {
                                        "timeField" : "timestamp",
                                        "metaField" : "metadata",
                                        "bucketMaxSpanSeconds" : 3600,
                                        "assumeNoMixedSchemaData" : true
                                }
                        },
                        {
                                "$match" : {
                                        "timestamp" : ISODate("2021-05-18T01:00:00Z")
                                }
                        }
                ],
                "cursor" : {
 
                },
                "collation" : {
 
                }
        },
        "ok" : 1,
        "$clusterTime" : {
                "clusterTime" : Timestamp(1641804443, 4),
                "signature" : {
                        "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
                        "keyId" : NumberLong(0)
                }
        },
        "operationTime" : Timestamp(1641804443, 4)
}



 Comments   
Comment by Ian Boros [ 12/Jan/22 ]

daniel.gomezferro
No worries at all

Comment by Daniel Gomez Ferro [ 12/Jan/22 ]

ian.boros yes, you are right, the setup code for the collection with collation had an extra '}' and the implicit collection creation bit me, still not used to it!

 

Sorry for wasting your time!

Comment by Ian Boros [ 12/Jan/22 ]

Hey daniel.gomezferro, I'm looking into this, and it appears that the query plan provided in the description is for a plan run against a non-time series collection. That is, there's no $_internalUnpackBucket, and the plan is run using a standard collection scan. This would suggest that the 'weather' collection was not actually a time series collection. I also ran the repro steps (copy-pasted below) but see that the query plan used has bounds on the COLLSCAN stage.

(function() {
    db.weather.drop();
    db.createCollection(
        "weather",
        {
            timeseries: {
                timeField: "timestamp",
                metaField: "metadata",
                granularity: "seconds"
            }, 
            collation: {locale: "en", strength: 2}
        }
    );
    printjson(db.weather.find({
        "timestamp": ISODate("2021-05-18T01:00:00.000Z")
    }).explain());
})();

Can you double-check/confirm that you can reproduce this bug?

Generated at Thu Feb 08 05:55:11 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.