[SERVER-79755] Investigate lowering to SBE non-distinct_scan lastpoint queries Created: 04/Aug/23  Updated: 29/Oct/23  Resolved: 28/Sep/23

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 7.2.0-rc0

Type: Task Priority: Major - P3
Reporter: Irina Yatsenko (Inactive) Assignee: Irina Yatsenko (Inactive)
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Assigned Teams:
Query Integration
Backwards Compatibility: Fully Compatible
Sprint: QI 2023-10-02
Participants:

 Description   

> db.createCollection("ts", { timeseries: { timeField: "time", metaField: "meta", granularity: "seconds" }})
> db.ts.getIndexes()
[{"v" : 2, "key" : {"meta" : 1,"time" : 1},"name" : "meta_1_time_1"}]
> db.system.buckets.ts.getIndexes()
[{"v" : 2,"key" : {"meta" : 1,"control.min.time" : 1,"control.max.time" : 1},"name" : "meta_1_time_1"}]
> // insert some data into "ts" so it's not empty
> db.ts.explain().aggregate([{$sort: {meta: 1, time: 1}}, {$group: {_id: "$meta", last: {$last: "$val"}}}])

The plan won't have DISTINCT_SCAN because the default index {meta: 1, time: 1} doesn't match the query's sorting requirements, but it will have a bucket-level $group with a $last accumulator, followed by the unpack stage, followed by an event-level $group.

A pipeline like this should be fully lowerable to SBE but I'm not seeing the bucket-level $group being lowered in my tests. We should investigate this and either keep the whole query in the classic engine or ensure that it gets lowered beyond the unpacking stage. This is likely related to SERVER-79066.
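
One way to check the plan shape described above is to pull the stage names out of the explain document. The helper below is only a sketch: listPlanStages and mockExplain are hypothetical names, the explain shapes it handles (a top-level "stages" array, a "$cursor" entry with a "winningPlan" tree) are assumptions, and the mock object is hand-written and simplified, not real server output.

```javascript
// Collect stage names from an explain document. The shapes handled here
// (a top-level "stages" array, "$cursor", "winningPlan", "inputStage")
// are assumptions about typical explain output, not a complete parser.
function listPlanStages(explain) {
  const names = [];

  // Walk the planner tree top-down: each node is assumed to carry a
  // "stage" name and optionally "inputStage" or "inputStages" children.
  function walkPlan(node) {
    if (!node || typeof node !== "object") return;
    if (typeof node.stage === "string") names.push(node.stage);
    if (node.inputStage) walkPlan(node.inputStage);
    (node.inputStages || []).forEach(walkPlan);
  }

  for (const stage of explain.stages || []) {
    const key = Object.keys(stage).find((k) => k.startsWith("$"));
    if (key === "$cursor") {
      const planner = stage.$cursor.queryPlanner || {};
      walkPlan(planner.winningPlan);
    } else if (key) {
      names.push(key);
    }
  }
  return names;
}

// Hand-written, simplified mock of the plan shape from the description:
// bucket-level $group, then unpack, then event-level $group.
const mockExplain = {
  stages: [
    { $cursor: { queryPlanner: { winningPlan: { stage: "COLLSCAN" } } } },
    { $group: { /* bucket-level group, contents omitted */ } },
    { $_internalUnpackBucket: { timeField: "time", metaField: "meta" } },
    { $group: { _id: "$meta", last: { $last: "$val" } } },
  ],
};

const stages = listPlanStages(mockExplain);
console.log(stages.join(" -> "));
console.log("has DISTINCT_SCAN:", stages.includes("DISTINCT_SCAN"));
```

In the shell, the same helper could be pointed at the result of the explain() call above instead of the mock, provided the real output matches the assumed shapes.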



 Comments   
Comment by Githook User [ 28/Sep/23 ]

Author: Irina Yatsenko <irina.yatsenko@mongodb.com> (IrinaYatsenko)

Message: SERVER-79755 Mark the bucket-level group for lastpoint optimization as SBE compatible
Branch: master
https://github.com/mongodb/mongo/commit/01994cebec09d21796927bed5faba8347bcc5698

Comment by Irina Yatsenko (Inactive) [ 27/Sep/23 ]

Although the sort is on time, the lastpoint optimization adds a bucket-level sort before unpacking, and that sort gets picked up in pipeline_d (so bounded sort isn't considered). There is also another place where a group is created for lastpoint, which I missed. All of this is already rather messy, but the extra problem is that the optimization also adds a bucket-level grouping stage before unpacking, so the lowered unpack stage finds an SBE object rather than a BSONObject in the bucket slot and blows up.

For now I'll do the following:
1. mark the newly added grouping stages properly for SBE compatibility so we don't need to re-investigate this
2. mark the pipeline, optimized for lastpoint, as incompatible with SBE so that we lower nothing and keep the status quo both for correctness and performance
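
The effect of the two steps above can be illustrated with a toy model. This is not the server's C++ code: stagesToLower, lastpointBlocksSbe and the stage objects are hypothetical, and the rule that lowering takes the leading run of compatible stages and stops at the first incompatible one is an assumption made for the sketch.

```javascript
// Toy model only; the names below are hypothetical and do not mirror
// the server implementation.
const SbeCompatibility = Object.freeze({
  notCompatible: 0,
  fullyCompatible: 1,
});

// Returns the names of the stages that would be lowered to SBE.
function stagesToLower(pipeline) {
  // Step 2: a pipeline rewritten for lastpoint lowers nothing.
  if (pipeline.lastpointBlocksSbe) return [];

  const lowered = [];
  for (const stage of pipeline.stages) {
    // Assumed rule: stop at the first stage that is not compatible.
    if (stage.sbeCompatibility !== SbeCompatibility.fullyCompatible) break;
    lowered.push(stage.name);
  }
  return lowered;
}

// The three stages named in the description, with the bucket-level
// group's compatibility passed in.
const lastpointStages = (bucketGroupCompat) => [
  { name: "bucket-level $group", sbeCompatibility: bucketGroupCompat },
  { name: "unpack", sbeCompatibility: SbeCompatibility.fullyCompatible },
  { name: "event-level $group", sbeCompatibility: SbeCompatibility.fullyCompatible },
];

// Step 1 alone: the group is marked compatible, so everything lowers,
// including the unpack stage that currently fails on the group's output.
const step1Only = {
  lastpointBlocksSbe: false,
  stages: lastpointStages(SbeCompatibility.fullyCompatible),
};

// Steps 1 and 2 together: nothing lowers, which keeps the status quo.
const bothSteps = { ...step1Only, lastpointBlocksSbe: true };

console.log(stagesToLower(step1Only));
console.log(stagesToLower(bothSteps));
```

In this model, step 1 by itself would let lowering proceed past the unpack stage, which is why step 2 is paired with it for now.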

Comment by Irina Yatsenko (Inactive) [ 26/Sep/23 ]

Because the $sort is on the timeField, we will keep blocking these pipelines from lowering to SBE, as part of not yet supporting the potential streaming group and bounded sort optimizations. However, I've added the "SbeCompatibility::fullyCompatible" flag to the bucket-level group and confirmed that the whole pipeline would be lowered if the guard for $sort were removed. I think we should take this fix, even though we won't be lowering these pipelines to SBE just yet, to avoid surprises in the future.

Comment by Irina Yatsenko (Inactive) [ 07/Sep/23 ]

The bucket-level group is created in createBucketGroupForReorder() and is marked as non-SBE-compatible by default. It would need a fix similar to SERVER-79066; however, at the moment the whole pipeline is blocked from lowering due to the presence of the $sort stage.
