[SERVER-75564] Avoid executing DocumentSourceInternalUnpackBucket::doOptimizeAt() twice for sharded time-series collections Created: 31/Mar/23  Updated: 02/Feb/24

Status: Open
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Task Priority: Major - P3
Reporter: Naama Bareket Assignee: Backlog - Query Integration
Resolution: Unresolved Votes: 0
Labels: qi-timeseries
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Assigned Teams:
Query Integration
Sprint: QI 2023-05-01, QI 2023-05-15, QI 2023-05-29, QI 2023-06-12, QI 2023-06-26, QI 2023-07-10, QI 2023-07-24, QI 2023-08-07, QI 2023-08-21, QI 2023-09-04, QI 2023-09-18, QI 2023-10-02, QI 2023-10-16, QI 2023-10-30, QI 2023-11-13, QI 2023-11-27, QI 2023-12-11, QI 2023-12-25, QI 2024-01-08, QI 2024-01-22, QI 2024-02-05
Participants:

 Description   

For queries on sharded views, we run pipeline optimization twice: first when the query is sent with the view definition to the primary shard, and again when the query is sent against the base collection (after the kickback to mongos). For normal views this is fine, because pipeline optimizations are generally idempotent. But DocumentSourceInternalUnpackBucket::doOptimizeAt() may not be idempotent: there are known issues (like SERVER-60373) where this function generates duplicate stages. We should use this ticket to evaluate any other potential problems with running the function twice and address them.
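To make the failure mode concrete, here is a minimal, self-contained C++ sketch. It does not use MongoDB's actual DocumentSource types; the stage names and the optimizeUnpackBucket helper are hypothetical stand-ins. It models a rewrite that pushes a bucket-level predicate in front of the unpack stage without checking whether a previous pass already did so, so running it twice (as happens for sharded time-series collections) duplicates the stage:

{code:cpp}
#include <iostream>
#include <list>
#include <string>

// Hypothetical stand-in for a pipeline of stage names; the real
// Pipeline::SourceContainer holds DocumentSource pointers instead.
using Pipeline = std::list<std::string>;

// Sketch of a non-idempotent optimization: each call inserts a predicate
// in front of the unpack stage without checking whether an earlier call
// already did so (the duplicated-stage failure mode referenced above).
void optimizeUnpackBucket(Pipeline& pipeline) {
    for (auto it = pipeline.begin(); it != pipeline.end(); ++it) {
        if (*it == "$_internalUnpackBucket") {
            pipeline.insert(it, "$match (bucket-level predicate)");
            break;
        }
    }
}

int main() {
    Pipeline pipeline{"$_internalUnpackBucket", "$match {temp: {$gt: 20}}"};

    optimizeUnpackBucket(pipeline);  // first pass: on the primary shard
    optimizeUnpackBucket(pipeline);  // second pass: after the mongos kickback

    // Prints the bucket-level $match twice -- a duplicated stage.
    for (const auto& stage : pipeline) {
        std::cout << stage << '\n';
    }
}
{code}

One plausible fix, under these simplified assumptions, is to guard each rewrite (e.g., skip the insert if the preceding stage was already generated by an earlier pass); auditing doOptimizeAt() for rewrites that lack such guards is the gist of this ticket.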

