Core Server / SERVER-75564

Avoid executing DocumentSourceInternalUnpackBucket::doOptimizeAt() twice for sharded time-series collections

Details

    • Type: Task
    • Resolution: Unresolved
    • Priority: Major - P3
    • Query Integration
    • QI 2023-05-01, QI 2023-05-15, QI 2023-05-29, QI 2023-06-12, QI 2023-06-26, QI 2023-07-10, QI 2023-07-24, QI 2023-08-07, QI 2023-08-21, QI 2023-09-04, QI 2023-09-18, QI 2023-10-02, QI 2023-10-16, QI 2023-10-30, QI 2023-11-13, QI 2023-11-27, QI 2023-12-11, QI 2023-12-25, QI 2024-01-08, QI 2024-01-22, QI 2024-02-05

    Description

      For queries on sharded views, the pipeline is optimized twice: first when the query on the view definition is sent to the primary shard, and again when the query is sent on the base collection (after the kickback to mongos). For normal views this is fine, because pipeline optimizations are generally idempotent. However, DocumentSourceInternalUnpackBucket::doOptimizeAt() may not be idempotent. There are known issues (such as SERVER-60373) where this function generates duplicate stages. We should use this ticket to evaluate any other potential issues with running the function twice and address them.
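
The idempotence concern can be illustrated with a toy model. The sketch below is not the server's implementation: pipelines are plain Python lists of stage dicts, and `naive_optimize`, `guarded_optimize`, and `is_idempotent` are hypothetical names invented for this example. The rewrite it performs (copying a `$match` in front of the unpack stage verbatim) is a simplified stand-in for whatever doOptimizeAt() actually does, chosen only to show how a second optimization pass can produce duplicate stages and how such a problem could be detected.

```python
import copy

UNPACK = "$_internalUnpackBucket"


def naive_optimize(pipeline):
    """Toy rewrite (assumption, not MongoDB code): copy the $match that
    follows the unpack stage in front of it, without checking whether a
    previous pass already did so."""
    result = copy.deepcopy(pipeline)
    for i, stage in enumerate(result):
        if UNPACK in stage and i + 1 < len(result) and "$match" in result[i + 1]:
            result.insert(i, copy.deepcopy(result[i + 1]))
            break
    return result


def guarded_optimize(pipeline):
    """Same toy rewrite, but skipped when the stage directly before the
    unpack stage is already an identical copy of the predicate."""
    result = copy.deepcopy(pipeline)
    for i, stage in enumerate(result):
        if UNPACK in stage and i + 1 < len(result) and "$match" in result[i + 1]:
            already_pushed = i > 0 and result[i - 1] == result[i + 1]
            if not already_pushed:
                result.insert(i, copy.deepcopy(result[i + 1]))
            break
    return result


def is_idempotent(optimize, pipeline):
    """True if optimizing an already optimized pipeline changes nothing."""
    once = optimize(pipeline)
    return optimize(once) == once


# Hypothetical pipeline on a time-series bucket collection.
example = [
    {UNPACK: {"timeField": "t", "metaField": "m"}},
    {"$match": {"m": 1}},
]
```

Calling `is_idempotent(naive_optimize, example)` versus `is_idempotent(guarded_optimize, example)` shows the difference between the two variants. A check of this shape, run against real pipelines, is one possible way to look for other rewrites that misbehave on a second pass.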

          People

            backlog-query-integration Backlog - Query Integration
            naama.bareket@mongodb.com Naama Bareket
            Votes: 0
            Watchers: 2

            Dates

              Created:
              Updated: