Codify join optimizer’s management of path arrayness metadata


    • Type: Bug
    • Resolution: Unresolved
    • Priority: Critical - P2
    • Affects Version/s: None
    • Component/s: None
    • Query Optimization

      Today, the join optimizer uses multiple ExpressionContexts: AggJoinModel::constructJoinModel clones the incoming ExpressionContext, and SamplingEstimatorImpl::makeEmptyCanonicalQuery creates a fresh one.

      Each ExpressionContext maintains its own _nonArrayPathsForNss, a map from NamespaceString to sets of paths that are assumed to not be arrays.

      This set of assumptions is passed into queries for runtime checking.

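      Because join optimization and sampling cardinality estimation (CE) each create their own ExpressionContexts, the canPathBeArrayForNss lookups made in those layers do not append to the _nonArrayPathsForNss of the ExpressionContext that is later passed into the query execution runtime. The assumptions made during optimization are therefore never checked at runtime.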
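The lost-assumption problem described above can be illustrated with a minimal C++ sketch. ToyExpressionContext, knownArrayPaths, and assumptionsSeenByExecution are hypothetical names invented for this illustration; only canPathBeArrayForNss and the idea of a per-namespace set of non-array paths come from this ticket, and the real ExpressionContext is considerably more involved.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <utility>

// Hypothetical, simplified stand-in for ExpressionContext (not the real class).
struct ToyExpressionContext {
    // Namespace -> paths assumed to never hold arrays (models _nonArrayPathsForNss).
    std::map<std::string, std::set<std::string>> nonArrayPathsForNss;

    // Assumed stand-in for catalog metadata: (namespace, path) pairs known to hold arrays.
    std::set<std::pair<std::string, std::string>> knownArrayPaths;

    // Models canPathBeArrayForNss: a "cannot be an array" answer is recorded
    // as an assumption on *this* context only.
    bool canPathBeArrayForNss(const std::string& nss, const std::string& path) {
        if (knownArrayPaths.count({nss, path}) > 0) {
            return true;
        }
        nonArrayPathsForNss[nss].insert(path);
        return false;
    }

    // Models cloning the incoming context for the join model.
    ToyExpressionContext clone() const {
        return *this;
    }
};

// Returns how many namespaces carry assumptions on the context that would be
// handed to execution, after the optimizer did its lookup on a clone.
inline std::size_t assumptionsSeenByExecution() {
    ToyExpressionContext original;
    ToyExpressionContext optimizerCtx = original.clone();

    // The optimizer asks the clone and relies on the answer.
    const bool canBeArray = optimizerCtx.canPathBeArrayForNss("test.orders", "customerId");
    assert(!canBeArray);
    assert(optimizerCtx.nonArrayPathsForNss["test.orders"].count("customerId") == 1);

    // The original context never learns about the assumption.
    return original.nonArrayPathsForNss.count("test.orders");
}
```

In this toy model, the function returns 0: the assumption lives only on the clone, so nothing would be available for a runtime check.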

      As a result, concurrent inserts of array values into a path assumed to be non-array can trigger tasserts in NDV (number of distinct values) estimation, and (I think) wrong results in joins.

      The task of this ticket is to create a structured mechanism to pass the arrayness assumptions made during query optimization (QO) into query execution (QE).

      Some possibilities include:

      • Explicitly passing NonArrayPathsForNss between ExpressionContexts during QO.
      • Passing the original ExpressionContext into all QO layers and ensuring that all canPathBeArrayForNss calls are made on the original ExpressionContext.
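A rough sketch of how the two possibilities above might look, written against a toy model and not the real classes. SketchContext, mergeNonArrayPaths, estimateWithOriginalContext, and runBothOptions are hypothetical names, the NonArrayPaths alias is an assumed shape for the map, and the toy lookup treats every path as non-array for brevity.

```cpp
#include <map>
#include <set>
#include <string>

// Assumed shape of the per-namespace set of non-array path assumptions.
using NonArrayPaths = std::map<std::string, std::set<std::string>>;

// Hypothetical, simplified stand-in for ExpressionContext.
struct SketchContext {
    NonArrayPaths nonArrayPathsForNss;

    // Toy lookup: every path is assumed non-array, and the assumption is recorded.
    bool canPathBeArrayForNss(const std::string& nss, const std::string& path) {
        nonArrayPathsForNss[nss].insert(path);
        return false;
    }
};

// Option 1: explicitly copy assumptions from a temporary context into the
// context that will be handed to execution.
inline void mergeNonArrayPaths(const NonArrayPaths& from, NonArrayPaths& into) {
    for (const auto& entry : from) {
        into[entry.first].insert(entry.second.begin(), entry.second.end());
    }
}

// Option 2: an optimizer layer receives the original context by reference,
// so its lookups are recorded where execution will see them.
inline void estimateWithOriginalContext(SketchContext& original) {
    original.canPathBeArrayForNss("test.customers", "_id");
}

// Exercises both options and returns the assumptions visible to execution.
inline NonArrayPaths runBothOptions() {
    SketchContext original;

    // Option 1: a cloned context makes an assumption, then merges it back.
    SketchContext cloneForJoinModel = original;
    cloneForJoinModel.canPathBeArrayForNss("test.orders", "customerId");
    mergeNonArrayPaths(cloneForJoinModel.nonArrayPathsForNss, original.nonArrayPathsForNss);

    // Option 2: the lookup happens directly on the original context.
    estimateWithOriginalContext(original);

    return original.nonArrayPathsForNss;
}
```

The trade-off, as far as this sketch shows it: merging keeps the existing clone-based structure but depends on every layer remembering to merge back, while passing the original context removes that step at the cost of threading one shared object through all QO layers.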

      A prototype is available at https://github.com/10gen/mongo/pull/54385/changes.

            Assignee: Naafiyan Ahmed
            Reporter: Evan Bergeron
            Votes: 0
            Watchers: 3
