  1. Core Server
  2. SERVER-75260

Protocol to pass query shape data from mongot to mongod

    • Type: Task
    • Resolution: Unresolved
    • Priority: Major - P3
    • None
    • Affects Version/s: None
    • Component/s: None
    • Query Integration

      Context here:

      Problem
      As part of the query telemetry project, we want to capture the query shapes for find and aggregation queries. Part of that is computing the query shape for $search aggregation stages, but mongod does not know the grammar of $search queries.

       

      Possible Solutions

      There are three possible solutions:

      (1) A new or enhanced protocol between mongot and mongod to have mongot compute the query shape,

      (2) mongod learns about the grammar of mongot,

      (3) "other", maybe mongot collects its own telemetry and mongod or an external consumer groups/joins them together. 
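
      As a rough illustration of option (1), the sketch below shows mongot attaching a shape it computed to its cursor reply, and mongod reading it back without interpreting it. The field name `queryShape`, the reply layout, and both function names are hypothetical and chosen only for this example; they are not the actual mongot/mongod wire protocol.

```python
from typing import Any, Dict, List, Optional


def build_search_reply(cursor_id: int, first_batch: List[Dict[str, Any]],
                       query_shape: Any) -> Dict[str, Any]:
    """mongot side (hypothetical): return the computed shape with the cursor."""
    return {
        "ok": 1,
        "cursor": {"id": cursor_id, "firstBatch": first_batch},
        # Hypothetical field name; the shape is opaque to mongod.
        "queryShape": query_shape,
    }


def extract_search_shape(reply: Dict[str, Any]) -> Optional[Any]:
    """mongod side (hypothetical): read the shape without parsing $search."""
    # An older mongot that does not report a shape simply yields None.
    return reply.get("queryShape")


reply = build_search_reply(42, [{"_id": 1}], {"operatorCount": 2})
shape = extract_search_shape(reply)
```

      Treating the shape as an opaque value means mongod needs no knowledge of the $search grammar, which is the main argument for option (1) over option (2).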

       

      Notes from Slack:

      • [Charlie] I think #2 is not a good option because (a) mongod's release schedule is different from mongot's, and (b) we are not the experts on the $search language, and this would be a whole duplicated code path with the sole purpose of exposing telemetry metrics. (3) sounds weird, so I think (1) is the leading candidate. To mitigate perf concerns, we'd probably ask for the shape to be computed and returned along with every cursor. This is probably a non-negligible amount of work for the $search team to do (no idea, but I'd guess 5-8 weeks? It depends on the grammar complexity), and a small amount of work for us to do (~3 weeks is my first guess?)
      • [Oren] Having "general" query shapes in search would be super valuable. For instance, we learned recently that queries combining a range with another filter can be greatly improved, and we don't really know how many people query using this pattern.
      • Even without thinking of other agg stages after $search.
      • If we keep the protocol by which mongot reports the query shape schema-less, then we could always refine it in the future, and it may not be a huge effort for us. For instance, we can start by just returning an integer counting how many operators were used, which would take 1 day, not including designing the protocol.
      • [Oren] Any surprises in sharded clusters? Or is it okay to "report" the query shape multiple times? 
      • [Ted] We'd probably have to have mergeCursors pick one and throw the rest away, or some other merge logic. Would require some work, but I don't think/hope it would be very difficult work
      • [Charlie] Uhh for a sharded collection should we get the shape back as part of the planShardedSearch command? Or whatever we called that?
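
      Two ideas from the notes above can be sketched together: a first, schema-less shape that is just an operator count, and a "pick one and throw the rest away" merge for sharded clusters. This is a toy under stated assumptions: the operator set is an illustrative subset, the counter is a naive key walk rather than mongot's real parser, and the function names and the `operatorCount` field are hypothetical.

```python
from typing import Any, Dict, Iterable, Optional

# Illustrative subset only; the real $search grammar lives in mongot.
ILLUSTRATIVE_OPERATORS = {"compound", "text", "range", "phrase"}


def count_operators(node: Any) -> int:
    """Naively count operator keys anywhere in a $search spec."""
    if isinstance(node, dict):
        return sum(
            (1 if key in ILLUSTRATIVE_OPERATORS else 0) + count_operators(value)
            for key, value in node.items()
        )
    if isinstance(node, list):
        return sum(count_operators(item) for item in node)
    return 0  # Literals (strings, numbers) contribute nothing.


def merge_shard_shapes(shapes: Iterable[Optional[Dict[str, Any]]]) -> Optional[Dict[str, Any]]:
    """Keep the first shape reported by any shard and discard the rest."""
    for shape in shapes:
        if shape is not None:
            return shape
    return None


search_spec = {
    "compound": {
        "must": [{"text": {"query": "coffee", "path": "name"}}],
        "filter": [{"range": {"path": "price", "gte": 5}}],
    }
}
shape = {"operatorCount": count_operators(search_spec)}
merged = merge_shard_shapes([None, shape, shape])
```

      Because every shard runs the same $search stage, the reported shapes should be identical, so keeping the first one loses nothing in this sketch. Returning the shape once from a planning step, as Charlie suggests, would avoid the duplicate reports altogether.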

            Assignee: Unassigned
            Reporter: Colby Ing (colby.ing@mongodb.com)
            Votes: 0
            Watchers: 5
