[SERVER-32311] getAggPlanStage() should de-duplicate sharded aggregation explain output Created: 13/Dec/17 Updated: 07/Nov/23 |
|
| Status: | Backlog |
| Project: | Core Server |
| Component/s: | Aggregation Framework, Testing Infrastructure |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Kyle Suarez | Assignee: | Backlog - Query Execution |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | open_todo_in_code | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||
| Assigned Teams: |
Query Execution
|
||||
| Participants: | |||||
| Description |
|
Running getAggPlanStage() on the explain output for an aggregation that targets multiple shards in a cluster fails, because it expects that exactly one stage with that name exists. We could make an improvement such that if the stage appears in the "shards" part, we assert that it appears identically on each shard and then return it. |
| Comments |
| Comment by Alexander Ignatyev [ 07/Nov/23 ] |
|
I believe a correct solution for this issue might be just using getAggPlanStages and then assert the plans in a loop, because, as Dave mentioned, there seems to be no correct solution for automatic aggregation of the stages across the shards. kyle.suarez@mongodb.com do you think we can close the ticket with this explanation because people continue to ban js tests from sharded collection test suites appealing to this ticket. |
| Comment by Charlie Swanson [ 06/Nov/19 ] |
|
Good with me! Sorry for the false alarm jacob.evans. |
| Comment by David Storch [ 05/Nov/19 ] |
|
Oh, you're right. I had assumed that this ticket was just about handling the sharded format, but you're correct that the suggestion is indeed to somehow deduplicate stage information across shards. It's not obvious to me that this automatic deduplication is good behavior, since it could be surprising to some callers. I suggest we return this ticket to the backlog and remove the "greenerbuild" label, since there is not clearly a quick win here to improve test coverage. |
| Comment by Charlie Swanson [ 01/Nov/19 ] |
|
Hmm but that code just looks like it appends them all to an array. This ticket looks to me about deduplicating the stages from different shards so we don't trip this assertion: https://github.com/mongodb/mongo/blob/8010c51d3f1cc1020959f8b351f8242d141d42a3/jstests/libs/analyze_plan.js#L249-L252 |
| Comment by David Storch [ 01/Nov/19 ] |
|
It wasn't obvious from the git blame when exactly the work happened. But you can see that getAggPlanStages() has handling for the sharded output format: |
| Comment by Charlie Swanson [ 01/Nov/19 ] |
|
david.storch I'm not seeing any indication that this work was done... Where do you see it? When did the work happen? |
| Comment by David Storch [ 16/Oct/19 ] |
|
It looks like this improvement has been made already in analyze_plan.js, but there are a bunch of TODOs referencing this ticket where we have blacklisted tests. In order to improve test coverage, we should investigate whether those tests can be unblacklisted in various passthrough suites. Flagging for triage so that we can decide whether we can to schedule this work. |
| Comment by Kyle Suarez [ 08/Feb/18 ] |
|
Ditto the query-version of this, getPlanStage(). |