[SERVER-56442] Generated tasks should not be considered started before their dependencies complete Created: 06/Jun/20  Updated: 29/Oct/23  Resolved: 28/Apr/21

Status: Closed
Project: Core Server
Component/s: Testing Infrastructure
Affects Version/s: None
Fix Version/s: 5.0.0-rc0

Type: Improvement Priority: Major - P3
Reporter: Andrew Morrow (Inactive) Assignee: David Bradford (Inactive)
Resolution: Fixed Votes: 0
Labels: task-lifecycle
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Backwards Compatibility: Fully Compatible
Sprint: DAG 2021-05-03
Participants:
Story Points: 2

 Description   

In the current server build, most generated tasks, if not all, depend on the compile task completing so that their artifacts are available. However, those tasks are considered started immediately after the generation phase. This is a little perplexing visually, since all of the test suites that contain generated tasks appear to be running as soon as a patch build starts. That is a somewhat minor UX concern. The more significant consequence is that the wall clock time for the task is rendered inaccurate, because the clock starts when compile starts, not when it ends.

See, for instance:

The sharded_causally_consistent_jscore_txns_passthrough suite is shown as having a wall clock time of 29 minutes. However, this is inaccurate, because the clock started at the same time as compile, at 06:41. Had the clock started when the tests actually began to execute, the wall clock time would, I think, have been closer to 10 minutes.

This double counting of time makes it difficult to accurately A/B compare the performance of builds, because changes in the duration of the compile task are counted multiple times.
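As a rough illustration of the double counting, the sketch below contrasts the reported wall clock time with the time elapsed after the last dependency finished. The 06:41 start and the 29 minute figure come from the example above. The calendar date, the 19 minute compile duration (inferred from the estimated 10 minutes of test time), and the helper name `effective_duration` are assumptions made for illustration only, not anything Evergreen provides.

```python
from datetime import datetime, timedelta


def effective_duration(task_start, task_end, dependency_ends):
    """Return the time elapsed after the task could actually begin.

    The effective start is the later of the recorded start time and
    the finish time of the slowest dependency.
    """
    actual_start = max([task_start, *dependency_ends])
    return task_end - actual_start


# Placeholder date; only the 06:41 clock time appears in the ticket.
recorded_start = datetime(2020, 6, 6, 6, 41)
# Assumed compile duration, chosen to match the estimated 10 minutes of test time.
compile_end = recorded_start + timedelta(minutes=19)
task_end = recorded_start + timedelta(minutes=29)

reported = task_end - recorded_start
effective = effective_duration(recorded_start, task_end, [compile_end])

print(f"reported:  {reported}")
print(f"effective: {effective}")
```

Under these assumptions, any change in compile time shifts `reported` for every generated task, while `effective` is unaffected.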

In the absence of EVG-2535 there doesn't seem to be a way to model the difference without collecting and parsing logs for every task. Addressing one, or ideally both, of these issues would provide much more accurate tools with which to understand build performance.



 Comments   
Comment by Githook User [ 28/Apr/21 ]

Author: David Bradford <david.bradford@mongodb.com> (dbradf)

Message: SERVER-56442: Group _gen tasks in a single display task
Branch: master
https://github.com/mongodb/mongo/commit/fcd06dab3d15dbdeb0631d8f96b78a76980f8a25

Comment by Brian Samek [ 07/Oct/20 ]

david.bradford - EVG-8318 is complete, so I'm going to assign this ticket to your team's backlog.

Comment by Brian Samek [ 18/Jun/20 ]

jonathan.brill - That would delay when generators start, and since they're serialized, that could mean that tasks start much later, causing makespans to increase.

Comment by Jonathan Brill [ 18/Jun/20 ]

Is it not possible for the generator to depend on compile?

Comment by Brian Samek [ 18/Jun/20 ]

I created EVG-8318 for that and am linking this ticket to that one.

Comment by David Bradford (Inactive) [ 10/Jun/20 ]

If there were a link on a generated task to the task that generated it, I think that would be acceptable. I do want to think through whether there are any other implications of moving all the generating tasks to one display group. Right now, the debuggability issue is the only one I can think of.

Comment by Brian Samek [ 10/Jun/20 ]

> If all the "_gen" tasks were in one display task holder, we would be relying on naming conventions to determine which task created which sub-tasks because I don't think that information is exposed anywhere.

This information is exposed in the API as the "generated_by" field. Would you be able to bucket the generators if that information were exposed in the UI?
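A minimal sketch of how generated tasks might be bucketed by their generator once the "generated_by" field is available. Only the "generated_by" field name comes from this thread. The record shape, the "display_name" key, and the sample task and generator names are hypothetical and do not reflect an actual API response.

```python
from collections import defaultdict


def bucket_by_generator(tasks):
    """Group task names by the generator recorded in "generated_by".

    Tasks with a missing or empty "generated_by" value are skipped,
    since they were not produced by a generator.
    """
    buckets = defaultdict(list)
    for task in tasks:
        generator = task.get("generated_by")
        if generator:
            buckets[generator].append(task["display_name"])
    return dict(buckets)


# Hypothetical task records, for illustration only.
sample_tasks = [
    {"display_name": "compile"},
    {"display_name": "suite_a_0", "generated_by": "suite_a_gen"},
    {"display_name": "suite_a_1", "generated_by": "suite_a_gen"},
    {"display_name": "suite_b_0", "generated_by": "suite_b_gen"},
]

buckets = bucket_by_generator(sample_tasks)
for generator, names in buckets.items():
    print(generator, names)
```

Grouping on the field rather than on task names avoids relying on the "_gen" naming convention.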

Comment by David Bradford (Inactive) [ 09/Jun/20 ]

We can consider that. It is occasionally very useful to look at the logs of the "_gen" task to determine how or why tasks were split the way they were. When that does come up, having all the tasks in the same display task is useful. If all the "_gen" tasks were in one display task holder, we would be relying on naming conventions to determine which task created which sub-tasks because I don't think that information is exposed anywhere.

Comment by John Liu (Inactive) [ 09/Jun/20 ]

I see the problem, though I can't think of a solution that would address this without making something else more complicated. david.bradford - has your team considered splitting generators from the display tasks they generate, perhaps grouping them into a display task of just generators? That would address this problem, though it may cause problems elsewhere.

Generated at Thu Feb 08 05:39:16 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.