  1. Core Server
  2. SERVER-97801

Overhaul system to determine the number of jobs to use per task

    • Type: Task
    • Resolution: Unresolved
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: None
    • Correctness

      Currently we have two parameters, resmoke_jobs_max and resmoke_jobs_factor, which are usually set in YAML. The current system is flexible about where they are set: at the task level (such as here) or on an entire build variant. A consequence of setting the values in YAML is that when we create new tasks, we often copy-paste old values that are no longer applicable and apply them to the new tasks.

      We've seen this happen with the concurrency_.* suites. They all have resmoke_jobs_max: 1, even though this value was set five years ago, and the machines we run on today may be very different from those we ran on then. The concurrency suites are also very different from one another: concurrency, which runs against a standalone, places different demands on its host than concurrency_sharded_replication or concurrency_sharded_initial_sync, yet they all use the same value of resmoke_jobs_max: 1.

      Separately, we also have a script, evergreen_resmoke_job_count.py, that tries to set job counts dynamically. The script has a few issues as well. For instance, if a suite is renamed, the script silently stops having any effect on it.
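
      To illustrate the rename problem, here is a minimal sketch that assumes overrides are kept in a table keyed by suite name. The table, its value, and both function names are hypothetical and are not taken from evergreen_resmoke_job_count.py; the point is only that a plain lookup gives no signal when a key goes stale, whereas a check against the list of known suites would.

```python
from typing import Dict, Iterable, List, Optional

# Hypothetical override table keyed by suite name (illustrative value only).
JOB_FACTOR_BY_SUITE: Dict[str, float] = {
    "concurrency_sharded_replication": 0.5,
}


def lookup_factor(suite_name: str) -> Optional[float]:
    """Return the override for a suite, or None if there is no entry.

    A renamed suite simply misses the table, so nothing is reported.
    """
    return JOB_FACTOR_BY_SUITE.get(suite_name)


def find_stale_overrides(
    overrides: Dict[str, float], known_suites: Iterable[str]
) -> List[str]:
    """Return override keys that no longer match any known suite."""
    return sorted(set(overrides) - set(known_suites))
```

      Running a check like find_stale_overrides as part of linting would turn a silent miss into a visible failure, though it still leaves the constants hardcoded.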

      Ideally, we should do away with the constants that we've hardcoded in YAML and in the script, and instead, compute the number of jobs to allocate based on the specs of the machine and the demands of a task, which we can keep some history of.
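
      As a rough sketch of what such a computation could look like, the snippet below derives a job count from a host's CPU and memory and a task's per-job demand. Everything here is an assumption made for illustration: the HostSpec and TaskDemand types, their fields, and the compute_resmoke_jobs function do not exist in the codebase, and the per-job demand is imagined to come from the task history mentioned above.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class HostSpec:
    """Hypothetical description of the machine a task runs on."""
    cpu_count: int
    memory_gb: float


@dataclass(frozen=True)
class TaskDemand:
    """Hypothetical per-job resource demand, e.g. derived from task history."""
    cpus_per_job: float
    memory_gb_per_job: float


def compute_resmoke_jobs(host: HostSpec, demand: TaskDemand) -> int:
    """Return how many jobs fit on the host, bounded by both CPU and memory."""
    if demand.cpus_per_job <= 0 or demand.memory_gb_per_job <= 0:
        raise ValueError("per-job demands must be positive")
    by_cpu = math.floor(host.cpu_count / demand.cpus_per_job)
    by_memory = math.floor(host.memory_gb / demand.memory_gb_per_job)
    # Always run at least one job, even on an undersized host.
    return max(1, min(by_cpu, by_memory))
```

      With a hypothetical 16-CPU, 64 GB host, a light demand of 1 CPU and 2 GB per job yields 16 jobs, while a heavier demand of 4 CPUs and 8 GB per job yields 4. Suites with different demands would therefore no longer share a single hardcoded value.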

      SERVER-97562 is creating a backdoor for tasks that run on required build variants because it was determined that we can speed up patch builds quite a bit by increasing the number of Resmoke jobs. In this ticket, we should get rid of that backdoor and solve the problem properly.

            Assignee: Unassigned
            Reporter: Vishnu Kaushik (vishnu.kaushik@mongodb.com)
            Votes: 0
            Watchers: 2
