[GODRIVER-2071] Speed up Go driver patch builds, including integration tests Created: 07/Jul/21  Updated: 17/Jan/23

Status: Needs Triage
Project: Go Driver
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Epic Priority: Unknown
Reporter: Matt Dale Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Quarter: FY24Q1
Start date:

Description

Summary

Various Go driver test variants currently take 10-15 minutes to complete. A full PR patch build typically takes 25-30 minutes to complete. As we add more tests, our sensitivity to test timeouts will increase and the timeliness of patch build results will decrease. We need to find a way to allow increasing test coverage without further increasing the runtime of the tests. Some possible improvements:

  1. Scale up test runner hardware to use faster CPUs.
  2. Scale up local MongoDB clusters to use more replicas or shards to decrease operation durations.
  3. Scale up test Atlas clusters to use more replicas or shards to decrease operation durations.
  4. Run more tests in parallel to reduce overall test run duration.

Motivation

Who is the affected end user?

The Go Driver team is the primary group affected.

How does this affect the end user?

The Go Driver team currently spends time waiting for Evergreen patch builds, troubleshooting test failures caused by timeouts, and increasing test timeouts.

How likely is it that this problem or use case will occur?

We've opened two tickets related to tests consistently failing in the last 6 months:

If the problem does occur, what are the consequences and how severe are they?

If a test times out, the resolution is usually to run it again and hope it runs faster. If that doesn't work, someone must investigate the root cause of the timeout, which may end up being that "the tests just take a long time." If the tests timed out because they're just slow, someone needs to update the test configuration to allow a longer timeout.

If the tests take a long time but don't time out, Go Driver team members are stuck waiting for test results, possibly for hours or even days, depending on how many times unreliable tests must be rerun.

Is this issue urgent?

There are currently some consistent test timeouts on macOS (GODRIVER-2070), although we don't know whether the root cause is a specific defect or just general test slowness.

Is this ticket required by a downstream team?

No.

Is this ticket only for tests?

Test and development efficiency improvements.

Cast of Characters

Engineering Lead: ?
Document Author: matt.dale
POCers: ?
Product Owner: ?
Program Manager: ?
Stakeholders: Go Driver team


Generated at Thu Feb 08 08:37:45 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.