[SERVER-45544] burn_in_tests for certain tests can time out regardless of what changed Created: 13/Dec/19  Updated: 29/Oct/23  Resolved: 24/Jan/20

Status: Closed
Project: Core Server
Component/s: Testing Infrastructure
Affects Version/s: None
Fix Version/s: 4.2.4

Type: Bug Priority: Major - P3
Reporter: Maria van Keulen Assignee: David Bradford (Inactive)
Resolution: Fixed Votes: 0
Labels: tig-burnin
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Backwards Compatibility: Fully Compatible
Backport Requested:
v4.2
Sprint: DAG 2020-01-27
Participants:
Story Points: 3

 Description   

I recently worked on a ticket that modified the create_index_background_unique_collmod.js workload. I noticed that even a patch build containing only a variable name change in the test would cause burn_in_tests to time out, possibly because this test does not interact well with the way burn_in_tests is run.



 Comments   
Comment by Githook User [ 24/Jan/20 ]

Author:

{'username': 'dbradf', 'name': 'David Bradford', 'email': 'david.bradford@mongodb.com'}

Message: SERVER-45544: Add burn_in_tags to generate tests for test tags
Branch: v4.2
https://github.com/mongodb/mongo/commit/14986e6fec472e9ff4359f3da3bb09a66dcf64b5

Comment by David Bradford (Inactive) [ 14/Jan/20 ]

A few months ago, we added a burn_in_tags.py script to dynamically run burn_in_tests on build variants that were just created to run burn_in_tests (like inmem). It looks like `burn_in_tags` is behaving correctly in this case, but it exists only on the master branch, and this problem was hit on the 4.2 branch. I'm going to use this ticket to backport burn_in_tags, along with all the fixes that have been made for it, to the 4.2 branch.

Comment by David Bradford (Inactive) [ 13/Jan/20 ]

When we do burn_in_tests for the special-case build variants (like inmem), the burn_in_tests script looks at two build_variants. In this case, it is looking at enterprise-rhel-64-62-bit and the inmem variant. It uses some details from each variant to determine which tests to run and which configurations to run them in. For the distro, it is using the wrong one: it currently uses the distro for enterprise-rhel-64-62-bit when it should use the distro for the inmem configuration (in this case the distros are different, which is causing the problems).

So, we need to update the code to look at the inmem build_variant when selecting the distro.
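
As a rough sketch of the intended change, not the actual burn_in_tests code: the `BuildVariant` class and `select_distro` function below are hypothetical names, and the pairing of rhel62-small with the base variant and rhel62-large with inmem is an assumption pieced together from the comments on this ticket.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BuildVariant:
    """Hypothetical stand-in for a build variant entry in the project config."""

    name: str
    distro: str


def select_distro(base_variant: BuildVariant, run_variant: BuildVariant) -> str:
    """Return the distro that the generated burn_in tasks should run on.

    The base variant can still supply other details (such as which tests to
    run), but the distro comes from the variant the tasks actually run under.
    """
    return run_variant.distro


# Assumed values: the distro names appear in this ticket, the pairing is inferred.
base = BuildVariant(name="enterprise-rhel-64-62-bit", distro="rhel62-small")
inmem = BuildVariant(name="inmem", distro="rhel62-large")

print(select_distro(base, inmem))  # prints "rhel62-large"
```

The point of passing both variants is that the base variant is still needed for the other details, so the fix is only about which of the two the distro is read from.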

Comment by David Bradford (Inactive) [ 13/Jan/20 ]

Looking at this more closely, the cleanup is happening correctly. It looks like we are hitting SERVER-42440. In the patch build, the tests are being run on a rhel62-small machine, but they should be run on a rhel62-large machine. It looks like resource limits are causing the problem.

So we need to figure out why it is not running on the correct distro.

Comment by David Bradford (Inactive) [ 16/Dec/19 ]

Looking at the failures, it looks like validation is taking an unexpectedly long time. I believe that in normal test runs we clean up data at certain intervals to ensure that validation does not take a long time. We might need to add something like that to burn_in_tests.

Generated at Thu Feb 08 05:09:05 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.