[DRIVERS-1947] Kill workload executor when astrolabe exits with an error Created: 12/Oct/21  Updated: 28/Oct/23  Resolved: 18/Oct/21

Status: Closed
Project: Drivers
Component/s: Atlas Testing
Fix Version/s: None

Type: Improvement Priority: Major - P3
Reporter: Oleg Pudeyev (Inactive) Assignee: Oleg Pudeyev (Inactive)
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Epic Link: Astrolabe Testing Improvements
Driver Changes: Not Needed
Quarter: FY22Q4

 Description   

Currently, when astrolabe exits with an error, it does not kill the workload executor if one is running. This causes two types of issues:

  • On a local machine, the workload executor continues running in the background, potentially spamming the terminal with warnings/errors if it cannot connect to the cluster or perform an operation
  • In Evergreen, the test run times out because Evergreen uses Go, and Go waits for all processes in a spawned process tree to exit, not just the spawned process itself (https://github.com/golang/go/issues/20730)

[2021/10/08 17:22:12.455] W, [2021-10-08T17:22:12.200850 #22238]  WARN -- : MONGODB | Error running awaited hello on 2e694dd263-shard-00-02.80104.mongodb-qa.net:27017: Mongo::Error::SocketError: Errno::ECONNREFUSED: Connection refused - connect(2) for 54.176.135.219:27017 (for 54.176.135.219:27017 (2e694dd263-shard-00-02.80104.mongodb-qa.net:27017, TLS)) (on 2e694dd263-shard-00-02.80104.mongodb-qa.net:27017)
[2021/10/08 17:22:12.455] W, [2021-10-08T17:22:12.205861 #22238]  WARN -- : MONGODB | Error running awaited hello on 2e694dd263-shard-00-02.80104.mongodb-qa.net:27017: Mongo::Error::SocketError: Errno::ECONNREFUSED: Connection refused - connect(2) for 54.176.135.219:27017 (for 54.176.135.219:27017 (2e694dd263-shard-00-02.80104.mongodb-qa.net:27017, TLS)) (on 2e694dd263-shard-00-02.80104.mongodb-qa.net:27017)
[2021/10/08 17:22:12.455] W, [2021-10-08T17:22:12.263242 #22238]  WARN -- : MONGODB | Error running awaited hello on 2e694dd263-shard-00-02.80104.mongodb-qa.net:27017: Mongo::Error::SocketError: Errno::ECONNREFUSED: Connection refused - connect(2) for 54.176.135.219:27017 (for 54.176.135.219:27017 (2e694dd263-shard-00-02.80104.mongodb-qa.net:27017, TLS)) (on 2e694dd263-shard-00-02.80104.mongodb-qa.net:27017)
[2021/10/08 17:22:12.455] W, [2021-10-08T17:22:12.268822 #22238]  WARN -- : MONGODB | Error running awaited hello on 2e694dd263-shard-00-02.80104.mongodb-qa.net:27017: Mongo::Error::SocketError: Errno::ECONNREFUSED: Connection refused - connect(2) for 54.176.135.219:27017 (for 54.176.135.219:27017 (2e694dd263-shard-00-02.80104.mongodb-qa.net:27017, TLS)) (on 2e694dd263-shard-00-02.80104.mongodb-qa.net:27017)
[2021/10/08 17:22:12.455] W, [2021-10-08T17:22:12.326601 #22238]  WARN -- : MONGODB | Error running awaited hello on 2e694dd263-shard-00-02.80104.mongodb-qa.net:27017: Mongo::Error::SocketError: Errno::ECONNREFUSED: Connection refused - connect(2) for 54.176.135.219:27017 (for 54.176.135.219:27017 (2e694dd263-shard-00-02.80104.mongodb-qa.net:27017, TLS)) (on 2e694dd263-shard-00-02.80104.mongodb-qa.net:27017)
[2021/10/08 17:22:12.455] W, [2021-10-08T17:22:12.330875 #22238]  WARN -- : MONGODB | Error running awaited hello on 2e694dd263-shard-00-02.80104.mongodb-qa.net:27017: Mongo::Error::SocketError: Errno::ECONNREFUSED: Connection refused - connect(2) for 54.176.135.219:27017 (for 54.176.135.219:27017 (2e694dd263-shard-00-02.80104.mongodb-qa.net:27017, TLS)) (on 2e694dd263-shard-00-02.80104.mongodb-qa.net:27017)
[2021/10/08 17:22:12.455] W, [2021-10-08T17:22:12.388785 #22238]  WARN -- : MONGODB | Error running awaited hello on 2e694dd263-shard-00-02.80104.mongodb-qa.net:27017: Mongo::Error::SocketError: Errno::ECONNREFUSED: Connection refused - connect(2) for 54.176.135.219:27017 (for 54.176.135.219:27017 (2e694dd263-shard-00-02.80104.mongodb-qa.net:27017, TLS)) (on 2e694dd263-shard-00-02.80104.mongodb-qa.net:27017)
[2021/10/08 17:22:12.455] W, [2021-10-08T17:22:12.393696 #22238]  WARN -- : MONGODB | Error running awaited hello on 2e694dd263-shard-00-02.80104.mongodb-qa.net:27017: Mongo::Error::SocketError: Errno::ECONNREFUSED: Connection refused - connect(2) for 54.176.135.219:27017 (for 54.176.135.219:27017 (2e694dd263-shard-00-02.80104.mongodb-qa.net:27017, TLS)) (on 2e694dd263-shard-00-02.80104.mongodb-qa.net:27017)
[2021/10/08 17:22:12.455] W, [2021-10-08T17:22:12.452531 #22238]  WARN -- : MONGODB | Error running awaited hello on 2e694dd263-shard-00-02.80104.mongodb-qa.net:27017: Mongo::Error::SocketError: Errno::ECONNREFUSED: Connection refused - connect(2) for 54.176.135.219:27017 (for 54.176.135.219:27017 (2e694dd263-shard-00-02.80104.mongodb-qa.net:27017, TLS)) (on 2e694dd263-shard-00-02.80104.mongodb-qa.net:27017)
[2021/10/08 17:22:12.455] W, [2021-10-08T17:22:12.455549 #22238]  WARN -- : MONGODB | Error running awaited hello on 2e694dd263-shard-00-02.80104.mongodb-qa.net:27017: Mongo::Error::SocketError: Errno::ECONNREFUSED: Connection refused - connect(2) for 54.176.135.219:27017 (for 54.176.135.219:27017 (2e694dd263-shard-00-02.80104.mongodb-qa.net:27017, TLS)) (on 2e694dd263-shard-00-02.80104.mongodb-qa.net:27017)
[2021/10/08 19:03:26.112] Command stopped early: context canceled
[2021/10/08 19:03:26.166] Running task-timeout commands.

To avoid these issues, astrolabe should attempt to terminate the workload executor when it has launched one and is exiting with an error.
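
The requested behavior could look roughly like the sketch below. It is not astrolabe's actual code: the names run_with_executor and stop_process and the grace_seconds parameter are hypothetical, and the sketch assumes the workload executor is a direct child process started with Python's subprocess module. On the error path the child is asked to terminate, and it is killed if it has not exited within the grace period.

```python
import subprocess
import sys


def stop_process(proc, grace_seconds=5.0):
    """Terminate a child process, escalating to kill after a grace period."""
    if proc.poll() is not None:
        return  # the child has already exited
    proc.terminate()  # polite request first (SIGTERM on POSIX)
    try:
        proc.wait(timeout=grace_seconds)
    except subprocess.TimeoutExpired:
        proc.kill()  # the child ignored the request, force it
        proc.wait()


def run_with_executor(executor_cmd, test_fn, grace_seconds=5.0):
    """Start the executor, run test_fn(proc), and clean up on any error.

    Hypothetical helper: on success the caller remains responsible for
    stopping the executor through its normal shutdown path.
    """
    proc = subprocess.Popen(executor_cmd)
    try:
        return test_fn(proc)
    except BaseException:
        # Error path: do not leave the executor running in the background.
        stop_process(proc, grace_seconds)
        raise


if __name__ == "__main__":
    def failing_test(proc):
        raise RuntimeError("simulated astrolabe failure")

    # Stand-in for a real workload executor: a child that sleeps for a minute.
    cmd = [sys.executable, "-c", "import time; time.sleep(60)"]
    try:
        run_with_executor(cmd, failing_test)
    except RuntimeError as exc:
        print(f"astrolabe failed: {exc}; executor was terminated")
```

Catching BaseException rather than Exception means the cleanup also runs on KeyboardInterrupt and SystemExit, which covers the local-machine case where a run is interrupted by hand.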



 Comments   
Comment by Oleg Pudeyev (Inactive) [ 18/Oct/21 ]

I haven't seen stuck tests in the most recent patch builds that incorporate the fix for this ticket.

Comment by Githook User [ 18/Oct/21 ]

Author: Oleg Pudeyev <code@olegp.name> (username: p)

Message: DRIVERS-1947 Kill workload executor when astrolabe exits with an error
Branch: master
https://github.com/mongodb-labs/drivers-atlas-testing/commit/b58d5618f7660ac8a2d0ef0689e52fd8535f9635
