[SERVER-82776] fast_archive errors when there is not enough disk space Created: 03/Nov/23  Updated: 11/Jan/24

Status: Needs Scheduling
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Trevor Guidry Assignee: Trevor Guidry
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Operating System: ALL
Sprint: Build and Correctness OnDeck
Participants:

 Description   

Here and here are examples of tasks that fail because there is no space left on the device to extract the core dumps.


We need to do one of the following:
1. Delete files to free space (the task is already over at this point; are there any useless files on disk?)

2. Move these tasks to a larger distro.

3. Compress the files to another location that does have space.

Currently we rely on the unextracted core dumps remaining on the machine so that a later step can verify that they come from a "known binary"; this might need to change if we start deleting them as we go to save space.
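As a rough illustration of option 1/3, the extraction step could check free disk space up front and bail out (or fall back to compressing elsewhere) instead of erroring mid-extract. This is a minimal sketch, not the actual fast_archive code; the `safety_factor` compression-ratio guess and the function name are hypothetical.

```python
import shutil


def has_space_for_extraction(archive_size: int, dest: str,
                             safety_factor: float = 4.0) -> bool:
    """Return True if `dest` likely has room to extract an archive of
    `archive_size` bytes.

    `safety_factor` is a hypothetical guess at how much larger the
    extracted core dumps are than the compressed archive.
    """
    free = shutil.disk_usage(dest).free
    return free > archive_size * safety_factor
```

A caller could then skip extraction (or pick an alternate destination with more space) when this returns False, rather than hitting ENOSPC partway through.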


It also might be good to add a limit to the number of core dumps that can be uploaded. I can't find the task link anymore, but I have seen a task that tried to upload 300+ core dumps because it was a suite with many tests and every one of them failed, producing a core dump. An arbitrary limit of 50 or so core dumps seems weird but is probably "good enough" for developers to get the information they need. I am not sure whether there is a good way to prioritize which core dumps should be uploaded in that case.
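A cap like the one suggested above could look something like this. The most-recent-modification-time ordering is just one heuristic (the ticket notes no clear prioritization exists), and the function name and the `limit=50` default are assumptions for illustration:

```python
import os


def select_core_dumps(paths, limit=50):
    """Keep at most `limit` core dump paths for upload.

    Prioritizes the most recently modified dumps, on the guess that
    later failures are more likely to be what a developer wants to
    inspect first. This ordering is a placeholder heuristic.
    """
    if len(paths) <= limit:
        return list(paths)
    return sorted(paths, key=os.path.getmtime, reverse=True)[:limit]
```

The upload step would then iterate over `select_core_dumps(all_dumps)` instead of the full list, which also bounds the disk space needed for any per-dump processing.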


Generated at Thu Feb 08 06:50:14 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.