[SERVER-82551] Use parallel compressor to speedup binaries archival Created: 29/Oct/23 Updated: 12/Nov/23 Resolved: 02/Nov/23 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | 7.2.0-rc0 |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Tommaso Tocci | Assignee: | Tommaso Tocci |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Assigned Teams: | Server Development Platform |
| Backwards Compatibility: | Fully Compatible |
| Sprint: | Dev Tools 2020-04-06 |
| Participants: | |
| Description |
Summary

Using the pigz parallel compressor to create the binary tarball would reduce the archive_dist_test_debug task runtime from ~14 min to ~5 min.

Long description

The majority of Evergreen tasks will run only after the mongo binaries have been compiled, compressed, and uploaded to S3. For the amazon linux 2 variant these steps take roughly:

The archive_dist_test_debug task is mostly composed of two parts. The compression is performed with a plain tar command (see the sketch below).

tar by default uses a single-threaded compression algorithm, which means we are using only 1 of the 16 cores available (we currently use amazon2-arm64-large for this task). It is possible to simply tell the tar command to use the parallel compressor pigz to make use of all the available cores. A quick experiment showed that using pigz reduces the tar command execution time from 9.22 min to 35 seconds. |
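A minimal sketch of the change, assuming a typical GNU tar invocation (the archive name, input directory, and thread count below are illustrative, not the actual Evergreen configuration):

```bash
# Baseline: tar's -z flag pipes through single-threaded gzip,
# so only 1 of the instance's 16 cores does any work.
tar -czf mongo-binaries.tgz dist-test/

# Parallel: delegate compression to pigz via GNU tar's
# --use-compress-program option. pigz uses all online cores by
# default; -p caps the worker count explicitly.
tar --use-compress-program="pigz -p $(nproc)" -cf mongo-binaries.tgz dist-test/
```

Because pigz writes a standard gzip stream, the resulting .tgz can still be extracted with plain `tar -xzf` on machines that do not have pigz installed.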
| Comments |
| Comment by Alex Neben [ 12/Nov/23 ] |
|
From my quick playing around with things, this was the best query I could come up with. I just checked today and it doesn't seem to be making a big dent, so my feeling is that we do not need to backport this. |
| Comment by Tommaso Tocci [ 06/Nov/23 ] |
|
alex.neben@mongodb.com is your query monitoring the archive_dist_test_debug task specifically? I think we won't see much improvement in the other `scons compile` steps if there are no binaries to archive. |
| Comment by Alex Neben [ 05/Nov/23 ] |
|
I made this query that does a (hopefully) good job of teasing out your change, to see if it has a major impact overall. If the answer is yes, we can decide on a backport based on these numbers. Either way this is an awesome change! Great work! |
| Comment by Githook User [ 02/Nov/23 ] |
|
Author: {'name': 'Tommaso Tocci', 'email': 'tommaso.tocci@mongodb.com', 'username': 'toto-dev'}
Message: |
| Comment by Alex Neben [ 31/Oct/23 ] |
|
I think whoever is on triage rotation this (or next) week should try to take this on. This is a huge win! tommaso.tocci@mongodb.com, if you are already working on it I don't want to steal it from you, so just communicate on this ticket whether you want to take it over the finish line or not. |
| Comment by Trevor Guidry [ 30/Oct/23 ] |
|
tommaso.tocci@mongodb.com Alright, I will assume the python package is just unrelated and broken. Thanks for the answer. |
| Comment by Tommaso Tocci [ 30/Oct/23 ] |
|
trevor.guidry@mongodb.com pigz v2.3 was released 10 years ago and nowadays it is shipped as a standard package in many Linux distros. I don't know for sure that pigz works correctly, but I would assume so. |
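One cheap way to gain confidence in pigz itself (as opposed to the Python wrapper) is to round-trip an archive and verify it with stock gzip, since pigz emits standard RFC 1952 gzip streams. A sketch, with illustrative file and directory names:

```bash
# Compress with pigz, then integrity-check with single-threaded gzip:
# -t tests the stream without extracting anything.
tar --use-compress-program=pigz -cf binaries.tgz dist-test/
gzip -t binaries.tgz && echo "valid gzip stream"

# Full round trip: extract and byte-compare against the originals.
mkdir -p /tmp/verify
tar -xzf binaries.tgz -C /tmp/verify
diff -r dist-test /tmp/verify/dist-test && echo "contents identical"
```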
| Comment by Trevor Guidry [ 30/Oct/23 ] |
|
Is pigz reliable? I tried using a pigz Python library (https://github.com/bguise987/pigz-python) before, and some of the files it compressed were corrupted. It is possible the Python library is implemented incorrectly and the pigz binary itself is fine. |