[SERVER-22257] Remove the THP "defrag" startup warning Created: 21/Jan/16  Updated: 23/Feb/23

Status: Backlog
Project: Core Server
Component/s: Diagnostics
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Minor - P4
Reporter: Alexander Komyagin Assignee: Backlog - Service Architecture
Resolution: Unresolved Votes: 1
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
is duplicated by SERVER-18989 Test for THP is not accurate Closed
Related
is related to SERVER-18989 Test for THP is not accurate Closed
Assigned Teams:
Service Arch
Participants:

 Description   

In recent Linux releases it is enough to turn off THP: defrag shouldn't occur if THP is disabled. (Rumor has it that this wasn't completely true in old Linux kernels.)

Acceptance Criteria: either completely remove the startup warning code (optionally adding documentation on Linux Transparent Huge Page defragmentation and its implications for mongod), or update the code appropriately.



 Comments   
Comment by Daniel Morilha (Inactive) [ 13/Apr/22 ]

According to Wikipedia, kernel 3.10 ceased by the second half of 2013. According to the MongoDB version history, that maps to MongoDB 3.0 (or earlier), which makes this ticket less relevant nowadays.

With that said, the code as it currently stands looks incomplete compared with the previous comment, or even with the linked document, last updated on May 13, 2017: the current code only looks at synchronous defragmentation. It also runs the risk of misreporting on different or future kernel versions.
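
To illustrate the two gaps named above (only the synchronous setting is examined, and unfamiliar kernel output may be misreported), here is a minimal Python sketch, not the mongod code. The function names and the KNOWN_SYNC_VALUES set are hypothetical; the sketch assumes transparent_hugepage/defrag marks the active choice with square brackets and that khugepaged/defrag holds "0" or "1", as on mainline kernels. Anything it does not recognise is reported as unknown (None) rather than guessed.

```python
import re
from typing import Optional

# Values this sketch knows how to interpret; anything else is "unknown".
# (Assumption: this list matches the kernels being targeted.)
KNOWN_SYNC_VALUES = {"always", "defer", "defer+madvise", "madvise", "never"}


def parse_sync_defrag(raw: str) -> Optional[str]:
    """Extract the bracketed choice from transparent_hugepage/defrag.

    Returns None when nothing is bracketed or the choice is not one we
    recognise, so the caller can skip the warning instead of guessing.
    """
    match = re.search(r"\[([^\]]+)\]", raw)
    if match is None:
        return None
    value = match.group(1)
    return value if value in KNOWN_SYNC_VALUES else None


def parse_async_defrag(raw: str) -> Optional[bool]:
    """Interpret khugepaged/defrag, assumed to hold '0' or '1'."""
    text = raw.strip()
    if text == "1":
        return True
    if text == "0":
        return False
    return None  # unfamiliar format: report as unknown
```

Returning None for unfamiliar input is a design choice of this sketch: a startup check built this way would stay silent on a kernel it does not understand instead of emitting a possibly wrong warning.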

My suggestion is, unless these settings are mandatory, to deprecate the code and provide the same warnings as documentation, leaving parts of the task to readers who are looking to optimize their deployment. If, on the other hand, these settings are mandatory, then the code should be updated and mongod should terminate gracefully.

Comment by Lauren Lewis (Inactive) [ 09/Nov/21 ]

We haven’t heard back from you for some time, so I’m going to close this ticket. If this is still an issue for you, please provide additional information and we will reopen the ticket.

Comment by Eric Milkie [ 20/Apr/18 ]

Transferring to platform team, as SERVER-18989 is related to this and is already on Platform.

Comment by Martin Bligh [ 24/Feb/16 ]

Seems reasonable, though I haven't checked the details. We also need to be careful about what happens if they tweak this again in the future, I guess.

Comment by Alexander Komyagin [ 24/Feb/16 ]

Or we can just say that for kernels starting with 3.10 for sure

Comment by Martin Bligh [ 23/Feb/16 ]

Sounds like we'd have to work out the exact kernel version where this changed over, including all distro branches, etc, and explain this to the user clearly ... seems like a non-trivial amount of effort

Comment by Alexander Komyagin [ 27/Jan/16 ]

If I'm reading https://www.kernel.org/doc/Documentation/vm/transhuge.txt correctly, transparent_hugepage/defrag is responsible for synchronous defrag during huge page allocation:

It's also possible to limit defrag efforts in the VM to generate
hugepages in case they're not immediately free to madvise regions or
to never try to defrag memory and simply fallback to regular pages
unless hugepages are immediately available.

As such, it shouldn't matter if huge pages are disabled.

khugepaged/defrag is responsible for async defrag:

khugepaged runs usually at low frequency so while one may not want to
invoke defrag algorithms synchronously during the page faults, it
should be worth invoking defrag at least in khugepaged.

But khugepaged shouldn't be running if huge pages are disabled:

khugepaged will be automatically started when
transparent_hugepage/enabled is set to "always" or "madvise, and it'll
be automatically shutdown if it's set to "never".
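
The reasoning above can be sketched in Python as follows. This is an illustration under stated assumptions, not the existing mongod check: the helper names are hypothetical, the sysfs directory is the standard mainline location (some older RHEL kernels used a redhat_transparent_hugepage directory, which is ignored here), and the policy simply encodes the reading that the defrag setting is irrelevant once transparent_hugepage/enabled is "never".

```python
import re
from pathlib import Path
from typing import Optional

# Standard mainline sysfs location (assumption: no distro-specific path).
THP_DIR = Path("/sys/kernel/mm/transparent_hugepage")


def selected_option(raw: str) -> Optional[str]:
    """Return the bracketed token from e.g. 'always madvise [never]'."""
    match = re.search(r"\[([^\]]+)\]", raw)
    return match.group(1) if match else None


def read_option(name: str, base: Path = THP_DIR) -> Optional[str]:
    """Read one THP setting; None if the file is missing or unreadable."""
    try:
        return selected_option((base / name).read_text())
    except OSError:
        return None


def defrag_warning_needed(enabled: Optional[str], defrag: Optional[str]) -> bool:
    """Hypothetical policy: defrag only matters while THP itself is on."""
    if enabled is None or enabled == "never":
        # THP is off (or its state is unknown): no defrag warning.
        return False
    return defrag == "always"
```

A caller would pass `read_option("enabled")` and `read_option("defrag")` to `defrag_warning_needed`. With THP set to "never" the function returns False regardless of the defrag value, which is the behaviour this ticket argues for.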
