[SERVER-74038] [Windows] Possible negative performance effects of SetProcessWorkingSetSize in SecureAllocator Created: 15/Feb/23 Updated: 29/Oct/23 Resolved: 27/Feb/23 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | 7.0.0-rc0, 4.4.20, 5.0.16, 6.0.6 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Louis Williams | Assignee: | Mark Benvenuto |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Attachments: | |
| Issue Links: | |
| Assigned Teams: | Server Security |
| Backwards Compatibility: | Fully Compatible |
| Operating System: | ALL |
| Backport Requested: | v6.0, v5.0, v4.4 |
| Sprint: | Security 2023-03-06 |
| Participants: | |
| Description |
|
Our SecureAllocator attempts to lock memory pages so that the OS does not page them to disk. In Linux we use mlock, and in Windows we use VirtualLock. Windows imposes limits on the number of pages that can be locked by VirtualLock. From the documentation:
Even when the pagefile is disabled, we have observed strange performance behavior where Windows moves essentially all of mongod's memory from the Active state to the Inactive state. This is accompanied by very long stalls. In FTDC, the observed effect is that mongod reports no resident memory. My theory is that SetProcessWorkingSetSize tells the OS that certain memory is important and, by implication, that everything outside of the working set is not. As a result of setting the working set to a small value in order to lock pages, the OS also decides to mark the remaining resident memory as "Inactive", even if there is plenty of free memory available in the system. This has serious performance implications that we should investigate and understand. It seems like we need to either disprove this theory, find a different API, or always use SetProcessWorkingSetSize to ensure that the entire process's memory is treated as important. |
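For readers unfamiliar with the pattern under discussion, the following is a minimal sketch of locking pages and growing the working set limits when VirtualLock hits its quota. It is not the actual SecureAllocator code: the names `WorkingSetLimits`, `grownLimits`, and `lockPages` are hypothetical, and the policy of raising both limits by exactly the requested byte count is an assumption made for illustration.

```cpp
#include <cassert>
#include <cstddef>

#ifdef _WIN32
#include <windows.h>
#else
#include <sys/mman.h>
#endif

// Hypothetical type: the process working set limits, in bytes.
struct WorkingSetLimits {
    std::size_t min;
    std::size_t max;
};

// Hypothetical policy (an assumption): raise both limits by the number of
// bytes we want to lock, since the minimum bounds how much can be locked.
WorkingSetLimits grownLimits(WorkingSetLimits current, std::size_t bytes) {
    assert(current.min <= current.max);
    return {current.min + bytes, current.max + bytes};
}

// Returns true if [ptr, ptr + bytes) was locked into physical memory.
bool lockPages(void* ptr, std::size_t bytes) {
#ifdef _WIN32
    if (VirtualLock(ptr, bytes)) {
        return true;
    }
    if (GetLastError() != ERROR_WORKING_SET_QUOTA) {
        return false;
    }
    // The lock quota was exceeded: read the current limits, grow them,
    // and retry once. This is the Get/Set pair the ticket questions.
    HANDLE self = GetCurrentProcess();
    SIZE_T minSize = 0;
    SIZE_T maxSize = 0;
    if (!GetProcessWorkingSetSize(self, &minSize, &maxSize)) {
        return false;
    }
    WorkingSetLimits next = grownLimits({minSize, maxSize}, bytes);
    if (!SetProcessWorkingSetSize(self, next.min, next.max)) {
        return false;
    }
    return VirtualLock(ptr, bytes) != 0;
#else
    // On Linux the description says mlock is used instead.
    return mlock(ptr, bytes) == 0;
#endif
}
```

Note that the limits set this way are small relative to a large mongod process, which is the condition the theory above is concerned with.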
| Comments |
| Comment by Githook User [ 20/Mar/23 ] |
|
Author: {'name': 'Mark Benvenuto', 'email': 'mark.benvenuto@mongodb.com', 'username': 'markbenvenuto'} Message: (cherry picked from commit db5ca2947f37d6706c01fe24d6294af75b6418c9) |
| Comment by Githook User [ 20/Mar/23 ] |
|
Author: {'name': 'Mark Benvenuto', 'email': 'mark.benvenuto@mongodb.com', 'username': 'markbenvenuto'} Message: (cherry picked from commit db5ca2947f37d6706c01fe24d6294af75b6418c9) |
| Comment by Githook User [ 20/Mar/23 ] |
|
Author: {'name': 'Mark Benvenuto', 'email': 'mark.benvenuto@mongodb.com', 'username': 'markbenvenuto'} Message: (cherry picked from commit db5ca2947f37d6706c01fe24d6294af75b6418c9) |
| Comment by Githook User [ 27/Feb/23 ] |
|
Author: {'name': 'Mark Benvenuto', 'email': 'mark.benvenuto@mongodb.com', 'username': 'markbenvenuto'} Message: |
| Comment by Mark Benvenuto [ 22/Feb/23 ] |
|
Fundamentally, the issue is that the pair of calls GetProcessWorkingSetSize/SetProcessWorkingSetSize does not do what was originally expected. The original fix ( This issue can be easily demonstrated by a custom repro using a unit test (see https://github.com/mongodb/mongo/compare/master...markbenvenuto:mongo:secure_allocator_measure?expand=1). The repro simply tries to gobble up memory, with 1% of it being "secure" memory. By observing memory consumption from GetProcessWorkingSetSize and GetProcessMemoryInfo, we can see how the working set size oscillates as the test program consumes more memory and Windows is periodically told to empty the working set. |
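As a rough illustration of the kind of measurement described above, here is a minimal sketch that samples the working set limits and resident size, and counts how often the resident size drops between samples. This is not the code from the linked branch: `WorkingSetSample`, `sampleWorkingSet`, and `countDrops` are hypothetical names, and treating any decrease between consecutive samples as one "oscillation" is an assumption. Depending on the toolchain, GetProcessMemoryInfo may require linking against Psapi.lib.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

#ifdef _WIN32
#include <windows.h>
#include <psapi.h>
#endif

// Hypothetical record of one observation of the current process.
struct WorkingSetSample {
    std::size_t minLimit;  // minimum from GetProcessWorkingSetSize
    std::size_t maxLimit;  // maximum from GetProcessWorkingSetSize
    std::size_t resident;  // WorkingSetSize from GetProcessMemoryInfo
};

// Fills `out` and returns true on Windows; returns false elsewhere.
bool sampleWorkingSet(WorkingSetSample* out) {
    assert(out != nullptr);
#ifdef _WIN32
    HANDLE self = GetCurrentProcess();
    SIZE_T minSize = 0;
    SIZE_T maxSize = 0;
    PROCESS_MEMORY_COUNTERS pmc = {};
    if (!GetProcessWorkingSetSize(self, &minSize, &maxSize)) {
        return false;
    }
    if (!GetProcessMemoryInfo(self, &pmc, sizeof(pmc))) {
        return false;
    }
    *out = {minSize, maxSize, pmc.WorkingSetSize};
    return true;
#else
    (void)out;
    return false;
#endif
}

// Counts how many times the resident size decreased between consecutive
// samples. A steadily growing process should report zero drops.
std::size_t countDrops(const std::vector<std::size_t>& resident) {
    std::size_t drops = 0;
    for (std::size_t i = 1; i < resident.size(); ++i) {
        if (resident[i] < resident[i - 1]) {
            ++drops;
        }
    }
    return drops;
}
```

A repro in this style would call `sampleWorkingSet` after each batch of allocations and pass the collected `resident` values to `countDrops`; a non-zero count while the process is only allocating would be consistent with the oscillation reported here.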