[SERVER-46805] Validate should limit memory use in its second pass Created: 11/Mar/20  Updated: 06/Nov/23  Resolved: 19/Mar/20

Status: Closed
Project: Core Server
Component/s: Storage
Affects Version/s: None
Fix Version/s: 4.7.0, 4.4.7

Type: Bug Priority: Major - P3
Reporter: Geert Bosch Assignee: Gregory Noma
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Depends
Documented
is documented by DOCS-13533 Investigate changes in SERVER-46805: ... Closed
Related
Backwards Compatibility: Minor Change
Operating System: ALL
Backport Requested:
v4.4
Sprint: Execution Team 2020-03-23
Participants:

 Description   

The validate command runs in two passes. The first pass matches hashes of index key/value pairs generated by the documents against hashes of index key/value pairs appearing in the indexes by incrementing/decrementing counters depending on the hash. If any of the counters is non-zero in the end, there is corruption and a second pass will compute complete sets of extra/missing keys corresponding to each non-zero counter. With lots of corruption, this may require an amount of memory similar to or somewhat greater than the size of all indexes, possibly exhausting memory resources in the server.
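As an illustration of the first pass described above, here is a minimal sketch of the counter scheme. It is not the server's implementation: the bucket count, the hash function, and the names `bucket` and `first_pass` are all assumptions made for this example.

```python
import hashlib
from collections import Counter

# Hypothetical bucket count; the real server's table size may differ.
NUM_BUCKETS = 1 << 16


def bucket(index_name, key, record_id):
    """Map an index entry to a counter bucket via a stable hash (assumed scheme)."""
    data = repr((index_name, key, record_id)).encode()
    digest = hashlib.sha256(data).digest()
    return int.from_bytes(digest[:8], "big") % NUM_BUCKETS


def first_pass(doc_entries, index_entries):
    """Return the buckets whose counters are non-zero after both scans.

    doc_entries:   entries generated from the documents (increment).
    index_entries: entries actually found in the indexes (decrement).
    """
    counters = Counter()
    for entry in doc_entries:
        counters[bucket(*entry)] += 1
    for entry in index_entries:
        counters[bucket(*entry)] -= 1
    # Any non-zero counter signals a missing or extra index entry.
    return {b: c for b, c in counters.items() if c != 0}
```

A positive counter means the documents generated a key that the index lacks; a negative one means the index holds a key that no document generated. Only the counters are held in memory during this pass, which is why it is cheap compared with the second pass.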

Instead, the second pass should estimate its expected memory usage based on the number of non-zero counters. If this estimate exceeds some predetermined value (either a fixed constant such as 100 MB or 1 GB, or a fraction of available memory), it should warn about potential excessive memory usage and limit itself to collecting keys corresponding to some (maybe just 1 or 2) counters. The validation result should clearly indicate that the results are incomplete and that more corruption exists.
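The proposal above could be sketched roughly as follows. This is only an illustration under stated assumptions: the function name `plan_second_pass`, the flat per-bucket size estimate, and the shape of the result are hypothetical and are not taken from the actual fix.

```python
def plan_second_pass(nonzero_buckets, est_bytes_per_bucket, limit_bytes):
    """Choose which non-zero buckets to expand under a memory budget.

    est_bytes_per_bucket is an assumed flat estimate of the memory needed
    to collect the keys for one bucket. At least one bucket is always
    expanded so that validate can report some concrete inconsistency.
    """
    selected = []
    used = 0
    for b in sorted(nonzero_buckets):
        # Stop once the next bucket would push the estimate over the limit,
        # unless nothing has been selected yet.
        if selected and used + est_bytes_per_bucket > limit_bytes:
            break
        selected.append(b)
        used += est_bytes_per_bucket

    incomplete = len(selected) < len(nonzero_buckets)
    warnings = []
    if incomplete:
        warnings.append(
            "Memory limit reached; results are incomplete and "
            "more corruption exists than is reported."
        )
    return {
        "buckets": selected,
        "estimated_bytes": used,
        "incomplete": incomplete,
        "warnings": warnings,
    }
```

For example, with three non-zero buckets, an estimate of 100 bytes each, and a 250-byte limit, the plan expands two buckets and flags the result as incomplete. Always expanding at least one bucket follows the "maybe just 1 or 2" suggestion in the description.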



 Comments   
Comment by Vincent Do [ 18/Sep/22 ]

Nvm - I see the linked DOCS ticket. I will re-open those

Comment by Vincent Do [ 18/Sep/22 ]

Can we file a DOCS ticket to update our docs? It currently says that this behavior exists only in 5.0+ https://www.mongodb.com/docs/manual/reference/parameters/#mongodb-parameter-param.maxValidateMemoryUsageMB

Comment by Githook User [ 04/Jun/21 ]

Author:

{'name': 'Gregory Noma', 'email': 'gregory.noma@gmail.com', 'username': 'gregorynoma'}

Message: SERVER-46805 Limit memory usage in the second phase of validate and add parameter to configure this limit

(cherry picked from commit e398960773331f3b3afe000f2830c84868aaf9e7)
Branch: v4.4
https://github.com/mongodb/mongo/commit/057fff6d4dc4dc8f739abfc1e9a497ad3abb32ce

Comment by Githook User [ 19/Mar/20 ]

Author:

{'name': 'Gregory Noma', 'username': 'gregorynoma', 'email': 'gregory.noma@gmail.com'}

Message: SERVER-46805 Limit memory usage in the second phase of validate and add parameter to configure this limit
Branch: master
https://github.com/mongodb/mongo/commit/e398960773331f3b3afe000f2830c84868aaf9e7
