- Type: Improvement
- Resolution: Gone away
- Priority: Major - P3
- Affects Version/s: None
- Component/s: None
Summary
MongoDB regularly collects BLOCK_REUSE_BYTES for every table in order to report accurately how much disk space collections and indexes use. This figure is ultimately used to bill customers fairly.
Reading this metric opens the table's dhandle if it is not already open, which takes the dhandle lock. Querying the metric also takes the schema lock. Unfortunately, this has significant performance consequences when the metric is collected every few seconds for all collections and indexes in MongoDB, so we had to disable this behavior by default in SERVER-62277.
Additionally, scanning through the disk blocks of large tables has outsized I/O costs.
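To make the cost concrete, the locking behavior described in this summary can be modeled with a toy sketch. Nothing here is WiredTiger or MongoDB code: `ToyEngine`, `poll_reuse_bytes`, `poll_all`, and the table count of 1000 are hypothetical. The only behavior modeled is what the summary states, namely that each query takes the schema lock and that an idle table's dhandle is opened under the dhandle lock.

```python
from dataclasses import dataclass, field


@dataclass
class ToyEngine:
    """Toy stand-in for a storage engine; all names are hypothetical."""
    open_handles: set = field(default_factory=set)
    schema_lock_acquisitions: int = 0
    dhandle_lock_acquisitions: int = 0

    def poll_reuse_bytes(self, table: str) -> int:
        # Per the summary: querying the metric takes the schema lock.
        self.schema_lock_acquisitions += 1
        # Per the summary: an idle table's dhandle is opened, which takes
        # the dhandle lock.
        if table not in self.open_handles:
            self.dhandle_lock_acquisitions += 1
            self.open_handles.add(table)
        return 0  # placeholder; the toy does not model disk blocks


def poll_all(engine: ToyEngine, tables: list) -> None:
    """One polling pass over every table, as the periodic collector would do."""
    for table in tables:
        engine.poll_reuse_bytes(table)


engine = ToyEngine(open_handles={"table:0"})  # one table already in use
tables = [f"table:{i}" for i in range(1000)]
poll_all(engine, tables)  # first polling pass
poll_all(engine, tables)  # second pass a few seconds later

print(engine.schema_lock_acquisitions, engine.dhandle_lock_acquisitions)
```

In this toy, every pass takes the schema lock once per table, and the first pass opens every idle handle. The sketch assumes handles stay open between passes; if idle handles were closed again in the meantime, the dhandle cost would recur on each pass.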
Motivation
This is severely affecting the performance of certain shared MongoDB Cloud instances. The metric must be queried frequently, at regular intervals, for all tables in order to bill customers accurately.
- How likely is it that this use case or problem will occur?
Any MongoDB installation with many collections and indexes.
- If the problem does occur, what are the consequences and how severe are they?
Read/write unavailability
- Is this issue urgent?
Somewhat. We need to come up with a solution somewhere between the WiredTiger and MongoDB layers to report this information in a cheaper way.
Acceptance Criteria (Definition of Done)
Expose this statistic for tables in a less impactful way.
- Testing
Should ensure that querying this statistic neither takes locks nor opens idle dhandles.
- Documentation update
Should probably explain the behavior of this statistic.
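The testing criterion could be checked along the following lines. This is a minimal sketch of the shape of such a test, not a proposed implementation: `ToyStatsSource`, `record`, `cheap_reuse_bytes`, and `check_no_side_effects` are hypothetical names, and serving a previously recorded value is only one assumed way the statistic might be made cheaper.

```python
class ToyStatsSource:
    """Hypothetical cheap path: serve the last recorded value, never open handles."""

    def __init__(self):
        self.open_handles = set()
        self.lock_acquisitions = 0
        self._last_reuse_bytes = {}

    def record(self, table, reuse_bytes):
        # Assumed to be called from whatever path already has the table open,
        # so remembering the value needs no extra handle or lock in this toy.
        self._last_reuse_bytes[table] = reuse_bytes

    def cheap_reuse_bytes(self, table):
        # Read-only lookup; returns None if no value was ever recorded.
        return self._last_reuse_bytes.get(table)


def check_no_side_effects(source, tables):
    """Query every table and fail if a lock was taken or a handle was opened."""
    handles_before = set(source.open_handles)
    locks_before = source.lock_acquisitions
    values = [source.cheap_reuse_bytes(t) for t in tables]
    assert source.open_handles == handles_before, "query opened an idle dhandle"
    assert source.lock_acquisitions == locks_before, "query took a lock"
    return values


source = ToyStatsSource()
source.record("table:a", 4096)
values = check_no_side_effects(source, ["table:a", "table:b"])
```

A real test would replace the toy counters with whatever lock and handle statistics the engine actually exposes; the point is only that the assertion compares state before and after the query.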
- is related to: WT-8582 Expand extent lists to collect GC information (Open)
- related to: SERVER-62277 Performance regression from dbstats due to occupied disk space calculation (Closed)