MongoDB regularly collects BLOCK_REUSE_BYTES for every table so that the disk usage of collections and indexes can be represented accurately. This figure is ultimately used to bill customers fairly.
Querying this metric takes the schema lock and, if the table's dhandle is not already open, opens it, which takes the dhandle lock. This has significant performance consequences because MongoDB collects the metric every few seconds for every collection and index. We had to disable this behavior by default in
Additionally, scanning through the disk blocks of large tables has outsized I/O costs.
This severely affects the performance of certain shared MongoDB Cloud instances, yet the metric must be queried on a frequent periodic basis for all tables in order to bill customers accurately.
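To make the cost concrete, a back-of-the-envelope calculation helps. The ticket only says the metric is collected "every few seconds for all collections and indexes"; the table count and interval below are purely illustrative numbers, not measurements.

```python
# Hypothetical workload: a dense shared instance with many collections and
# indexes, each polled for BLOCK_REUSE_BYTES on a fixed interval.
tables = 10_000      # collections + indexes (illustrative, not from the ticket)
interval_s = 5       # polling period in seconds (illustrative)

# Each poll takes the schema lock and potentially a dhandle lock, so the
# steady-state rate of lock acquisitions scales linearly with table count.
lock_acquisitions_per_s = tables / interval_s
print(lock_acquisitions_per_s)  # 2000.0
```

Even at these modest assumed numbers, the statistic alone drives thousands of lock acquisitions per second, independent of any user workload.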
- How likely is it that this use case or problem will occur?
Likely on any MongoDB installation with many collections and indexes.
- If the problem does occur, what are the consequences and how severe are they?
Instance performance degrades, severely so on certain shared MongoDB Cloud instances.
- Is this issue urgent?
Somewhat. We need a solution, somewhere between the WiredTiger and MongoDB layers, that reports this information more cheaply.
Acceptance Criteria (Definition of Done)
Expose this statistic for tables in a less impactful way.
Should ensure that querying this statistic does not take locks or open idle dhandles.
- Documentation update
Should probably explain the behavior and cost of this statistic.
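One shape a solution could take is to serve a cached value for tables whose dhandle is idle, paying the expensive read only when the handle is already open (or the cached value is very stale). This is a minimal Python sketch of that idea, not the actual implementation; the class, callbacks, and TTL are all hypothetical.

```python
import time

class CachedTableStats:
    """Illustrative cache for a per-table statistic such as BLOCK_REUSE_BYTES.

    The expensive read (which in WiredTiger would take the schema lock and
    possibly open a dhandle) runs only when the table's handle is already
    open, when there is no cached value yet, or when the cache entry has
    exceeded a generous TTL. Otherwise the last known value is returned
    without taking any locks.
    """

    def __init__(self, read_stat, handle_is_open,
                 ttl_seconds=3600.0, clock=time.monotonic):
        self._read_stat = read_stat          # expensive: may open the dhandle
        self._handle_is_open = handle_is_open
        self._ttl = ttl_seconds
        self._clock = clock
        self._cache = {}                     # table -> (value, timestamp)

    def get(self, table):
        now = self._clock()
        cached = self._cache.get(table)
        if (self._handle_is_open(table)
                or cached is None
                or now - cached[1] > self._ttl):
            value = self._read_stat(table)   # pay the cost only when needed
            self._cache[table] = (value, now)
            return value
        return cached[0]                     # idle handle: serve cached value
```

For example, with a table whose handle stays closed, only the first `get` triggers the expensive read; subsequent calls return the cached value, so billing can still be served on every polling cycle without touching idle handles.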