[SERVER-10443] Compact command with LOW priority Created: 06/Aug/13  Updated: 06/Dec/22  Resolved: 04/Mar/19

Status: Closed
Project: Core Server
Component/s: Storage, Usability
Affects Version/s: 2.4.3
Fix Version/s: None

Type: Improvement Priority: Major - P3
Reporter: Kevin J. Rice Assignee: Backlog - Storage Execution Team
Resolution: Done Votes: 2
Labels: compaction, indexing, performance, sharding, storage
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Documented
is documented by DOCS-9559 Docs for SERVER-10443: Compact comman... Closed
Related
is related to SERVER-1256 low priority write flag Closed
Assigned Teams:
Storage Execution
Participants:

 Description   

The 'compact' command should be runnable at low priority on any system without affecting existing performance.

Current 'compact' command restrictions:

  • only runs on replica set secondaries by default, unless forced;
  • locks entire collection, preventing any activity until finished.

This ticket covers the straightforward case of regular maintenance. That is, the compact command should be runnable at any time, or set up as a background process. It should scan through all chunks, one at a time, addressing the least recently used chunk first.

Processing:

  • The least recently used chunk should be mapped and paged in to a special holding area, where the following actions should be performed on it:
  • all documents sorted in primary/shard-key order;
  • documents snugged up against one another or spaced out, per the requirements specified in the compact command's padding factors;
  • OPTIONAL: As a further step, if any of a document's fields are indexed, those index entries should be verified and possibly corrected.
  • Processed chunk identifiers should be stored so the same chunks are not repeatedly processed in any n-hour period.
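
The selection and bookkeeping steps above could be sketched roughly as follows. This is only an illustration of the idea, not server code: the `Chunk` class, its `last_used` timestamp, and the `processed_at` map are hypothetical names invented for this example, and padding and index verification are left out.

```python
from dataclasses import dataclass, field


@dataclass
class Chunk:
    chunk_id: str
    last_used: float                      # hypothetical "last access" timestamp, in seconds
    docs: list = field(default_factory=list)


def pick_next_chunk(chunks, processed_at, now, window_hours):
    """Return the least recently used chunk not processed within the last
    window_hours, or None if every chunk was processed recently."""
    cutoff = now - window_hours * 3600
    eligible = [
        c for c in chunks
        if processed_at.get(c.chunk_id, float("-inf")) < cutoff
    ]
    return min(eligible, key=lambda c: c.last_used, default=None)


def compact_chunk(chunk, shard_key, processed_at, now):
    """Sort the chunk's documents in shard-key order and remember when
    the chunk was processed so it is skipped for the next n hours."""
    chunk.docs.sort(key=lambda d: d[shard_key])
    processed_at[chunk.chunk_id] = now
```

Keeping the processed-chunk map separate from the chunks themselves means the n-hour window can be changed without touching any chunk data.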

This could be done without affecting performance by querying the lock percentage on the individual shards once per minute and pausing compaction I/O whenever it is higher than 80%. Alternatively, the I/O utilization percentage reported by iostat would give an idea of whether the chunk's location can handle more I/O.
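
A minimal sketch of that pause-and-poll behavior, under stated assumptions: `read_lock_pct` is a hypothetical callable that returns the current lock percentage (how it is obtained is not specified here), and the 80% threshold for I/O utilization is an assumption, since the request only gives a threshold for the lock percentage.

```python
import time


def should_pause(lock_pct, io_util_pct=None, lock_threshold=80.0, io_threshold=80.0):
    """Pause when the lock percentage exceeds the threshold. The optional
    I/O utilization check uses an assumed threshold of 80%."""
    if lock_pct > lock_threshold:
        return True
    return io_util_pct is not None and io_util_pct > io_threshold


def run_throttled(work_items, do_work, read_lock_pct, poll_seconds=60.0, sleep=time.sleep):
    """Process items one at a time, re-polling the lock percentage every
    poll_seconds while the system is too busy. Returns the number processed."""
    done = 0
    for item in work_items:
        while should_pause(read_lock_pct()):
            sleep(poll_seconds)
        do_work(item)
        done += 1
    return done
```

The `sleep` parameter is injectable only so the loop can be exercised without actually waiting a minute per poll.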

An additional parameter could be a rate limit on the number of chunks processed per minute/hour, or a max percentage of available IO to use.
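
One way such a rate limit might look is a sliding-window counter, sketched below. The class name `ChunkRateLimiter` and its parameters are hypothetical, chosen for this example; the caller supplies the current time in seconds.

```python
from collections import deque


class ChunkRateLimiter:
    """Allow at most max_chunks chunks in any window of per_seconds seconds."""

    def __init__(self, max_chunks, per_seconds):
        self.max_chunks = max_chunks
        self.per_seconds = per_seconds
        self._stamps = deque()            # start times of recently processed chunks

    def try_acquire(self, now):
        """Return True and record the chunk if it may be processed at time now."""
        # Forget chunks that have fallen out of the window.
        while self._stamps and now - self._stamps[0] >= self.per_seconds:
            self._stamps.popleft()
        if len(self._stamps) < self.max_chunks:
            self._stamps.append(now)
            return True
        return False
```

For example, `ChunkRateLimiter(10, 3600.0)` would express "at most ten chunks per hour".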



 Comments   
Comment by Kevin J. Rice [ 12/Aug/13 ]

I would like to revise this case slightly.

Base problem(s) to solve:

  • We cannot compact a database that's in use without the somewhat-major action of stepping down the primaries.
  • We cannot run compaction when there is a large constant load on Mongo, either.

Optional work associated with this case: data verification.

  • Verify that documents are sorted into shard-key order within each chunk, so as to speed disk IO and reduce IOPS for batch jobs that update each document in shard-key order.
  • Verify indexes contain correct references to documents. Might be too hard to coordinate this activity and not be worth it, thus it's optional.

I'd like to modify the above comments to say that it doesn't matter how this compaction is done, or in what order, as long as it's done with a low/background priority.

Comment by Kevin J. Rice [ 12/Aug/13 ]

Note that the state of this process could be kept in a single variable containing the shard key we're concerned with. This state variable should be re-read at the end of processing every chunk, so no restart of mongos/mongod/etc. is required in order to make it work differently.
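
As a rough sketch of that idea, assuming the state lives somewhere outside the process: `load_state` and `save_position` are hypothetical callables, `resume_after` is the single shard-key variable, and the `paused` flag is an extra assumption added to show a setting taking effect without a restart.

```python
def run_compaction(chunk_min_keys, load_state, save_position, compact_one):
    """chunk_min_keys: sorted lower-bound shard keys, one per chunk.
    The state is re-read before every chunk, so external changes to it
    take effect at the next chunk boundary."""
    processed = []
    while True:
        state = load_state()
        if state.get("paused"):
            break
        resume_after = state.get("resume_after")
        pending = [k for k in chunk_min_keys
                   if resume_after is None or k > resume_after]
        if not pending:
            break
        key = pending[0]
        compact_one(key)
        save_position(key)                # persist only the position
        processed.append(key)
    return processed
```

Writing back only the position, rather than the whole state, avoids overwriting a setting that an operator changed while a chunk was being processed.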

OPTIONAL FOLLOW-ON CASE: Perform Deferred splitChunks.

If the chunk is oversized, it should be split as it would have been normally. This is only a consideration because it is possible during high-load situations (e.g., mongorestore) to have a chunk that is larger than the standard 64 MB. Normally the chunk would have to be written to before the split-chunk logic is called, but if we already have the chunk in memory, we might as well perform any splits required.
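
A simplified illustration of a deferred split follows. It is not how the server chooses split points; it just cuts a shard-key-ordered list of documents greedily so that each piece stays within the size limit. The `size_of` callable is a hypothetical stand-in for however document size would be measured.

```python
MAX_CHUNK_BYTES = 64 * 1024 * 1024        # the standard 64 MB mentioned above


def split_if_oversized(docs, size_of, max_bytes=MAX_CHUNK_BYTES):
    """docs must already be in shard-key order. Returns a list of pieces,
    each within max_bytes unless a single document alone exceeds it."""
    pieces, current, current_bytes = [], [], 0
    for doc in docs:
        n = size_of(doc)
        if current and current_bytes + n > max_bytes:
            pieces.append(current)        # close the piece before it overflows
            current, current_bytes = [], 0
        current.append(doc)
        current_bytes += n
    if current:
        pieces.append(current)
    return pieces
```

A chunk that is not oversized comes back as a single piece, so the same call covers both cases.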

OPTIONAL FOLLOW-ON CASE: Like the balancer, this compact command could be turned on and off for a set of time periods (so we could run it during specific overnight hours only).
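
A small sketch of such a time-window check, with hypothetical function names; it handles windows that wrap past midnight (for example 23:00 to 06:00) and treats the stop time as exclusive, which is an assumption.

```python
from datetime import time


def in_active_window(now, start, stop):
    """True if now falls in [start, stop), allowing the window to wrap midnight."""
    if start <= stop:
        return start <= now < stop
    return now >= start or now < stop


def in_any_window(now, windows):
    """windows is a list of (start, stop) pairs; True if now is inside any of them."""
    return any(in_active_window(now, start, stop) for start, stop in windows)
```

The compaction loop would check `in_any_window` before starting each chunk and simply wait when it returns False.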
