[SERVER-10443] Compact command with LOW priority Created: 06/Aug/13 Updated: 06/Dec/22 Resolved: 04/Mar/19 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Storage, Usability |
| Affects Version/s: | 2.4.3 |
| Fix Version/s: | None |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Kevin J. Rice | Assignee: | Backlog - Storage Execution Team |
| Resolution: | Done | Votes: | 2 |
| Labels: | compaction, indexing, performance, sharding, storage | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Assigned Teams: | Storage Execution |
| Description |
|
The 'compact' command should be runnable at low priority on any system without affecting existing performance. Current 'compact' command restrictions:
This case covers the trivial and straightforward case of regular maintenance. That is, the compact command should be runnable at any time, or set up as a background process. It should scan through all chunks, one at a time. The least recently used chunk should be addressed first. Processing:
This could be done without affecting performance by querying the lock percentage on each shard once per minute and pausing compaction I/O whenever it is higher than 80%. Alternatively, the I/O utilization percentage reported by iostat would indicate whether the storage holding the chunk can handle more I/O. Additional parameters could include a rate limit on the number of chunks processed per minute or hour, or a maximum percentage of available I/O that compaction may use. |
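The throttling idea in the description can be sketched roughly as follows. This is only an illustration in Python, not how mongod would implement it: `ThrottleConfig`, `should_pause`, and `run_compaction` are hypothetical names, and the load sampler, per-chunk compaction, sleep, and clock are passed in as callables because the ticket does not say how they would be obtained. Only the 80% lock threshold and the once-per-minute poll come from the description; the I/O threshold and the hourly chunk limit are assumed values.

```python
from dataclasses import dataclass


@dataclass
class ThrottleConfig:
    max_lock_pct: float = 80.0      # pause threshold named in the description
    max_io_util_pct: float = 80.0   # assumed threshold for iostat-style utilization
    max_chunks_per_hour: int = 60   # assumed rate limit


def should_pause(lock_pct, io_util_pct, chunks_this_hour, cfg):
    """Return True if compaction should wait before starting the next chunk."""
    if lock_pct > cfg.max_lock_pct:
        return True
    if io_util_pct > cfg.max_io_util_pct:
        return True
    return chunks_this_hour >= cfg.max_chunks_per_hour


def run_compaction(chunks, sample_load, compact_chunk, sleep, now,
                   cfg=None, poll_seconds=60):
    """Compact chunks one at a time, polling load before each one.

    sample_load() -> (lock_pct, io_util_pct); compact_chunk(chunk) does the work.
    All four callables are hypothetical hooks supplied by the caller.
    """
    cfg = cfg or ThrottleConfig()
    window_start, in_window, done = now(), 0, 0
    for chunk in chunks:
        while True:
            # Start a fresh rate-limit window once an hour has elapsed.
            if now() - window_start >= 3600:
                window_start, in_window = now(), 0
            lock_pct, io_pct = sample_load()
            if not should_pause(lock_pct, io_pct, in_window, cfg):
                break
            sleep(poll_seconds)  # back off, then sample again
        compact_chunk(chunk)
        in_window += 1
        done += 1
    return done
```

Passing the sampler and clock in as parameters keeps the pause decision separate from where the numbers come from, so the same loop would work with either lock percentage or iostat output as the signal.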
| Comments |
| Comment by Kevin J. Rice [ 12/Aug/13 ] |
|
I would like to revise this case slightly. Base problem(s) to solve:
Optional work associated with this case: data verification.
I'd like to modify the above comments to say that it doesn't matter how this compaction is done, or in what order, as long as it's done with a low/background priority. |
| Comment by Kevin J. Rice [ 12/Aug/13 ] |
|
Note that the state of this process could be kept in a single variable containing the shard key we're concerned with. This state variable should be re-read at the end of processing every chunk, so that no restart of mongos/mongod/etc. is required to make it behave differently.
OPTIONAL FOLLOW-ON CASE: Perform deferred splitChunks. If the chunk is oversized, it should be split as it would have been normally. This is only a consideration because it is possible during high-load situations (e.g., mongorestore) to have a chunk that is larger than the standard 64 MB. Normally the chunk would have to be written to before the split-chunk logic is called, but if we already have the chunk in memory, we might as well perform any required splits.
OPTIONAL FOLLOW-ON CASE: Like the balancer, this compact command could be turned on and off for a set of time periods (so we could run it during specific overnight hours only). |
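A rough sketch of the re-read-after-every-chunk idea combined with the balancer-style time window might look like this in Python. Everything here is hypothetical: `in_active_window` and `compact_pass` are illustrative names, and the state is modelled as a dict with assumed `enabled`, `start`, and `stop` fields rather than the single shard-key variable mentioned above, since the ticket does not specify a format.

```python
from datetime import time


def in_active_window(now, start, stop):
    """True if `now` falls in [start, stop), allowing windows that wrap midnight."""
    if start <= stop:
        return start <= now < stop
    return now >= start or now < stop  # e.g. 22:00 to 06:00


def compact_pass(chunks, read_state, compact_chunk, clock):
    """Compact chunks in order, re-reading state after each one.

    read_state() returns a dict with assumed keys 'enabled', 'start', 'stop';
    compact_chunk(chunk) does the work; clock() returns a datetime.time.
    All three callables are hypothetical hooks supplied by the caller.
    """
    done = []
    state = read_state()
    for chunk in chunks:
        if not state["enabled"]:
            break
        if not in_active_window(clock(), state["start"], state["stop"]):
            break
        compact_chunk(chunk)
        done.append(chunk)
        # Re-read at the end of every chunk so changes apply without a restart.
        state = read_state()
    return done
```

Because the state is fetched again after each chunk, an operator changing the stored settings would see the new behaviour take effect within one chunk's processing time.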