[SERVER-30689] Add command to flush the WiredTiger cache Created: 16/Aug/17 Updated: 09/Jan/24 |
|
| Status: | Open |
| Project: | Core Server |
| Component/s: | WiredTiger |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | New Feature | Priority: | Minor - P4 |
| Reporter: | Dmitry Agranat | Assignee: | Alexander Gorrod |
| Resolution: | Unresolved | Votes: | 3 |
| Labels: | refinement | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Case: | (copied to CRM) | ||||||||
| Story Points: | 0 | ||||||||
| Description |
|
Many developers resort to bouncing the MongoDB instance between test runs in order to clear the WiredTiger cache. While this method will accomplish the goal of purging the cache, it is extremely inefficient. In addition, in some organizations, developers do not have the required grants to restart the server. |
| Comments |
| Comment by Alexander Gorrod [ 09/Jan/24 ] |
|
Sorry for the delayed response. This feature request is languishing because it represents a significant amount of engineering work and the use cases we've had reported generally won't be well served by the suggested change. From what I have seen, the primary goal is to reduce cycle time for testing and benchmarking environments. In such environments, reducing the amount of work done from restarting MongoDB to flushing content in the active cache is unlikely to achieve the desired outcome of a "clean slate" from which to run another iteration of testing/benchmarking. There are many holders of resources (including memory) outside of the WiredTiger cache, and those resources are likely to be meaningful to test and benchmark results. I'm hesitant to close out the ticket, since it is a valid feature request, and having an open Jira ticket gives a good place to find some context and get answers to questions about the functionality. If I have misunderstood a use case or benefit, please let me know, and we can explore enhancements that will improve the behavior of MongoDB. |
| Comment by Dhananjay Ghevde [ 28/Jul/23 ] |
|
I have a customer who is interested in this functionality. Can we have an update on this? Let me know whom I should follow up with if this is not the correct forum. |
| Comment by Alexander Gorrod [ 25/Feb/22 ] |
|
It feels like there might be some genuine new use cases here, so I'm going to assign this to the right backlog instead of to me. The Storage Engines team will be happy to chat about what use cases are being addressed here, and what we can do in WiredTiger to make them possible. |
| Comment by Eric Milkie [ 20/Jan/22 ] |
|
Now that we have table-import capability in WiredTiger, we might be able to implement this ticket for specific tables by doing the following in the MongoDB layer: |
| Comment by Mathias Stearn [ 20/Jan/22 ] |
|
alexander.gorrod, I'm hoping that you will reconsider this ticket with some additional information that should address your concerns. Of course, if the implementation complexity outweighs the benefit, that is another issue, but that didn't seem like it was your primary objection.
I know that you are concerned about misuse, but at the mongo layer we already have mechanisms to deal with this. The most important is that some commands can be tagged as "test only" so they are only available when the server is started with a special flag to indicate that it is in test mode. This includes things like setting failpoints where the whole purpose is to "break" the behavior of the system in order to test out hard to reach edge cases.
Not that it should be available in production anyway (see above), but I assume that this command would require admin permissions, just like the shutdown command (which is different from permission to restart the server, since that requires system permissions outside of MongoDB). As a possible exception, if we allowed purges at the level of tables, we could lower the permissions to be similar to dropping that table. I don't think you would object to someone who is allowed to shut down the server or drop a table being able to purge its cache.
We actually don't want those optimizations when benchmarking. We already have a problem with our benchmarks reflecting an "overly clean slate" of running, which is not the state we care about (with obvious exceptions like time to bring up a new host, or responsiveness on failover). In general, we really care about the steady-state performance of a system that has been up for a while. While purging the cache isn't exactly the same, it is closer than a clean start. It would really be ideal if we could easily emulate the cache being filled with "junk" that needs to be purged on demand while the test is running. But I assume that would be much more complicated to implement. |
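To make the "test only" gating described in this comment concrete, here is a minimal sketch in Python. It is a toy model, not the server's implementation: `CommandRegistry`, `CommandNotFound`, `build_registry`, and the command name `purgeCache` are all hypothetical names invented for illustration. The only idea taken from the comment is that a command tagged as test-only is hidden unless the process was started in test mode.

```python
class CommandNotFound(Exception):
    """Raised when a command is unknown or hidden in the current mode."""


class CommandRegistry:
    """Toy command table: test-only commands are visible only in test mode."""

    def __init__(self, test_mode=False):
        # test_mode stands in for the special startup flag mentioned above.
        self.test_mode = test_mode
        self._commands = {}

    def register(self, name, handler, test_only=False):
        self._commands[name] = (handler, test_only)

    def run(self, name):
        entry = self._commands.get(name)
        if entry is None:
            raise CommandNotFound(name)
        handler, test_only = entry
        # A hidden test-only command looks exactly like an unknown command.
        if test_only and not self.test_mode:
            raise CommandNotFound(name)
        return handler()


def build_registry(test_mode):
    registry = CommandRegistry(test_mode=test_mode)
    registry.register("ping", lambda: "ok")
    # Hypothetical command; no such command exists in the server today.
    registry.register("purgeCache", lambda: "cache purged", test_only=True)
    return registry
```

With this shape, a production-mode registry rejects `purgeCache` as if it did not exist, while a test-mode registry runs it. That mirrors the argument that a purge command cannot be misused on a production deployment.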
| Comment by Alexander Gorrod [ 28/Aug/17 ] |
|
dmitry.agranat This functionality adds very little value to users of MongoDB and would expose functionality that could easily cause major service disruption if misused. I don't think the value added by implementing this functionality outweighs the consequences of potential API misuse, so I'm going to close this ticket as "Won't fix". In response to the particular use case in the ticket:
There is no evidence that bouncing the MongoDB instance would be less efficient than purging the cache. In fact, there are some optimizations when shutting down MongoDB that mean it could be more efficient than a cache purge.
A user who does not have permission to restart the server should also not have permissions to purge the WiredTiger cache on a server. |
| Comment by Bruce Lucas (Inactive) [ 20/Aug/17 ] |
|
I wonder if flushing the cache is actually faster than restarting mongod, particularly for large caches - flushing the cache requires freeing a lot of small data structures. Also this won't necessarily leave the allocator in the same state it is in after a restart, and that potentially has an impact on performance. |
| Comment by Asya Kamsky [ 19/Aug/17 ] |
|
What is the thinking behind clearing the cache being more efficient? To prevent the cache from being "dirtied" while it's being flushed to disk, wouldn't the writes have to be stopped, basically making this no more efficient than a simple restart? |