[SERVER-30426] dropDatabase very slow due to repeated calls to closeAllCursors Created: 31/Jul/17 Updated: 30/Oct/23 Resolved: 21/May/18 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Storage |
| Affects Version/s: | 3.2.15, 3.4.6 |
| Fix Version/s: | 4.0.0-rc0 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Bruce Lucas (Inactive) | Assignee: | Backlog - Storage Execution Team |
| Resolution: | Fixed | Votes: | 29 |
| Labels: | 3.7BackgroundTask | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||||||||||||||||||||||
| Assigned Teams: |
Storage Execution
|
||||||||||||||||||||||||||||
| Backwards Compatibility: | Fully Compatible | ||||||||||||||||||||||||||||
| Operating System: | ALL | ||||||||||||||||||||||||||||
| Participants: | |||||||||||||||||||||||||||||
| Case: | (copied to CRM) | ||||||||||||||||||||||||||||
| Description |
|
Example:
Collecting perf data shows mongod using 100% of a CPU for the duration, all in this stack:
For each of the 200 WT tables dropped we call closeAllCursors for that table. The perf stacks show all the time is spent in closeAllCursors itself, and the ftdc metrics also show that we aren't actually closing any cursors, so it appears that all the time is spent by closeAllCursors scanning the list of 325k open cursors to find cursors with a matching table uri. |
| Comments |
| Comment by Eric Milkie [ 23/May/18 ] |
|
No, it takes effect immediately after the next cursor operation completes. |
| Comment by Michael [ 23/May/18 ] |
|
Does the mongo server need to be restarted in order for the new parameter value to take effect? |
| Comment by Eric Milkie [ 21/May/18 ] |
|
After |
| Comment by Eric Milkie [ 21/May/18 ] |
|
In the linked server ticket |
| Comment by Michael [ 21/May/18 ] |
@milkie, what is the parameter? |
| Comment by Eric Milkie [ 25/Oct/17 ] |
|
I forgot to mention, the server parameter could be backported to 3.4, but the new drop behavior will not be. |
| Comment by Eric Milkie [ 25/Oct/17 ] |
|
We're actively investigating a temporary fix for this issue that would involve a server parameter to optionally turn off cursor caching. This would result in a performance drop of up to 10%, but would completely avoid the drop issues and CPU consumption during collection and database drops. In addition, in version 3.6, we have enhanced the way dropDatabase works so that it effectively drops all collections in the database first, without holding the global lock, before dropping the now-empty database. This won't affect the CPU consumption but it will improve server availability by significantly reducing the amount of time the global lock is held. |
| Comment by Alexander Gorrod [ 23/Oct/17 ] |
|
The discussion in |
| Comment by Aaron Queen [ 20/Oct/17 ] |
|
I believe we're experiencing this on 3.4.9 as well. db.dropDatabase() on a roughly 2mb db takes minutes and essentially brings all dbs to a halt. Server has two processors, and we see close to 100% cpu on a single cpu where it just seems like something is spinning forever. I'd like to know if this will be fixed soon and backported to 3.4 as it affects our operations when dropping databases. Is the proposed workaround for now to drop each collection individually, then the empty db afterward? |
| Comment by David Henderson [ 22/Sep/17 ] |
|
Any progress on this issue - is it likely to hit a 3.4 release in the near future? |
| Comment by Bruce Lucas (Inactive) [ 30/Aug/17 ] |
|
This problem was introduced by |
| Comment by Richard Yates [ 31/Jul/17 ] |
|
Gotcha, thanks. I thought for second we had a serious open cursor issue! Let me/David know if we can be of further assistance with this one. – Richard. |
| Comment by Bruce Lucas (Inactive) [ 31/Jul/17 ] |
|
The 325k cursors here are internal: mongod maintains a cache of WT cursors to reduce the cost involved in creating and destroying WT cursors during normal operation. These cursors are WT cursors and are not the same thing as mongod client cursors, which is the number you are looking at. The size of that that cache grows to depends on factors like the number of open connections and the number of concurrent collections in use in the past, so it is expected that this number could be quite a bit larger than the number of mongodb client cursors currently open. |
| Comment by Richard Yates [ 31/Jul/17 ] |
|
Hi Bruce, A quick question if I may. The 325k open cursors, are those internal cursors or actual cursors from connections? We use mongodb cloud manager and it's saying the perticular server only has 2 open cursors so we're a little confused. – Richard. |