[SERVER-34713] Progressively declining dropDatabase performance Created: 27/Apr/18 Updated: 29/Oct/23 Resolved: 09/Jun/18 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Replication, Storage |
| Affects Version/s: | 3.6.3, 3.6.4 |
| Fix Version/s: | 3.6.9, 4.0.1, 4.1.1 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Bruce Lucas (Inactive) | Assignee: | Geert Bosch |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Backwards Compatibility: | Fully Compatible |
| Operating System: | ALL |
| Backport Requested: | v4.0, v3.6 |
| Sprint: | Storage NYC 2018-05-21, Storage NYC 2018-06-04, Storage NYC 2018-06-18 |
| Description |
Time required for dropDatabase progressively increases on each iteration, for example, with a 1-node replica set:
[Chart: dropDatabase time per iteration, 1-node replica set]
The performance also starts off worse than in 3.4.10. The problem does not reproduce on a standalone node. |
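A measurement loop of the kind described above might be sketched as follows. The original reproducer is not included in this export, so this is only an illustration under assumptions: `time_iterations` and `slope` are hypothetical helper names, and the operation being timed is passed in as a callable rather than tied to any particular driver.

```python
import time
from typing import Callable, List


def time_iterations(op: Callable[[], None], iterations: int) -> List[float]:
    """Run op repeatedly and return the wall-clock seconds each call took."""
    durations = []
    for _ in range(iterations):
        start = time.perf_counter()
        op()
        durations.append(time.perf_counter() - start)
    return durations


def slope(durations: List[float]) -> float:
    """Least-squares slope of duration vs. iteration index (seconds per iteration)."""
    n = len(durations)
    if n < 2:
        return 0.0
    mean_x = (n - 1) / 2
    mean_y = sum(durations) / n
    num = sum((i - mean_x) * (d - mean_y) for i, d in enumerate(durations))
    den = sum((i - mean_x) ** 2 for i in range(n))
    return num / den
```

With a driver such as PyMongo, `op` could be a function that creates a few collections and then calls `MongoClient.drop_database` on a test database. A clearly positive `slope` over many iterations would correspond to the progressive slowdown reported here.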
| Comments |
| Comment by Eric Milkie [ 02/Nov/18 ] |
|
Sorry for the delay; the code freeze for 3.6.9-rc0 is now scheduled for Monday, Nov. 5, so the production release of 3.6.9 should follow shortly afterwards. |
| Comment by Michael [ 01/Nov/18 ] |
|
@ramon.fernandez, I know this has already been merged into the 3.6 branch, but October has passed and there was no 3.6.9 release. Is there an estimate for when the 3.6.9 release will happen? |
| Comment by Githook User [ 24/Sep/18 ] |
|
Author: {'name': 'Geert Bosch', 'email': 'geert@mongodb.com', 'username': 'GeertBosch'} Message: (cherry picked from commit c7451c0e11c2a782e9c0dabe16cbad744e4c451a) Conflicts: WiredTigerBeginTxnBlock cherry picked from commit 6d2de545a7cfcf4ab23dcf73426a1d50896d6d0c |
| Comment by Ramon Fernandez Marina [ 27/Aug/18 ] |
|
Apologies for the late reply, mmillerick; the work to backport this fix to 3.6 is currently scheduled, and I'd expect it to become available sometime in October if there are no unexpected issues, but unfortunately I can't provide a more accurate estimate. Please note that the fix is in 4.0, so you can always consider an upgrade if this issue is problematic for you. Regards, |
| Comment by Michael [ 16/Aug/18 ] |
|
Is there an estimate for when this will be patched in 3.6? |
| Comment by Githook User [ 06/Jul/18 ] |
|
Author: {'username': 'GeertBosch', 'name': 'Geert Bosch', 'email': 'geert@mongodb.com'} Message: (cherry picked from commit c7451c0e11c2a782e9c0dabe16cbad744e4c451a) Conflicts: |
| Comment by Githook User [ 09/Jun/18 ] |
|
Author: {'username': 'GeertBosch', 'name': 'Geert Bosch', 'email': 'geert@mongodb.com'} Message: |
| Comment by Bruce Lucas (Inactive) [ 27/Apr/18 ] |
|
I opened |
| Comment by Judah Schvimer [ 27/Apr/18 ] |
|
I can't think of any replication-related changes between 3.6 and 3.7 that would make the first dropDatabase take longer. That sounds like an easy perf workload to add and profile. As for the fact that they keep taking longer, I don't think two-phase drop can explain it. dropDatabase waits for all collection drops to replication-commit and then does the physical database drop before returning success. There shouldn't be any of the database's data left when it returns, and the replication lag should be 0. I wonder if there are any storage resources that aren't getting cleaned up properly? FTDC data might provide some insight there. |
| Comment by Andy Schwerin [ 27/Apr/18 ] |
I'm not sure. Perhaps benety.goh or judah.schvimer can answer that question. |
| Comment by Bruce Lucas (Inactive) [ 27/Apr/18 ] |
|
Does that theory apply to the 1-node replica set used for the chart above? |
| Comment by Geert Bosch [ 27/Apr/18 ] |
|
This behavior of 3.6 vs 3.4 can be explained by an ever-increasing number of collections, because the second phase of the drop cannot keep up with the rate of drops. As the secondary lags a bit, the number of collections can increase. I'll investigate your reproducer and see if this hypothesis holds. -Geert |
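The hypothesis in the comment above can be illustrated with a toy backlog model. This is not the server's implementation: `pending_backlog` is a hypothetical function, and the rates are made-up numbers chosen only to show how a backlog grows when the second drop phase is slower than the rate of new drops.

```python
from typing import List


def pending_backlog(drops_per_tick: int, reaps_per_tick: int, ticks: int) -> List[int]:
    """Toy model: collections still awaiting the second (physical) drop phase after each tick."""
    pending = 0
    history = []
    for _ in range(ticks):
        pending += drops_per_tick                # phase 1: collections marked for drop
        pending -= min(pending, reaps_per_tick)  # phase 2: physical drops that completed
        history.append(pending)
    return history


# Phase 2 slower than phase 1: the backlog grows by 3 every tick.
print(pending_backlog(drops_per_tick=10, reaps_per_tick=7, ticks=4))  # [3, 6, 9, 12]
```

If the second phase keeps pace (for example `pending_backlog(5, 5, 3)`), the backlog stays at zero, which is the situation the hypothesis says 3.6 fails to maintain.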
| Comment by Bruce Lucas (Inactive) [ 27/Apr/18 ] |
|
schwerin, 3.7.5 shows a similar increase (with perhaps a smaller coefficient), but its starting value is roughly 2x larger than that of 3.6.4 (with a good bit more variability). Is this also expected, or is a separate ticket for that warranted? |
| Comment by Andy Schwerin [ 27/Apr/18 ] |
|
The slower initial performance is expected because of changes made to make replication rollback robust to drop and dropDatabase (two-phase drop). |