[SERVER-48522] Regression after mass deletion Created: 01/Jun/20 Updated: 28/Apr/23 |
|
| Status: | Open |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | 4.4.0-rc7 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Bruce Lucas (Inactive) | Assignee: | Sid Mahajan |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | KP44, dmd-escapes, perf-escapes, refinement | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Attachments: |
|
| Issue Links: |
|
| Operating System: | ALL | ||||||||||||||||||||||||||||
| Sprint: | Storage - Ra 2022-01-24, Storage - Ra 2022-02-07, Storage - Ra 2022-02-21, Storage - Ra 2022-03-07, Storage - Ra 2022-03-21, Storage - Ra 2022-06-13 | ||||||||||||||||||||||||||||
| Participants: | |||||||||||||||||||||||||||||
| Case: | (copied to CRM) | ||||||||||||||||||||||||||||
| Story Points: | 0 | ||||||||||||||||||||||||||||
| Description |
|
Test inserts 100k documents, removes them all, then looks for the (now non-existent) documents via the _id index. Left is 4.2.6, right is 4.4.0-rc7.
PMP shows most of our time here
This one also showed up once
Repro: 3-node replica set
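A minimal, single-threaded sketch of the workload described above; the original reproducer used ScopedThread against a 3-node replica set, and the database/collection names below are made up for illustration:

```javascript
// Insert 100k documents, remove them all, then look up the now non-existent _ids.
const coll = db.getSiblingDB("test").server48522;  // illustrative names
coll.drop();

// Insert phase.
let bulk = coll.initializeUnorderedBulkOp();
for (let i = 0; i < 100000; i++) {
    bulk.insert({_id: i, payload: "x"});
}
bulk.execute();

// Remove phase: delete everything.
coll.deleteMany({});

// Find phase: point lookups on the _id index for documents that no longer exist.
const start = Date.now();
for (let i = 0; i < 100000; i++) {
    coll.findOne({_id: i});
}
print("find phase: " + (Date.now() - start) + " ms");
```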
|
| Comments |
| Comment by Sid Mahajan [ 01/Feb/23 ] | ||||||||||||||||||
|
Fixing the root cause found for the linked ticket.

Summary: the ticket concerns slow search_near performance caused by a high number of deleted items and a large cache size. Lowering the cache size was found to greatly improve performance, but further investigation uncovered that pages with many deleted items were not being evicted. These pages were not being added to the urgent eviction queue because they were already in the ordinary queues, yet the eviction threads found the ordinary queues empty. This left the deleted pages in the cache, hindering search_near performance. The fix is to remove the relevant (non-candidate) pages from the queue, thereby improving search_near performance.

FTDC analysis, before and after.

Before:
After:
Thanks, etienne.petrel@mongodb.com, for running the tests. The opcounters query rate has noticeably increased after the changes were implemented.
| Comment by Will Korteland [ 15/Mar/22 ] | ||||||||||||||||||
|
> Why are you making the assumption that WT is not the cause?

I only assumed this for the find regression, since your earlier bisecting showed that came from MDB commit `06a03bc6ec`.

> Did you do this?

Yep, hence the comment that there's still a large regression here - the same 32% (roughly) that was originally observed, so it's definitely something to do with exhaust cursors.

> Back in Oct-2021 Alexander Gorrod moved this from replication to storage engines because:

I think there's some misunderstanding around what "this" is. My comment on March 8 refers solely to the find regression, but Alex moved the delete regression since that's definitely a WT issue (due to the addition of durable history).
| Comment by Will Korteland [ 08/Mar/22 ] | ||||||||||||||||||
|
An update on the read performance issue. Even with very recent MDB + WT, there's still a large regression here. Based on the above discussion, I started investigating with the assumption that WT was not the cause, so I went looking through MDB server code for what actually changes when exhaust cursors are turned on. I don't know the server code at all, so I limited it to looking at all the places that mention exhaust cursors, but there were still a lot of these - I found three places where the performance hit might be happening. The first is in bgsync.cpp, where there's an optimisation to avoid writing ops one at a time. I'm not sure if this applies for the regression we're looking at, but it's interesting that exhaust cursors are excluded. This might be fine, I'm just pointing it out since the motivation there isn't obvious to me:
The second is in strategy.cpp, where we have some code that parses a reply and creates a response. Again, this might be really cheap but it's extra work that isn't being done for non-exhaust cursors. If I knew the server code better I might be able to exclude this.
Finally, service_state_machine.cpp has a `makeExhaustMessage` function that seems to do quite a lot of work, including `OpMsg::appendChecksum(&exhaustMessage)`, which is probably expensive for longer messages. It's unclear how often this is called though. | ||||||||||||||||||
| Comment by Will Korteland [ 28/Feb/22 ] | ||||||||||||||||||
|
I'm picking this up after Monica and Alexey had to move on to something else, so here's my understanding of the situation to make sure I'm up to speed. There are two perf regressions here, both of which happened at around the same time. One turned out to be MDB (the "find" regression), the other turned out to be WT (the "remove" regression). The "find" regression went away when testing WT 4.4 with MDB 4.2, but the "remove" regression didn't.

One thing that I don't think has been explained yet is that, for this workload, we kind of expect to see the "remove" regression. The WT version that MDB 4.4 uses introduced durable history, and we're hitting something of a pathological case here with the "insert then immediately remove" pattern. After durable history, we keep these records in the data store for a few seconds, so when we search for the non-existent entry we actually have to do some work - we can't "short circuit" the search like we did previously. (I think this "few seconds" might be changing to "30 seconds" in an upcoming WT version, but don't quote me on that.)

That said, this is just an explanation of what's happening; it might have changed since Ger bisected the "find" regression to the MDB commit that started using exhaust cursors - I suspect that one's out of WT's hands since it's asking us to do more stuff. Confirming that is the next step for me - I plan to take a look at exactly what changes when using the exhaust cursor.
| Comment by Bruce Lucas (Inactive) [ 17/Feb/22 ] | ||||||||||||||||||
|
monica.ng, if I'm not mistaken, ScopedThread was renamed to Thread somewhere along the way.
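For reference, a minimal sketch of what the replacement might look like, assuming the Thread class from jstests/libs/parallelTester.js is the drop-in successor to ScopedThread; the load() path and the worker body below are illustrative, not taken from the original reproducer:

```javascript
// Old style:
//   const t = new ScopedThread(work, host);
// Newer shells (assumption: Thread is provided by parallelTester.js):
load("jstests/libs/parallelTester.js");

const work = function(host) {
    // Illustrative worker: point lookups on the _id index against the given host.
    const conn = new Mongo(host);
    const coll = conn.getDB("test").server48522;
    for (let i = 0; i < 100000; i++) {
        coll.findOne({_id: i});
    }
    return true;
};

const t = new Thread(work, db.getMongo().host);
t.start();
t.join();
```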
| Comment by Monica Ng [ 17/Feb/22 ] | ||||||||||||||||||
|
bruce.lucas dmitry.agranat The original reproducer uses the ScopedThread class, but this is not supported in the MongoDB shell anymore. Is there an equivalent class we can use instead? I have a reproducer script in Python, but I'm not sure it does exactly the same thing.
| Comment by Lingzhi Deng [ 09/Feb/22 ] | ||||||||||||||||||
|
monica.ng I haven't conducted the same investigation here. But yes, all you need to do is to turn oplogFetcherUsesExhaust off. | ||||||||||||||||||
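For anyone following along, a sketch of how that might be done, assuming oplogFetcherUsesExhaust can be set at mongod startup via --setParameter (whether it can also be changed at runtime is not confirmed here):

```javascript
// Start each node with the parameter disabled (assumed startup-settable):
//   mongod --replSet rs0 --setParameter oplogFetcherUsesExhaust=false ...
//
// Verify the current value from the shell:
db.adminCommand({getParameter: 1, oplogFetcherUsesExhaust: 1});
```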
| Comment by Monica Ng [ 09/Feb/22 ] | ||||||||||||||||||
|
lingzhi.deng I saw your comment here: https://jira.mongodb.org/browse/SERVER-48522?focusedCommentId=3700654&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-3700654 and wasn't sure if any investigation was conducted into this. If not, I would like to go down the path of examining the performance trends without the use of exhaust cursors for oplog fetching so that I can potentially isolate any regressions related to WiredTiger specifically. Is there a way to disable the use of exhaust cursors for oplog fetching? Can it be turned off by setting the "oplogFetcherUsesExhaust" parameter? Is there anything else I should be aware of? | ||||||||||||||||||
| Comment by Monica Ng [ 07/Feb/22 ] | ||||||||||||||||||
|
alexander.gorrod I've tried running the reproducer against the recent version of MongoDB which would include WiredTiger changes like | ||||||||||||||||||
| Comment by Monica Ng [ 28/Jan/22 ] | ||||||||||||||||||
|
I had a look into this and wanted to check if we were seeing similar trends in the metrics when running the reproducer against a newer version of MongoDB. I am still seeing a drop in the
We noticed that the
| ||||||||||||||||||
| Comment by Alexander Gorrod [ 13/Oct/21 ] | ||||||||||||||||||
|
Storage Engines team is going to pick this up and run the reproducer against the latest version of WiredTiger. We did some work in

If there is still an unreasonable performance regression, we will do some analysis to find the right owner.
| Comment by Kelsey Schubert [ 14/Dec/20 ] | ||||||||||||||||||
|
After discussing with Bruce, I think this would be a good ticket for the performance team to pick up: write a regression test and track down the remaining regression.
| Comment by Bruce Lucas (Inactive) [ 11/Jun/20 ] | ||||||||||||||||||
|
Thanks sulabh.mahajan. I've confirmed that, using the same shell in both cases, rc9 and 4.2.6 now show the same performance on the finds. However, I'm still seeing a regression in the removes, and it looks like WT is being asked to do more work for the same number of documents removed. Left is 4.2.6 and right is 4.4.0-rc9:
"cursor remove calls" is the actual work being done, and we're doing it at a slower rate in 4.4. But in addition to the actual work, we are doing a lot more cursor searches, wt transactions, wt set timestamps, etc., even though we are doing the actual work at a slower rate. As you suggest this means something has changed in the upper layers, but I'm not sure at this point whether it is a change in the integration layer or in the query layer, so I'll investigate some more before reassigning. | ||||||||||||||||||
| Comment by Sulabh Mahajan [ 10/Jun/20 ] | ||||||||||||||||||
|
bruce.lucas, my conclusion is that the latest 4.4 (rc8) shows much less regression than the initially reported values. Seems like
| Comment by Sulabh Mahajan [ 10/Jun/20 ] | ||||||||||||||||||
|
Update:
426 = MongoDB 4.2.6
All three versions show similar remove calls, with a slightly higher peak for 4.4.0. This could be because of the short burst nature of the removals, and possibly
4.2.6 gets a consistent query rate of 2800 op/s. I think there is something in MongoDB outside WiredTiger that might be restricting queries to 1900 op/s. I tried to look for reasons inside WiredTiger that could explain a drop from 2800 op/s to 1900 op/s for 4.4.0 but couldn't locate any.
| Comment by Sulabh Mahajan [ 09/Jun/20 ] | ||||||||||||||||||
|
dmitry.agranat, I have not identified any specific ticket.
| Comment by Bruce Lucas (Inactive) [ 06/Jun/20 ] | ||||||||||||||||||
|
I'm not sure - sounds promising, but I don't think the fixes you mentioned are in an rc yet. Here's what I'm seeing on (left to right) 4.2.6, rc7, rc8:
So on rc8 the query rate is much improved but still considerably lower than 4.2.6, and the remove rate still shows the same regression. I'll check again on rc9. | ||||||||||||||||||
| Comment by Sulabh Mahajan [ 05/Jun/20 ] | ||||||||||||||||||
|
bruce.lucas, this ticket reduces to focusing on why the query rate is lower in 4.4.0. Do you agree? Other parameters seem comparable to me. | ||||||||||||||||||
| Comment by Sulabh Mahajan [ 04/Jun/20 ] | ||||||||||||||||||
|
I ran the reproducer script on a 3 node replica set, on:

The eviction state is much better, with reduced forced eviction because of the deleted count. This is likely fixed from
The cursor remove calls appear to be comparable to 4.2.6
On zooming closer, here are some observations:
| ||||||||||||||||||
| Comment by Bruce Lucas (Inactive) [ 02/Jun/20 ] | ||||||||||||||||||
|
Couple of observations:
So it appears this has something to do with the handling of timestamps. | ||||||||||||||||||
| Comment by Brian Lane [ 02/Jun/20 ] | ||||||||||||||||||
|
Hey alexander.gorrod - yeah, I set it to 4.4.0 and we discussed it in planning. We linked a couple of related issues we want in first, then we'll bring this one in. I put it at the top of the backlog.
| Comment by Alexander Gorrod [ 02/Jun/20 ] | ||||||||||||||||||
|
brian.lane did you see this ticket? We should get it prioritized when we can. | ||||||||||||||||||
| Comment by Bruce Lucas (Inactive) [ 01/Jun/20 ] | ||||||||||||||||||
|
Thanks for the confirmation. | ||||||||||||||||||
| Comment by Louis Williams [ 01/Jun/20 ] | ||||||||||||||||||
|
bruce.lucas, | ||||||||||||||||||
| Comment by Bruce Lucas (Inactive) [ 01/Jun/20 ] | ||||||||||||||||||
|
louis.williams I noticed that |