[SERVER-75052] 5.0 shard replica startup takes much longer than 4.x due to WT checkpoint Created: 20/Mar/23 Updated: 08/Sep/23 Resolved: 08/Sep/23 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | 5.0.15 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Ian Springer | Assignee: | Yuan Fang |
| Resolution: | Done | Votes: | 0 |
| Labels: | regression, slow, startup | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Operating System: | ALL |
| Steps To Reproduce: | Run a 5.0 sharded cluster with many databases and collections, and heavy read and write volume. |
| Participants: |
| Description |
|
In 4.x, our shard replicas consistently took around 2 minutes to start up. In 5.0, they are taking anywhere from 3.5 minutes to 12 minutes to start up. The additional startup time is primarily consumed by a WT checkpoint. It appears that 5.0 performs a checkpoint as part of the startup process, whereas 4.x waited until after startup was complete to take the first checkpoint.
Here's an abridged example of a 4.2 startup:
Notice that the startup takes less than 2 minutes, and the first checkpoint runs a couple of minutes later.
And here's an abridged example of a 5.0 startup on the exact same machine:
Notice that the startup takes over 8 minutes, and over 7 minutes of that is spent performing a checkpoint. An 8-12 minute startup time results in a 10-14 minute restart time, which is unacceptable: it makes rolling restarts for upgrades and other maintenance take 5-7 times longer, and, even more importantly, it prevents us from dealing expediently with cluster availability and performance issues. Is it viable to switch back to the 4.x behavior of waiting until startup is complete to perform the first WT checkpoint?
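For reference, the checkpoint timing can be pulled out of the 5.0 (structured JSON) mongod log with something like the sketch below and compared against the startup window. This is only an illustration: it assumes the 4.4+ JSON log format, simply matches any message that mentions "checkpoint", and the log path is just an example.

#!/usr/bin/env python3
# Rough sketch (not an official tool): scan a mongod structured JSON log
# (4.4+ format) for messages that mention "checkpoint" and print their
# timestamps, so the first checkpoint can be compared against the startup
# window. The log path below is only an example.
import json
import sys

def checkpoint_events(path):
    with open(path, encoding="utf-8") as log:
        for line in log:
            try:
                entry = json.loads(line)
            except json.JSONDecodeError:
                continue  # ignore any non-JSON lines
            text = entry.get("msg", "") + json.dumps(entry.get("attr", {}))
            if "checkpoint" in text.lower():
                # "t" holds the timestamp as {"$date": "..."} in the JSON log format
                yield entry.get("t", {}).get("$date"), entry.get("msg")

if __name__ == "__main__":
    path = sys.argv[1] if len(sys.argv) > 1 else "mongod.log"  # example path
    for ts, msg in checkpoint_events(path):
        print(ts, msg)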
|
| Comments |
| Comment by Yuan Fang [ 08/Sep/23 ] |
|
We haven’t heard back from you for some time, so I’m going to close this ticket. If this is still an issue for you, please provide additional information and we will reopen the ticket. |
| Comment by Yuan Fang [ 15/Aug/23 ] |
|
Hi ian.springer@salesforce.com, we still need additional information to diagnose the problem. If this is still an issue for you, would you please upload the FTDC and mongod logs that cover the incident of the lengthy startup to this new upload portal? Thank you. |
| Comment by Yuan Fang [ 19/Jul/23 ] |
|
Hi ian.springer@salesforce.com, I apologize for the delay in getting back to you. I understand that you were unable to upload diagnostic data for version 4.2. If the issue of long startup persists in version 5.0, could you please upload the FTDC and mongod logs that cover the incident of the lengthy startup? Since the previous upload portal has expired, I have created a new upload portal for you. Thank you very much for your understanding and support. Regards, |
| Comment by Ian Springer [ 20/Mar/23 ] |
|
Hi Yuan,
It is true that the checkpoints are taking longer on 5.0 (and the level of activity for the cluster has increased by about 60% from the time of the 4.2 logs to the present). However, it appears the reason mongod startup times are much longer in 5.0 is that the initial checkpoint is run as part of startup, versus after startup in 4.x.
Regarding uploading the logs, we would need to redact all customer data from them, which would be a huge task, since the logs contain actual queries, as well as database and collection names that include the names of our customers. Do you happen to have a utility for redacting all such information from Mongo logs that I could use prior to uploading them?
It might be easier if you could tell me the specific logs I could provide that would help you diagnose the issue. Can you do that?
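To give a sense of what redaction would involve on our side: a rough pass over the 5.0 structured logs might simply drop the "attr" payloads (which carry the queries and namespaces) and keep only timestamps, severity, component, and message text, along the lines of the sketch below. This is illustration only, assuming the 4.4+ JSON log format, not something we actually run.

#!/usr/bin/env python3
# Illustration only: strip the "attr" payloads (which carry queries and
# namespaces) from a mongod structured JSON log, keeping timestamps,
# severity, component, id, context, and message text. Assumes the 4.4+
# JSON log format; reads from stdin and writes to stdout.
import json
import sys

KEEP = ("t", "s", "c", "id", "ctx", "msg")

for line in sys.stdin:
    try:
        entry = json.loads(line)
    except json.JSONDecodeError:
        continue  # drop anything unparseable rather than risk leaking it
    print(json.dumps({k: entry[k] for k in KEEP if k in entry}))

(Usage would be along the lines of: python3 redact.py < mongod.log > mongod.redacted.log.)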
Regarding diagnostic metrics, I can upload those for 5.0 if they do not contain any customer data (I will check on that). I will not be able to upload diagnostic metrics for 4.2, since we only retain diagnostic.data files for a week. I could potentially provide diagnostic metrics from another of our clusters that is still running 4.4, but that would be an apples-to-oranges comparison, since the clusters have different numbers of databases and collections, overall data sizes, and shard counts.
Thanks! Ian |
| Comment by Yuan Fang [ 20/Mar/23 ] |
|
Hi ian.springer@salesforce.com, Thank you for your report. I understand that you observed longer-running checkpoints at a v5.0 startup compared to a v4.2 startup. I have observed similar issues in the past that were caused by changes in the workload. To determine the reason for the longer checkpoints in v5.0 and to answer the question at the end of your description, we require additional information. I've created a secure upload portal for you. Files uploaded to this portal are hosted on Box, are visible only to MongoDB employees, and are routinely deleted after some time. For the affected node, spanning a time period that includes the incidents,
would you please archive (tar or zip) the mongod logs and the diagnostic.data (FTDC) directory and upload them to that link?
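For convenience, something along the lines of the sketch below would bundle them into a single archive (the paths are placeholders for your environment):

#!/usr/bin/env python3
# Convenience sketch: bundle the mongod log and the diagnostic.data (FTDC)
# directory into one tar.gz for upload. The paths below are placeholders
# for your environment.
import tarfile

INPUTS = ["/var/log/mongodb/mongod.log", "/data/db/diagnostic.data"]  # placeholders

with tarfile.open("mongod-diagnostics.tar.gz", "w:gz") as archive:
    for path in INPUTS:
        archive.add(path)  # directories are added recursively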
Regards,
|