[SERVER-23433] High acquire time for Global Intent Shared lock using WiredTiger Created: 30/Mar/16 Updated: 24/Aug/17 Resolved: 17/May/16
| Status: | Closed |
| Project: | Core Server |
| Component/s: | WiredTiger |
| Affects Version/s: | 3.2.3 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Greg Murphy | Assignee: | Unassigned |
| Resolution: | Duplicate | Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Attachments: | |
| Issue Links: | |
| Operating System: | ALL |
| Participants: | |
| Case: | (copied to CRM) |
| Description |
I'm seeing a number of queries that are experiencing very high execution times, almost entirely due to timeAcquiringMicros against the Global Intent Shared lock. Some examples of the entries from the logs (server, db and collection names anonymised):
We're seeing thousands of these per day, from what I can see always on secondary servers, but in this environment we perform the majority of our queries against the secondaries. (It's an analytics processing platform, so it is write-heavy on the master and it doesn't matter if queries are slightly stale.) When the slow queries happen, there are often a number of them clustered together during the same or subsequent seconds, which makes me wonder if this is a WiredTiger checkpoint? There are a lot of databases in this replica set, meaning a significant number of files to checkpoint. Environment details: |
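For anyone triaging similar symptoms, one way to quantify time spent acquiring the Global intent-shared lock is to sample the cumulative counters that serverStatus exposes. A minimal sketch, assuming pymongo; the host name and the 10-second sampling window are placeholders, not part of the original report:

```python
# Sample the cumulative Global lock acquisition counters from serverStatus and
# report the delta over a short window. Assumes pymongo; "secondary-host" is a
# placeholder for the affected secondary.
import time
from pymongo import MongoClient

client = MongoClient("mongodb://secondary-host:27017/")

def global_is_acquiring_micros():
    status = client.admin.command("serverStatus")
    # locks.Global.timeAcquiringMicros is cumulative; mode "r" is Intent Shared (IS).
    return status["locks"]["Global"]["timeAcquiringMicros"].get("r", 0)

before = global_is_acquiring_micros()
time.sleep(10)
after = global_is_acquiring_micros()
print("Global IS lock acquire time over 10s: %d us" % (after - before))
```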
| Comments |
| Comment by Kelsey Schubert [ 17/May/16 ] |
Hi gregmurphy, Thank you for the detailed workload description. The issue here is that the checkpoint handle-acquire phase will block opening a cursor on any table if it is the first time that table is being opened in a particular WiredTiger session. When there is a smaller number of collections/indexes, MongoDB caches cursors to all tables, and thus operations generally do not need to open tables. However, the size of those caches is limited (to keep them efficient and avoid memory overhead), which leads to apply ops needing to open tables. We are working on improvements in this area; please feel free to vote for the related ticket to follow its progress. Kind regards, |
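Because the cursor caches mentioned above are bounded, one signal that operations are opening tables rather than reusing cached cursors is the WiredTiger session statistics in serverStatus. A minimal sketch, assuming pymongo and a placeholder host name:

```python
# Check how many WiredTiger sessions and cursors are currently open; a cursor
# count spread across very many tables suggests cached cursors cannot cover the
# working set. Assumes pymongo; "secondary-host" is a placeholder.
from pymongo import MongoClient

client = MongoClient("mongodb://secondary-host:27017/")
session_stats = client.admin.command("serverStatus")["wiredTiger"]["session"]

print("open session count:", session_stats["open session count"])
print("open cursor count: ", session_stats["open cursor count"])
```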
| Comment by Greg Murphy [ 18/Apr/16 ] |
Sure - it's easier to describe narratively rather than answering the questions in order, but hopefully this covers everything: This particular MongoDB cluster contains analytics data for our multi-tenant platform, with a database per customer (4,728 of those at the moment). Depending on the data generated by that customer, each database can contain up to 55 collections and 58 indexes. The data ingested into the database is generated every 10 minutes and ingested as close to generation as possible. Not all customers continually generate data, so some databases are more active than others. During the ingestion process (which is parallelised with 7 processes running concurrently, each one working on a subset of databases) the current data state is read into memory by the process from the secondary MongoDB nodes. The ingestion process then merges this with the incoming data and persists the results back to MongoDB before performing some aggregations and persisting the results of them. This persistence results in updates/inserts happening against the majority of the collections in the database being processed, but as these are aggregated results the number of documents affected is low. New databases, collections and indexes are only created when new customers arrive. In the last 8 hours there have been 58 new databases created, so not a high frequency of object creations. It's difficult to say precisely how many collections and indexes are actively in use at any time, as it depends on the processing point that each ingestion process is at. End-user queries (from a web portal) can also hit this cluster. Hope that helps, please let me know if I've missed anything. |
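To put that scale in perspective, each collection and index is a separate WiredTiger table (and therefore a separate file and data handle), so a rough total can be counted across all databases. A sketch assuming a recent pymongo and a placeholder host name; with thousands of databases this issues many listCollections/listIndexes commands and can take a while, so it is best run off-peak:

```python
# Rough count of WiredTiger tables (collections + indexes) across the deployment.
# Assumes pymongo >= 3.6; "mongodb-host" is a placeholder.
from pymongo import MongoClient

client = MongoClient("mongodb://mongodb-host:27017/")

collections = 0
indexes = 0
for db_name in client.list_database_names():
    db = client[db_name]
    for coll_name in db.list_collection_names():
        collections += 1
        # index_information() issues one listIndexes per collection.
        indexes += len(db[coll_name].index_information())

print("collections: %d, indexes: %d, total tables: %d"
      % (collections, indexes, collections + indexes))
```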
| Comment by Kelsey Schubert [ 18/Apr/16 ] |
Hi gregmurphy, The number of connection data handles that are currently active (around 200 thousand) is very large and likely explains the issue you are experiencing. When WiredTiger creates a checkpoint it needs to look at each open table (collection or index). To ensure an unchanging set of tables, it takes a lock during this process, so other operations can be blocked while a checkpoint is in progress. It is worth noting that the code that gathers these handles was modified in the 3.2.5 release, which explains the difference in the behavior you observed. Please see the attached FTDC graph from your node on MongoDB 3.2.5. We can see that the spikes in read lock acquisition time and the FTDC dropouts are aligned with when checkpoints start.
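The data-handle count referred to here can be tracked over time from the WiredTiger section of serverStatus. A minimal sketch, assuming pymongo, a placeholder host name, and that the statistic appears under the "data-handle" group as in recent server versions:

```python
# Read the WiredTiger data-handle count exposed by serverStatus.
# Assumes pymongo; "secondary-host" is a placeholder. The statistic name is the
# one reported in the "data-handle" group of the wiredTiger section.
from pymongo import MongoClient

client = MongoClient("mongodb://secondary-host:27017/")
wt = client.admin.command("serverStatus")["wiredTiger"]
handles = wt["data-handle"]["connection data handles currently active"]
print("connection data handles currently active:", handles)
```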
To continue to investigate, can you please describe some details about your workload? In particular please be sure to answer the following questions:
Thank you, |
| Comment by Greg Murphy [ 18/Apr/16 ] |
Report attached as requested. For the purpose of this test, we upgraded a single secondary to 3.2.5 again and ran perf against that secondary. |
| Comment by Kelsey Schubert [ 15/Apr/16 ] |
Hi gregmurphy, Can you please upload the perf report while you are experiencing this issue by following the steps below?
Thank you, |
| Comment by Greg Murphy [ 15/Apr/16 ] |
Additionally, performance actually seems worse under 3.2.5 - initial analysis of the slow log is showing a more predictable mean response time for very slow queries (>60s) but a significant increase in their frequency. In all the examples I've looked at, almost all of the response time is still spent waiting for the global lock. Interestingly, average checkpoint time appears to have gone down. We will revert to 3.2.4 to stabilise the system, but I will capture all of the diagnostic.data files from each node beforehand and upload them. Please let me know if there is any further debugging I can enable. |
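Checkpoint durations like the ones mentioned above can be compared before and after an upgrade using the WiredTiger transaction statistics in serverStatus. A minimal sketch, assuming pymongo, a placeholder host name, and the statistic names as reported in the "transaction" group:

```python
# Report checkpoint counts and durations from the WiredTiger "transaction"
# statistics in serverStatus. Assumes pymongo; "secondary-host" is a placeholder.
from pymongo import MongoClient

client = MongoClient("mongodb://secondary-host:27017/")
txn = client.admin.command("serverStatus")["wiredTiger"]["transaction"]

print("checkpoints run:     ", txn["transaction checkpoints"])
print("most recent (msecs): ", txn["transaction checkpoint most recent time (msecs)"])
print("max (msecs):         ", txn["transaction checkpoint max time (msecs)"])
print("total (msecs):       ", txn["transaction checkpoint total time (msecs)"])
```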
| Comment by Greg Murphy [ 14/Apr/16 ] |
OK - the file is uploading to the same secure URL you provided earlier in the ticket. Thanks Greg |
| Comment by Kelsey Schubert [ 14/Apr/16 ] |
Hi gregmurphy, Sorry you are still experiencing issues; please upload the latest diagnostic.data and we will continue to investigate. Thank you, |
| Comment by Greg Murphy [ 14/Apr/16 ] |
We upgraded one of the secondaries today, but unfortunately are still seeing the same behaviour on it. Examples:
Would the updated diagnostic.data files help? |
| Comment by Greg Murphy [ 13/Apr/16 ] |
Thanks - much appreciated - we'll upgrade as soon as the release is out and confirm. |
| Comment by Kelsey Schubert [ 12/Apr/16 ] |
Hi gregmurphy, I have examined the diagnostic data you have provided. You are correct that this issue is correlated with WiredTiger checkpoints. In particular, this issue may be related to a lock WiredTiger takes during checkpoints that will block workloads that create or drop collections or indexes. Please upgrade to MongoDB 3.2.5 when it is released and confirm whether the issue is resolved. Thank you, |
| Comment by Greg Murphy [ 31/Mar/16 ] |
Thanks Thomas, I've just uploaded the archives from both secondaries. |
| Comment by Kelsey Schubert [ 31/Mar/16 ] |
Hi gregmurphy, I have created a secure upload portal for you to use. Files you upload to this portal will only be visible to MongoDB staff investigating this ticket. Kind regards, |
| Comment by Greg Murphy [ 31/Mar/16 ] |
Thanks for the quick reply - can you give me a private upload link so that I don't have to attach all the data to the ticket please? (I've got it ready to send when you do). |
| Comment by Ramon Fernandez Marina [ 30/Mar/16 ] |
gregmurphy, it is possible that the large number of files is indeed slowing things down. Can you please archive and upload the contents of the diagnostic.data directory for the affected secondaries? Seeing the full logs would also help determine if the issue is related to the number of files or if it's something else. Thanks, |
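A small sketch for archiving the diagnostic.data directory before upload, assuming Python is available on the node; the dbPath below is a placeholder and should be adjusted to the node's actual dbPath:

```python
# Archive a node's diagnostic.data directory into a .tar.gz for upload.
# The dbPath below is a placeholder; use the dbPath from the node's config.
import tarfile

dbpath = "/var/lib/mongodb"              # placeholder dbPath
src = dbpath + "/diagnostic.data"
out = "diagnostic-data.tar.gz"

with tarfile.open(out, "w:gz") as tar:
    tar.add(src, arcname="diagnostic.data")

print("wrote", out)
```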