[SERVER-24514] Global IX Lock for more than 4 minutes on Mongo 3.0.11 with wiredTiger Created: 10/Jun/16 Updated: 21/Jun/17 Resolved: 24/May/17 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | WiredTiger |
| Affects Version/s: | 3.0.11 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Abhishek Amberkar | Assignee: | Kelsey Schubert |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Attachments: |
|
| Issue Links: |
|
| Participants: | |
| Description |
|
Hello All,

The setup

The data

The plan

The plan was to roll out 3.0 on all secondaries if latencies and rps stayed close to the production-like pattern.

The observation

In the process of understanding what exactly is happening, I ran db.currentOp() on that node, where I found the following.
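As a rough illustration of that check, filters along these lines pull out the operations of interest from db.currentOp() (a sketch using documented 3.0 filter fields, not the exact commands or output from this investigation):

```bash
# Operations currently blocked waiting on a lock (e.g. a query stuck behind an
# intent-exclusive lock held by replication).
mongo --quiet --eval 'printjson(db.currentOp({ waitingForLock: true }))'

# The repl writer worker threads themselves, matched by their thread description.
mongo --quiet --eval 'printjson(db.currentOp({ desc: /repl writer worker/ }))'
```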
In the currentOp output, two operations are "in progress". The first is a query that has been "waiting for lock" for a whopping 255 seconds, while the other is a "repl writer worker thread" that has likewise been running for 255 seconds and is holding the Global, Database, and Collection level locks in "w" (Intent Exclusive) mode. From the docs here, https://docs.mongodb.org/v3.0/faq/concurrency/#how-does-concurrency-affect-secondaries, I understand that
That explains why the lock is present, but I have yet to understand why it is taking more than 4 minutes to finish. I would appreciate it if you could point me in the right direction to understand what exactly is happening here. Graphs attached. |
| Comments |
| Comment by Kelsey Schubert [ 24/May/17 ] | ||
|
I've posted a suggested workaround on
Since this issue has likely been addressed in MongoDB 3.2, I'm going to resolve this ticket. Unfortunately, it is not feasible to backport these improvements to MongoDB 3.0. For support upgrading, please see our Technical Support page. Thank you, | ||
| Comment by Abhishek Amberkar [ 26/Feb/17 ] | ||
|
Hi Thomas, As mentioned in the description, the goal is to move to 3.2, but at present we are running 2.6, and as per the docs one cannot jump directly from 2.6 to 3.2 without taking the intermediate 3.0.x step. | ||
| Comment by Kelsey Schubert [ 21/Feb/17 ] | ||
|
Sorry for letting this ticket slip through the cracks. We've made significant improvements to WiredTiger in MongoDB 3.2. Would you please let us know if upgrading to MongoDB 3.2 resolves the issue? Thank you, | ||
| Comment by Abhishek Amberkar [ 24/Aug/16 ] | ||
|
Hello Thomas, I am using flashback to replay traffic from a production secondary. The read traffic is distributed across all the secondaries except the one running 3.0.11. On the source node (2.6.x) I use tcpdump + socat to send the traffic to the target node.
On the target node (3.0.11) I use pcap_convertor + flashback to replay the traffic.
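Roughly, the capture-and-forward side of such a pipeline could look like the sketch below; the interface, port, hostnames, and paths are placeholders, and the pcap_convertor/flashback invocations are left as comments because their exact flags depend on the flashback build in use.

```bash
# Source node (2.6.x): capture MongoDB wire traffic and stream the raw pcap
# to the target box (interface, port, and hostname are placeholders).
tcpdump -i eth0 -s 0 -w - 'tcp port 27017' | socat -u - TCP:target-host:8888

# Target node (3.0.11): receive the stream and write it to disk.
socat -u TCP-LISTEN:8888,reuseaddr - > /data/capture.pcap

# Then convert the pcap into flashback's ops format and replay it against the
# local mongod, e.g. (illustrative placeholders, flags elided):
#   pcap_convertor ...   # convert /data/capture.pcap into replayable ops
#   flashback ...        # replay the ops against the 3.0.11 node
```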
| ||
| Comment by Kelsey Schubert [ 19/Aug/16 ] | ||
|
Unfortunately, we have not yet determined the cause of this behavior. So we can continue to investigate, would you be able to provide a script that generates a workload that reproduces this issue? A reproduction script would greatly assist our investigation, and we would be able to use it to benchmark improvements in subsequent versions. Thank you for your help, | ||
| Comment by Abhishek Amberkar [ 20/Jun/16 ] | ||
|
I have also observed exactly the same pattern with the default WT cache size. | ||
| Comment by Abhishek Amberkar [ 20/Jun/16 ] | ||
|
Hello Ramón, Attached are the iostat and serverStatus outputs as requested. Both /dev/xvdb and /dev/xvdcb form the /dev/bcache0 device, which I am using as the data volume. Ignore /dev/xvdca, which is not being used. | ||
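For reference, diagnostics of this kind can be gathered with something like the following (a sketch; the sampling interval and output paths are arbitrary choices, not the ones used for the attached files):

```bash
# Extended per-device I/O statistics with timestamps, one sample per second.
iostat -x -t 1 > iostat.out

# Point-in-time serverStatus snapshot (locks, WiredTiger cache, etc.).
mongo --quiet --eval 'printjson(db.serverStatus())' > serverStatus.out
```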
| Comment by Abhishek Amberkar [ 14/Jun/16 ] | ||
|
Hello Ramón, Yes, I am going ahead with the 1st option for now. I have already configured the cache size to the default and am waiting for the node to catch up on the 18-hour oplog lag. Once that is done, I will generate those files and share them with you. In fact, I have already created those files with the 100GB cache size, so I will share both the before and after outputs. | ||
| Comment by Ramon Fernandez Marina [ 14/Jun/16 ] | ||
|
abhishek.amberkar, you're right that you need to upgrade to 3.0 before you can move to 3.2. Thanks for the additional screenshot, but unless you send us the data requested above there's not much more we can do to investigate this behavior. Have you considered which option you'd like to pursue? Have you tried setting the WiredTiger cache size to the default? Thanks, | ||
| Comment by Abhishek Amberkar [ 14/Jun/16 ] | ||
|
Hello Ramón, I have added one more screenshot to show you the behavior of the cache after the 3.0.12 upgrade. | ||
| Comment by Abhishek Amberkar [ 14/Jun/16 ] | ||
|
Hello Ramón, Just wanted to update you on the earlier setup. After upgrading to 3.0.12, the problem became much worse. For the first few hours it showed the same behavior as before, which is when I last updated you. Right now there appears to be some kind of deadlock, and the node is answering queries very slowly: lots of repl writer worker threads are acquiring the Global IX lock (like the one posted in the description above), and the oplog is lagging by 18 hours. | ||
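As an aside, replication lag of this sort can be confirmed from the mongo shell with the standard helper below (a sketch, not output from this node):

```bash
# Show how far each secondary's last applied oplog entry is behind the primary.
mongo --quiet --eval 'rs.printSlaveReplicationInfo()'
```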
| Comment by Abhishek Amberkar [ 14/Jun/16 ] | ||
|
Hello Ramón, Thank you. I will get back to you with the required details. As far as the move to 3.2.x is concerned, I understand from the docs that I cannot move directly from 2.6.x to 3.2.x; I have to upgrade to 3.0.x first. Since that is the case, I will have to upgrade all my secondaries to 3.0.x one by one and then do the switch, and then repeat the same procedure to upgrade from there to 3.2.x. This way I will be able to do the switch with close to no downtime. Let me know if you know of a better approach that involves no downtime. | ||
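For what it's worth, that rolling-upgrade sequence corresponds roughly to the sketch below (hostnames are placeholders; each secondary is upgraded in turn, and the primary is stepped down and upgraded last):

```bash
# For each secondary, one at a time:
# 1. Cleanly shut down the mongod.
mongo --host secondary1.example.com --eval 'db.getSiblingDB("admin").shutdownServer()'
# 2. Swap the 2.6.x binaries for 3.0.x and restart with the same dbpath and config.
# 3. Confirm the member is back in SECONDARY state (myState == 2) and has caught
#    up before moving on to the next one.
mongo --host secondary1.example.com --eval 'rs.status().myState'

# After all secondaries run 3.0.x, step down the primary so a 3.0.x member is
# elected, then upgrade the former primary the same way.
mongo --host primary.example.com --eval 'rs.stepDown(60)'
```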
| Comment by Ramon Fernandez Marina [ 13/Jun/16 ] | ||
|
Thanks for verifying this, abhishek.amberkar. Here are some options we can explore:
I'd like to add that, from the basic details of your setup, this may be tied to a large WiredTiger cache size. We do not recommend making the cache size larger than the default, as it may cause issues in certain workloads, so the simplest thing to try would be to use the default cache size and, if the problem is still there, explore one of the options above. Thanks, | ||
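For reference, on 3.0 the WiredTiger cache size is controlled by --wiredTigerCacheSizeGB (or storage.wiredTiger.engineConfig.cacheSizeGB in the config file), and omitting the setting leaves the default in place. A minimal sketch with placeholder paths:

```bash
# Explicitly sized cache, as in the original 100 GB setup (dbpath is a placeholder):
mongod --dbpath /data/db --replSet rs0 --storageEngine wiredTiger --wiredTigerCacheSizeGB 100

# Recommended here: drop the flag so mongod falls back to the default cache size.
mongod --dbpath /data/db --replSet rs0 --storageEngine wiredTiger
```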
| Comment by Abhishek Amberkar [ 13/Jun/16 ] | ||
|
@Ramon, As suggested, I switched to 3.0.12, but the problem remains the same. I am observing the same patterns in the graph. Let me know if you need any more information. Thank you. | ||
| Comment by Abhishek Amberkar [ 13/Jun/16 ] | ||
|
@Ramon Since I am running 2.6.x right now, I can't directly jump to 3.2.x. But I can definitely switch to 3.0.12. I will update here if the same behavior is observed. Thank you. | ||
| Comment by Ramon Fernandez Marina [ 10/Jun/16 ] | ||
|
Thanks for the detailed report, abhishek.amberkar. As Michael mentioned, this could be
Could you please upgrade to 3.0.12 and try to reproduce this behavior? Alternatively, you could upgrade to 3.2.7, which is our latest stable release at this time and also contains a fix for
Thanks, |