[SERVER-19136] oplog seems to slow down everything Created: 25/Jun/15  Updated: 30/Jul/15  Resolved: 30/Jul/15

Status: Closed
Project: Core Server
Component/s: Performance, WiredTiger
Affects Version/s: 3.0.1, 3.0.3
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Kay Agahd Assignee: Ramon Fernandez Marina
Resolution: Duplicate Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: JPEG File _25-06-2015_19-14.jpeg     JPEG File _25-06-2015_19-15.jpeg    
Issue Links:
Duplicate
duplicates SERVER-18875 Oplog performance on WT degrades over... Closed
Operating System: ALL
Participants:

 Description   

We are running MongoDB v3.0.1 and v3.0.3 on two different replSets, each consisting of 3 physical members. Both use WiredTiger with default settings. After some weeks of normal operation, we observed that queries became significantly slower.

By analyzing the log files with mlogvis, we found an unusually high number of operations on the oplog, which had not been there before the slowdown. The longest oplog queries took up to 12 seconds!
Our "solution" was to stepDown the primary, which immediately brought the replSet back to normal speed. 90 minutes later we elected the old (slow) primary again to see whether the problem was reproducible. So far everything is running as fast as usual.

Please see the two attached screenshots produced by mlogvis. In the first you can clearly see the long-running queries on local.oplog.rs, the stepDown and the re-election. The second shows the analysis of yesterday's log file, when everything was still normal.
We can provide the log files if they are kept confidential.
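
For readers without mlogvis, the kind of check described above (finding slow operations on local.oplog.rs in a mongod log) can be roughly approximated with a few lines of Python. This is only a sketch under assumptions: it assumes each slow-operation log line contains the namespace and ends with a duration token of the form "<N>ms", and the function name, threshold and sample lines are hypothetical, not taken from the reporter's logs.

```python
import re

# Assumption: a slow-operation log line ends with its duration as "<N>ms".
DURATION_RE = re.compile(r"(\d+)ms\s*$")


def slow_oplog_ops(lines, namespace="local.oplog.rs", threshold_ms=1000):
    """Return (duration_ms, line) pairs for operations on `namespace`
    that took at least `threshold_ms`, slowest first."""
    hits = []
    for line in lines:
        # Skip lines that do not mention the namespace at all.
        if namespace not in line:
            continue
        match = DURATION_RE.search(line)
        if match and int(match.group(1)) >= threshold_ms:
            hits.append((int(match.group(1)), line.rstrip()))
    return sorted(hits, key=lambda pair: pair[0], reverse=True)


# Hypothetical sample lines for illustration only.
SAMPLE = [
    "2015-06-25T19:10:01.000+0200 I QUERY [conn1] getmore local.oplog.rs cursorid:1 nreturned:1 12034ms",
    "2015-06-25T19:10:02.000+0200 I QUERY [conn2] query shop.items query: { _id: 1 } nreturned:1 150ms",
    "2015-06-25T19:10:03.000+0200 I QUERY [conn3] getmore local.oplog.rs cursorid:2 nreturned:0 420ms",
]

if __name__ == "__main__":
    for duration, line in slow_oplog_ops(SAMPLE):
        print(duration, line)
```

To run it against a real log, pass an open file object instead of SAMPLE, for example `slow_oplog_ops(open("mongod.log"))`; the file name here is a placeholder.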



 Comments   
Comment by Kay Agahd [ 30/Jul/15 ]

This is great news, thanks Ramon & team!
I'll put v3.0.5 in production as soon as possible.

Comment by Ramon Fernandez Marina [ 30/Jul/15 ]

Hi kay.agahd@idealo.de, this is to let you know that MongoDB 3.0.5 was released earlier this week; I'd encourage you to try using it and report back if the oplog performance issues persist. I'm going to close this ticket as a duplicate of SERVER-18875 – but please feel free to open new tickets if you run into other issues.

Thanks,
Ramón.

Comment by Kay Agahd [ 02/Jul/15 ]

ramon.fernandez, good to know that this issue might be fixed already in the next release. Thanks!

Comment by Ramon Fernandez Marina [ 01/Jul/15 ]

Hi kay.agahd@idealo.de, thanks for your report. This may be the same issue reported in SERVER-18875, which is being actively worked on and already includes improvements scheduled for both the next stable and the next development releases. I'd encourage you to watch SERVER-18875 for updates, and we will most probably ask you to re-check this behavior with 3.0.5 when it's released later this month.

Thanks,
Ramón.
