We are running MongoDB v3.0.1 and v3.0.3 on two different replSets, each consisting of 3 physical members. Both use WiredTiger with default settings. After some weeks of normal operation, we observed that queries became significantly slower.
By analyzing the log files with mlogvis, we found an unusually high number of operations on the oplog, which did not occur before the slowdown. The longest oplog queries took up to 12 seconds!
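For anyone who wants to check for the same symptom without mlogvis, the following is a minimal sketch of how such slow oplog operations might be pulled out of a mongod log. It assumes the usual 3.0 log line shape (timestamp first, duration in milliseconds as the last token). The function name `slow_oplog_ops`, the 1000 ms threshold, and the sample log lines are made up for illustration and are not taken from our real logs.

```python
import re
from typing import Iterable, List, Tuple

# Assumed line shape: "<timestamp> <severity> <component> [<context>] <message> <N>ms"
# Only lines that mention local.oplog.rs and end in a duration are matched.
OPLOG_OP = re.compile(
    r"^(?P<ts>\S+)\s+.*\blocal\.oplog\.rs\b.*\s(?P<ms>\d+)ms\s*$"
)


def slow_oplog_ops(lines: Iterable[str], threshold_ms: int = 1000) -> List[Tuple[str, int]]:
    """Return (timestamp, duration_ms) for oplog operations at or above the threshold,
    slowest first."""
    hits = []
    for line in lines:
        match = OPLOG_OP.match(line)
        if match and int(match.group("ms")) >= threshold_ms:
            hits.append((match.group("ts"), int(match.group("ms"))))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Hypothetical sample lines, written to resemble the 3.0 log format.
sample_log = [
    "2015-05-20T10:15:02.113+0200 I QUERY    [conn42] getmore local.oplog.rs "
    "cursorid:123 ntoreturn:0 keyUpdates:0 nreturned:1 reslen:200 12034ms",
    "2015-05-20T10:15:03.500+0200 I QUERY    [conn43] query app.users "
    "query: { _id: 1 } nreturned:1 reslen:80 150ms",
    "2015-05-20T10:15:04.000+0200 I QUERY    [conn44] getmore local.oplog.rs "
    "cursorid:124 ntoreturn:0 keyUpdates:0 nreturned:0 reslen:20 300ms",
]

for timestamp, duration_ms in slow_oplog_ops(sample_log):
    print(timestamp, duration_ms, "ms")
```

To run it against a real log file, the list can be replaced with an open file handle, for example `slow_oplog_ops(open("mongod.log"))`, where the path is a placeholder.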
Our "solution" was to stepDown the primary, which immediately brought the replSet back to normal speed. 90 minutes later we had the old (slow) primary elected again to see whether the problem is reproducible. So far everything is running as fast as usual.
Please see the two attached screenshots produced by mlogvis. In the first you can clearly see the long-running queries on local.oplog.rs, the stepDown and the re-election. The second screenshot shows the analyzed log file from yesterday, when everything was still running normally.
We can provide the log files if they are kept confidential.