[SERVER-16247] Oplog declines in performance over time under WiredTiger Created: 19/Nov/14 Updated: 25/May/17 Resolved: 15/Dec/14 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Performance, Replication, Storage |
| Affects Version/s: | 2.8.0-rc0 |
| Fix Version/s: | 2.8.0-rc3 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Bruce Lucas (Inactive) | Assignee: | Alexander Gorrod |
| Resolution: | Done | Votes: | 1 |
| Labels: | wiredtiger |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Backwards Compatibility: | Fully Compatible |
| Operating System: | ALL |
| Description |
|
This may be related to Tested on build from master this afternoon (365cca0c47566d192ca847f0b077cedef4b3430e).
|
| Comments |
| Comment by Githook User [ 02/Dec/14 ] |
|
Author: Dan Pasette (username: monkey101, email: dan@10gen.com) Message:
| Comment by Alex Gorrod [ 26/Nov/14 ] |
|
I believe there are two issues involved here. The general downward trend is similar to that seen in

The second issue, which causes the peak/trough behavior, is related to flushing pages from memory. In WiredTiger we let updates accumulate in memory until the size of the updates exceeds a certain threshold (configured via memory_page_max to wiredTigerCollectionConfig). Once that threshold is reached, WiredTiger writes the page to disk (a process WiredTiger calls reconciliation) and starts again with a new page.

We are currently designing a change that will mean we don't need to do that write in the application thread. Instead of writing the page to disk immediately, we will be able to switch to a new page, which is much faster.
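The difference between writing the page in the application thread and merely switching to a new page can be illustrated with a toy model. This is a minimal sketch and not WiredTiger code: the class names, the cost units, and the `inline_reconcile` switch are all hypothetical; only the idea of a `memory_page_max` threshold comes from the comment above.

```python
from dataclasses import dataclass, field


@dataclass
class ToyPage:
    """In-memory page that accumulates update bytes (toy model, not WiredTiger)."""
    bytes_used: int = 0


@dataclass
class ToyStore:
    memory_page_max: int           # threshold in bytes, named after the setting above
    write_cost: int                # hypothetical cost units to write one page to disk
    switch_cost: int               # hypothetical cost units to swap in a fresh page
    inline_reconcile: bool = True  # True: the inserting thread performs the write
    page: ToyPage = field(default_factory=ToyPage)
    pending_writes: int = 0        # pages handed off to be written later

    def insert(self, nbytes: int) -> int:
        """Apply one update and return the cost charged to the calling thread."""
        cost = 1  # base cost of the in-memory update
        self.page.bytes_used += nbytes
        if self.page.bytes_used > self.memory_page_max:
            if self.inline_reconcile:
                cost += self.write_cost   # caller stalls while the page is written
            else:
                cost += self.switch_cost  # caller only switches pages
                self.pending_writes += 1  # the write happens elsewhere, later
            self.page = ToyPage()         # start again with a new page
        return cost


def worst_insert(store: ToyStore, n: int, nbytes: int) -> int:
    """Largest per-insert cost seen by the caller over n inserts."""
    return max(store.insert(nbytes) for _ in range(n))


if __name__ == "__main__":
    inline = ToyStore(memory_page_max=1000, write_cost=500, switch_cost=5)
    deferred = ToyStore(memory_page_max=1000, write_cost=500, switch_cost=5,
                        inline_reconcile=False)
    print("worst insert, inline write:", worst_insert(inline, 100, 100))
    print("worst insert, page switch: ", worst_insert(deferred, 100, 100))
```

In this model the periodic stall paid by the inserting thread is what would show up as a trough in throughput; deferring the write removes the stall from the caller but leaves the write itself still to be done somewhere.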
| Comment by Bruce Lucas (Inactive) [ 25/Nov/14 ] |
|
Here's ops/s and WT cached bytes together on one graph showing the perfect correlation. I'm not sure why there should be so much cached data, and growing: the oplog size is 10 MB, and the collection is tiny.
| Comment by Bruce Lucas (Inactive) [ 25/Nov/14 ] |
|
Still seeing it with rc1 and also a fresh build from HEAD. Here's my mongod command line:
Set oplog size to 10 MB to see the issue sooner - it doesn't begin until the oplog first wraps. With those parameters and the script above, the oplog holds about 90k docs. Here's the output I'm seeing from the script above:
I enabled wt stats, and it seems that the performance behavior - the short-term declines and recoveries, and the long-term decline - is inversely correlated with the "cache: bytes currently in the cache" statistic:
The recoveries and reduction in cache bytes correlate directly with cache evictions due to pages exceeding in-memory max:
The recoveries don't correlate with checkpoints.
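One way to line up the recoveries with drops in cached bytes is to sample the statistics periodically and flag the intervals where the cache shrank. The sketch below assumes the statistics arrive as a `serverStatus`-style dictionary with a `wiredTiger.cache` sub-document; that layout, the eviction counter name, and both helper functions are assumptions for illustration. Only the "bytes currently in the cache" name is taken from the comment above.

```python
# Statistic name quoted in the comment above.
CACHE_BYTES = "bytes currently in the cache"
# Assumed name for the "evicted for exceeding in-memory max" counter;
# check the stats output of the build under test before relying on it.
EVICTED_OVER_MAX = "pages evicted because they exceeded the in-memory maximum"


def cache_sample(server_status: dict) -> dict:
    """Pull the two cache statistics out of a serverStatus-style dict.

    Missing keys yield None instead of raising, since the layout is assumed.
    """
    cache = server_status.get("wiredTiger", {}).get("cache", {})
    return {
        "cache_bytes": cache.get(CACHE_BYTES),
        "forced_evictions": cache.get(EVICTED_OVER_MAX),
    }


def cache_drops(samples: list) -> list:
    """Return indices i where cache bytes fell between sample i-1 and sample i."""
    return [
        i
        for i in range(1, len(samples))
        if samples[i]["cache_bytes"] < samples[i - 1]["cache_bytes"]
    ]
```

With pymongo, a dictionary of this kind could be obtained from `client.admin.command("serverStatus")`; comparing the indices returned by `cache_drops` with the change in `forced_evictions` over the same intervals is one way to check the correlation described above.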
| Comment by Eliot Horowitz (Inactive) [ 25/Nov/14 ] |
|
Can you try on rc1?
| Comment by Bruce Lucas (Inactive) [ 20/Nov/14 ] |
|
Simple gdb profiling shows that when it gets slow, it is spending most of its time (9 samples out of 10) here. Details of the top couple of frames vary, but it's all in __curfile_next called from cappedDeleteAsNeeded at wiredtiger_record_store.cpp:403. Similar gdb profiling of
|