[SERVER-16247] Oplog declines in performance over time under WiredTiger Created: 19/Nov/14  Updated: 25/May/17  Resolved: 15/Dec/14

Status: Closed
Project: Core Server
Component/s: Performance, Replication, Storage
Affects Version/s: 2.8.0-rc0
Fix Version/s: 2.8.0-rc3

Type: Bug Priority: Major - P3
Reporter: Bruce Lucas (Inactive) Assignee: Alexander Gorrod
Resolution: Done Votes: 1
Labels: wiredtiger
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: PNG File bytes-currently-in-cache.png     PNG File correlation.png     PNG File oplog.png     PNG File oplog_insert.png     PNG File pages-evicted-because-exceed-in-memory-max.png    
Issue Links:
Depends
is depended on by SERVER-16366 Gradually degrading performance (wire... Closed
Related
related to SERVER-16296 Producer-consumer use case shows decl... Closed
is related to SERVER-16235 Performance issue in capped collectio... Closed
is related to SERVER-16379 Chunk moves are slow under WiredTiger Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Participants:

 Description   

This may be related to SERVER-16235, but I'm filing it as a separate ticket: my understanding is that a workaround is in place to prevent SERVER-16235 from impacting the oplog, and the symptoms here are a bit different, so this may be a separate issue.

Tested on a build from master this afternoon (365cca0c47566d192ca847f0b077cedef4b3430e).

  • The test, run on a single-node replica set, repeatedly updates the document in a single-document collection to generate a lot of oplog entries while minimizing the work needed to perform each op, so that oplog performance dominates.
  • The yellow graph below shows declining performance over time (measured in number of updates done). It shows the same decline starting at the first oplog wraparound, at about 90k inserts, seen in SERVER-16235, but then it recovers and begins a cycle of repeated declines and recoveries. Superimposed on this is a longer-term downward trend, which possibly distinguishes this issue from SERVER-16235. It is not clear from this data whether the asymptote is >0.
  • The red graph shows that the decline goes away for the same test on a standalone instance, confirming that this is an oplog issue.
  • The blue graph shows that the decline is not seen with mmapv1, confirming that this is a WiredTiger-specific issue.
  • Restarting mongod (not shown) resets the behavior back to time 0.

    // Repeatedly update a single document to generate oplog traffic with
    // minimal per-op work; print the update rate every `every` updates.
    db.c.drop()
    db.c.insert({_id: 0, i: 0})
 
    var count = 1000000                 // total number of updates to run; adjust as needed
    var every = 10000                   // reporting interval
    var bulk = db.c.initializeOrderedBulkOp();
    var t = new Date()
    for (var i = 0; i <= count; i++) {
        if (i > 0 && i % every == 0) {
            bulk.execute();
            bulk = db.c.initializeOrderedBulkOp();
            var tt = new Date()
            print(i, Math.floor(every / (tt - t) * 1000))   // updates/sec over the last interval
            t = tt
        }
        bulk.find({_id: 0}).updateOne({_id: 0, i: i})
    }
    bulk.execute()                      // flush any remaining queued updates



 Comments   
Comment by Githook User [ 02/Dec/14 ]

Author: Dan Pasette (monkey101) <dan@10gen.com>

Message: SERVER-16247 Change oplog memory_page_max to 10MB
Branch: master
https://github.com/mongodb/mongo/commit/5c33c7c5b538bc01094127ba6e9c6531a79886d1

Comment by Alex Gorrod [ 26/Nov/14 ]

I believe there are two issues involved here:

The general downward trend is similar to that seen in SERVER-16235, i.e. it is related to deleting records from capped collections. It is slightly different in that the degradation is seen even though this use case goes through the special oplog case in the WiredTigerRecordStore::cappedDeleteAsNeeded implementation.

The second issue, which causes the peak/trough behavior, is related to flushing pages from memory. In WiredTiger we let updates accumulate in memory until the size of the updates exceeds a certain threshold (configured via memory_page_max in wiredTigerCollectionConfig). Once that threshold is reached, WiredTiger writes the page to disk (a process WiredTiger calls reconciliation) and starts again with a new page. We are currently designing a change so that we won't need to do that write in the application thread: instead of writing the page to disk immediately we will be able to switch to a new page, which is much faster.
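
For illustration, a minimal sketch of how memory_page_max can be tuned per collection from the mongo shell via the storageEngine configString passthrough; the collection name and sizes are hypothetical, and the actual fix for this ticket sets memory_page_max on the oplog internally (see the commit above):

    // Sketch: create a capped collection whose WiredTiger pages are reconciled
    // (written out) once they reach 10MB in memory. Names and sizes are
    // illustrative, not taken from the ticket.
    db.createCollection("cappedDemo", {
        capped: true,
        size: 10 * 1024 * 1024,   // 10 MB capped collection
        storageEngine: {
            wiredTiger: { configString: "memory_page_max=10MB" }
        }
    })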

Comment by Bruce Lucas (Inactive) [ 25/Nov/14 ]

Here are ops/s and WT cached bytes together on one graph showing the perfect correlation (see attached correlation.png). I'm not sure why there should be so much cached data, and why it is growing: the oplog size is 10 MB, and the collection is tiny.

Comment by Bruce Lucas (Inactive) [ 25/Nov/14 ]

Still seeing it with rc1 and also a fresh build from HEAD. Here's my mongod command line:

mongod --dbpath ./r0 --logpath ./r0.log --logappend --port 27017 --replSet rs --oplogSize 10 --fork --storageEngine=wiredtiger

I set the oplog size to 10 MB to see the issue sooner, since it doesn't begin until the oplog first wraps. With those parameters and the script above, the oplog holds about 90k docs. Here's the output I'm seeing from the script above:

inserts rate
10000  15337
20000  15220
30000  14992
40000  15600
50000  15360
60000  15455
70000  15037
80000  15290
90000  15455   <--- oplog wraps here, decline begins
100000 14144
110000 13440
120000 13368
130000 12853
140000 12787
150000 12531
160000 12468
...
390000  9389
400000  9082
410000  9140
420000  8952
430000  8976
440000 11350  <--- recovers, begins declining again
450000 13458
460000 13333
470000 13157
480000 12970
490000 12787
...
850000  8006
860000  8250
870000  9082
880000 11862  <--- another recovery, then decline
890000 11560
900000 11600
910000 11574
920000 11415
930000 11261
940000 11001
950000 10775
960000 10917
970000 10582
980000 10172
990000 10070
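
As a quick sanity check of the ~90k wrap point, something like the following can be run from the shell (a sketch; it assumes the standard replica-set oplog collection local.oplog.rs and the getReplicationInfo() shell helper):

    // Sketch: report how many documents the 10 MB oplog currently holds and
    // its overall size/time window.
    var local = db.getSiblingDB("local")
    print("oplog docs:", local.oplog.rs.count())
    printjson(db.getReplicationInfo())   // logSizeMB, usedMB, timeDiff, tFirst, tLast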

I enabled wt stats, and it seems that the performance behavior - the short-term declines and recoveries, and the long-term decline - is inversely correlated with the "cache: bytes currently in the cache" statistic (see attached bytes-currently-in-cache.png).

The recoveries and the reduction in cache bytes correlate directly with cache evictions due to pages exceeding the in-memory max (see attached pages-evicted-because-exceed-in-memory-max.png).

The recoveries don't correlate with checkpoints.
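
The same statistics can also be sampled directly from the shell; a minimal sketch, assuming the serverStatus() field names of this build (they may differ between versions):

    // Sketch: sample the two WiredTiger cache statistics that track the
    // slowdown, once per second.
    while (true) {
        var c = db.serverStatus().wiredTiger.cache
        print(new Date(),
              "| bytes currently in the cache:", c["bytes currently in the cache"],
              "| pages evicted because they exceeded the in-memory maximum:",
              c["pages evicted because they exceeded the in-memory maximum"])
        sleep(1000)
    }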

Comment by Eliot Horowitz (Inactive) [ 25/Nov/14 ]

Can you try on rc1?
I can't repro on HEAD right now at least.
If you still see it, I'd like to come over and take a look.

Comment by Bruce Lucas (Inactive) [ 20/Nov/14 ]

Simple gdb profiling shows that when it gets slow it's spending most of its time (9 samples out of 10) here. Details of the top couple of frames vary, but it's all in __curfile_next called from cappedDeleteAsNeeded at wiredtiger_record_store.cpp:403. Similar gdb profiling of SERVER-16235 shows that's also where it's spending its time.

Thread 2 (Thread 0x7fa7915be700 (LWP 4679)):
#0  __wt_txn_read (upd=0xf89dbe0, session=0x3946f00) at src/third_party/wiredtiger/src/include/txn.i:173
#1  __cursor_row_next (newpage=<optimized out>, cbt=0x4936500) at src/third_party/wiredtiger/src/btree/bt_curnext.c:279
#2  __wt_btcur_next (cbt=cbt@entry=0x30e2680, truncating=truncating@entry=0) at src/third_party/wiredtiger/src/btree/bt_curnext.c:439
#3  0x0000000001331db9 in __curfile_next (cursor=0x30e2680) at src/third_party/wiredtiger/src/cursor/cur_file.c:79
#4  0x0000000000d7b6fe in mongo::WiredTigerRecordStore::cappedDeleteAsNeeded (this=this@entry=0x393c3c0, txn=txn@entry=0x3927f40, justInserted=...) at src/mongo/db/storage/wiredtiger/wiredtiger_record_store.cpp:403
#5  0x0000000000d7c18e in mongo::WiredTigerRecordStore::insertRecord (this=0x393c3c0, txn=0x3927f40, data=0x37b6f00 "r", len=114, enforceQuota=<optimized out>) at src/mongo/db/storage/wiredtiger/wiredtiger_record_store.cpp:507
#6  0x0000000000d7a9dd in mongo::WiredTigerRecordStore::insertRecord (this=0x393c3c0, txn=txn@entry=0x3927f40, doc=doc@entry=0x7fa7915bb480, enforceQuota=<optimized out>) at src/mongo/db/storage/wiredtiger/wiredtiger_record_store.cpp:537
#7  0x00000000009173fc in mongo::Collection::insertDocument (this=0x384a780, txn=txn@entry=0x3927f40, doc=doc@entry=0x7fa7915bb480, enforceQuota=enforceQuota@entry=false) at src/mongo/db/catalog/collection.cpp:178
#8  0x0000000000c46f14 in mongo::repl::_logOpRS (txn=0x3927f40, opstr=<optimized out>, ns=<optimized out>, logNS=<optimized out>, obj=..., o2=0x7fa7915bb7c0, bb=0x0, fromMigrate=false) at src/mongo/db/repl/oplog.cpp:328
#9  0x0000000000c45123 in mongo::repl::logOp (txn=0x3927f40, opstr=opstr@entry=0x1565fc5 "u", ns=0xdb12918 "test.c", obj=..., patt=patt@entry=0x7fa7915bb7c0, b=b@entry=0x0, fromMigrate=false) at src/mongo/db/repl/oplog.cpp:452
#10 0x0000000000a322d0 in mongo::UpdateStage::transformAndUpdate (this=this@entry=0x491ea00, oldObj=..., loc=...) at src/mongo/db/exec/update.cpp:551
#11 0x0000000000a32913 in mongo::UpdateStage::work (this=0x491ea00, out=<optimized out>) at src/mongo/db/exec/update.cpp:749
#12 0x0000000000bd1d9a in mongo::PlanExecutor::getNext (this=this@entry=0x4903810, objOut=objOut@entry=0x7fa7915bb9c0, dlOut=dlOut@entry=0x0) at src/mongo/db/query/plan_executor.cpp:288
#13 0x0000000000bd2b7d in mongo::PlanExecutor::executePlan (this=0x4903810) at src/mongo/db/query/plan_executor.cpp:400
#14 0x0000000000b21f91 in mongo::UpdateExecutor::execute (this=this@entry=0x7fa7915bbca0, db=<optimized out>) at src/mongo/db/ops/update_executor.cpp:236
#15 0x00000000009b5a76 in mongo::multiUpdate (txn=0x3927f40, updateItem=..., result=result@entry=0x7fa7915bbdb0) at src/mongo/db/commands/write_commands/batch_executor.cpp:1276
#16 0x00000000009b614a in mongo::WriteBatchExecutor::execUpdate (this=this@entry=0x7fa7915bc280, updateItem=..., upsertedId=upsertedId@entry=0x7fa7915bbe40, error=error@entry=0x7fa7915bbe30) at src/mongo/db/commands/write_commands/batch_executor.cpp:893
#17 0x00000000009b6a72 in mongo::WriteBatchExecutor::bulkExecute (this=this@entry=0x7fa7915bc280, request=..., upsertedIds=upsertedIds@entry=0x7fa7915bc080, errors=errors@entry=0x7fa7915bc060) at src/mongo/db/commands/write_commands/batch_executor.cpp:763
#18 0x00000000009b6fbd in mongo::WriteBatchExecutor::executeBatch (this=this@entry=0x7fa7915bc280, request=..., response=response@entry=0x7fa7915bc2b0) at src/mongo/db/commands/write_commands/batch_executor.cpp:265
#19 0x00000000009b8f40 in mongo::WriteCmd::run (this=<optimized out>, txn=0x3927f40, dbName=..., cmdObj=..., options=<optimized out>, errMsg=..., result=..., fromRepl=false) at src/mongo/db/commands/write_commands/write_commands.cpp:144
#20 0x00000000009d2514 in mongo::_execCommand (txn=txn@entry=0x3927f40, c=c@entry=0x3118e60, dbname=..., cmdObj=..., queryOptions=queryOptions@entry=0, errmsg=..., result=..., fromRepl=fromRepl@entry=false) at src/mongo/db/dbcommands.cpp:1209
#21 0x00000000009d33e5 in mongo::Command::execCommand (txn=txn@entry=0x3927f40, c=c@entry=0x3118e60, queryOptions=queryOptions@entry=0, cmdns=cmdns@entry=0x4950014 "test.$cmd", cmdObj=..., result=..., fromRepl=fromRepl@entry=false) at src/mongo/db/dbcommands.cpp:1423
#22 0x00000000009d3f10 in mongo::_runCommands (txn=txn@entry=0x3927f40, ns=0x4950014 "test.$cmd", _cmdobj=..., b=..., anObjBuilder=..., fromRepl=fromRepl@entry=false, queryOptions=0) at src/mongo/db/dbcommands.cpp:1498
#23 0x0000000000bbff4e in runCommands (fromRepl=false, queryOptions=<optimized out>, anObjBuilder=..., b=..., curop=..., jsobj=..., ns=<optimized out>, txn=0x3927f40) at src/mongo/db/query/new_find.cpp:131
#24 mongo::newRunQuery (txn=txn@entry=0x3927f40, m=..., q=..., curop=..., result=..., fromDBDirectClient=fromDBDirectClient@entry=false) at src/mongo/db/query/new_find.cpp:549
#25 0x0000000000aac773 in receivedQuery (fromDBDirectClient=false, m=..., dbresponse=..., c=..., txn=0x3927f40) at src/mongo/db/instance.cpp:220
#26 mongo::assembleResponse (txn=txn@entry=0x3927f40, m=..., dbresponse=..., remote=..., fromDBDirectClient=fromDBDirectClient@entry=false) at src/mongo/db/instance.cpp:390
#27 0x000000000080dec0 in mongo::MyMessageHandler::process (this=<optimized out>, m=..., port=0x4902460, le=0x4902aa0) at src/mongo/db/db.cpp:190
#28 0x0000000000f0ea31 in mongo::PortMessageServer::handleIncomingMsg (arg=0x30c0980) at src/mongo/util/net/message_server_port.cpp:234
#29 0x00007fa791199182 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#30 0x00007fa790299fbd in clone () from /lib/x86_64-linux-gnu/libc.so.6
