|
sue.loverso Only if I put the journal and data on the same device; otherwise it looks OK.
|
|
martin.bligh, are you still seeing WT log stalls? Do you have data/stats from that? A large number of slot join races implies that we're waiting for a free slot, which means we may not be writing them out to the OS quickly enough, or we're stuck waiting on something.
|
|
dan@10gen.com Nope. Fixing ... looking much better now. Do we still want to address that, or close this back out?
|
|
Also, FTDC data please? Want to check if it's similar to something Ramon showed me yesterday, involving a large number of slot join races during the slowdowns.
|
|
Are you running with the journal on a separate volume?
|
|
If I restart with --nojournal, the problem goes away.
|
- 5.05% mongod mongod [.] __wt_row_modify
|
- __wt_row_modify
|
- 99.67% __wt_btcur_insert
|
- __curfile_insert
|
+ 53.12% mongo::WiredTigerIndexUnique::_insert(__wt_cursor*, mongo::BSONObj const&, mon
|
+ 46.88% mongo::WiredTigerRecordStore::insertRecords(mongo::OperationContext*, std::vec
|
- 4.66% mongod [kernel.kallsyms] [k] __schedule
|
- __schedule
|
- 96.80% sys_sched_yield
|
system_call_fastpath
|
__sched_yield
|
__wt_log_write
|
__wt_txn_commit
|
__session_commit_transaction
|
mongo::WiredTigerRecoveryUnit::_txnClose(bool)
|
mongo::WiredTigerRecoveryUnit::_commit()
|
mongo::insertMultiVector(mongo::OperationContext*, mongo::OldClientContext&, bool, char
|
mongo::insertMulti(mongo::OperationContext*, mongo::OldClientContext&, bool, char const*
|
mongo::_receivedInsert(mongo::OperationContext*, mongo::NamespaceString const&, char con
|
mongo::receivedInsert(mongo::OperationContext*, mongo::NamespaceString const&, mongo::Me
|
mongo::assembleResponse(mongo::OperationContext*, mongo::Message&, mongo::DbResponse&, m
|
mongo::MyMessageHandler::process(mongo::Message&, mongo::AbstractMessagingPort*)
|
mongo::PortMessageServer::handleIncomingMsg(void*)
|
start_thread
|
+ 3.00% system_call_fastpath
|
- 4.42% mongod mongod [.] __wt_log_slot_join
|
- __wt_log_slot_join
|
- 99.96% __wt_log_write
|
__wt_txn_commit
|
__session_commit_transaction
|
mongo::WiredTigerRecoveryUnit::_txnClose(bool)
|
mongo::WiredTigerRecoveryUnit::_commit()
|
mongo::insertMultiVector(mongo::OperationContext*, mongo::OldClientContext&, bool, char
|
mongo::insertMulti(mongo::OperationContext*, mongo::OldClientContext&, bool, char const*
|
mongo::_receivedInsert(mongo::OperationContext*, mongo::NamespaceString const&, char con
|
mongo::receivedInsert(mongo::OperationContext*, mongo::NamespaceString const&, mongo::Me
|
mongo::assembleResponse(mongo::OperationContext*, mongo::Message&, mongo::DbResponse&, m
|
mongo::MyMessageHandler::process(mongo::Message&, mongo::AbstractMessagingPort*)
|
mongo::PortMessageServer::handleIncomingMsg(void*)
|
start_thread
|
- 4.41% mongod [kernel.kallsyms] [k] put_prev_task_fair
|
- put_prev_task_fair
|
- 93.89% __schedule
|
- 99.98% sys_sched_yield
|
system_call_fastpath
|
- __sched_yield
|
- 99.98% __wt_log_write
|
__wt_txn_commit
|
__session_commit_transaction
|
mongo::WiredTigerRecoveryUnit::_txnClose(bool)
|
mongo::WiredTigerRecoveryUnit::_commit()
|
mongo::insertMultiVector(mongo::OperationContext*, mongo::OldClientContext&, bo
|
mongo::insertMulti(mongo::OperationContext*, mongo::OldClientContext&, bool, ch
|
mongo::_receivedInsert(mongo::OperationContext*, mongo::NamespaceString const&,
|
mongo::receivedInsert(mongo::OperationContext*, mongo::NamespaceString const&,
|
mongo::assembleResponse(mongo::OperationContext*, mongo::Message&, mongo::DbRes
|
mongo::MyMessageHandler::process(mongo::Message&, mongo::AbstractMessagingPort*
|
mongo::PortMessageServer::handleIncomingMsg(void*)
|
start_thread
|
+ 6.09% sys_sched_yield
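The time attributed to sys_sched_yield under __wt_log_write above is what a yield-based wait looks like in a profile: threads that can't make progress on the journal keep re-checking shared state and yielding the CPU. A rough, generic sketch of that pattern (not WiredTiger's actual slot-join code; the names are made up for illustration):

#include <sched.h>
#include <stdatomic.h>

/*
 * Generic yield-based wait loop, for illustration only: each waiter
 * re-reads the shared state and calls sched_yield() until it changes.
 * Under heavy contention most of the thread's time shows up under
 * sched_yield/__schedule rather than in useful work, as in the perf
 * output above.
 */
static void
wait_for_state_change(atomic_int *state, int busy_value)
{
	while (atomic_load_explicit(state, memory_order_acquire) == busy_value)
		(void)sched_yield();	/* give the CPU away between checks */
}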
|
|
|
I'm still getting large stalls:
insert query update delete getmore command % dirty % used flushes vsize res qr|qw ar|aw netIn netOut conn time
localhost:27017 1463356 *0 *0 *0 0 150|0 71.1 78.4 0 1.4G 1.2G 0|2 0|27 739m 33k 34 09:52:10
|
localhost:27017 1471816 *0 *0 *0 0 148|0 89.4 91.5 0 1.4G 1.2G 0|2 0|30 744m 33k 34 09:52:11
|
localhost:27017 1682940 *0 *0 *0 0 169|0 83.6 85.7 0 1.4G 1.2G 0|0 0|32 850m 35k 34 09:52:12
|
localhost:27017 1642508 *0 *0 *0 0 164|0 70.3 72.2 0 1.4G 1.2G 0|3 0|29 830m 34k 34 09:52:13
|
localhost:27017 1717792 *0 *0 *0 0 179|0 82.4 84.4 0 1.4G 1.2G 0|1 0|29 868m 35k 34 09:52:14
|
localhost:27017 173536 *0 *0 *0 0 11|0 14.4 16.3 0 1.4G 1.2G 0|0 0|32 88m 19k 34 09:52:15
|
localhost:27017 *0 *0 *0 *0 0 1|0 14.4 16.3 0 1.4G 1.2G 0|0 0|32 79b 19k 34 09:52:16
|
localhost:27017 *0 *0 *0 *0 0 1|0 14.4 16.3 0 1.4G 1.2G 0|0 0|32 79b 19k 34 09:52:17
|
localhost:27017 *0 *0 *0 *0 0 1|0 14.4 16.3 0 1.4G 1.2G 0|0 0|32 79b 19k 34 09:52:18
|
localhost:27017 1049744 *0 *0 *0 0 112|0 80.0 82.0 0 1.4G 1.2G 0|0 0|30 531m 29k 34 09:52:19
|
insert query update delete getmore command % dirty % used flushes vsize res qr|qw ar|aw netIn netOut conn time
|
localhost:27017 1597220 *0 *0 *0 0 160|0 90.0 91.9 0 1.4G 1.2G 0|2 0|29 807m 34k 34 09:52:20
|
localhost:27017 1528928 *0 *0 *0 0 155|0 68.1 78.9 0 1.4G 1.2G 0|0 0|31 772m 33k 34 09:52:21
|
localhost:27017 1539116 *0 *0 *0 0 155|0 66.8 68.2 0 1.4G 1.2G 0|0 0|26 778m 33k 34 09:52:22
|
localhost:27017 1719500 *0 *0 *0 0 174|0 87.4 89.7 0 1.4G 1.2G 0|0 0|32 869m 35k 34 09:52:23
|
localhost:27017 *0 *0 *0 *0 0 2|0 15.6 16.9 0 1.4G 1.2G 0|0 0|32 133b 19k 34 09:52:24
|
localhost:27017 *0 *0 *0 *0 0 1|0 15.6 16.9 0 1.4G 1.2G 0|0 0|32 79b 19k 34 09:52:25
|
localhost:27017 *0 *0 *0 *0 0 1|0 15.6 16.9 0 1.4G 1.2G 0|0 0|32 79b 19k 34 09:52:26
|
localhost:27017 *0 *0 *0 *0 0 1|0 15.6 16.9 0 1.4G 1.2G 0|0 0|32 79b 19k 34 09:52:27
|
localhost:27017 101056 *0 *0 *0 0 7|0 22.2 23.5 0 1.4G 1.2G 0|1 0|30 51m 19k 34 09:52:28
|
localhost:27017 1678336 *0 *0 *0 0 174|0 76.3 77.6 0 1.4G 1.2G 0|1 0|28 848m 35k 34 09:52:29
|
insert query update delete getmore command % dirty % used flushes vsize res qr|qw ar|aw netIn netOut conn time
|
localhost:27017 698680 *0 *0 *0 0 66|0 34.5 35.6 0 1.4G 1.2G 0|0 0|32 353m 25k 34 09:52:30
|
localhost:27017 1452456 *0 *0 *0 0 148|0 74.2 75.3 0 1.4G 1.2G 0|3 0|27 734m 33k 34 09:52:31
|
localhost:27017 1025588 *0 *0 *0 0 103|0 64.9 66.0 0 1.4G 1.2G 0|0 0|32 519m 28k 34 09:52:32
|
localhost:27017 *0 *0 *0 *0 0 1|0 33.7 34.7 0 1.4G 1.2G 0|0 0|32 79b 19k 34 09:52:33
|
localhost:27017 *0 *0 *0 *0 0 2|0 33.7 34.7 0 1.4G 1.2G 0|0 0|32 133b 19k 34 09:52:34
|
localhost:27017 *0 *0 *0 *0 0 1|0 33.7 34.7 0 1.4G 1.2G 0|0 0|32 79b 19k 34 09:52:35
|
localhost:27017 *0 *0 *0 *0 0 1|0 33.7 34.7 0 1.4G 1.2G 0|0 0|32 79b 19k 34 09:52:36
|
localhost:27017 *0 *0 *0 *0 0 1|0 33.7 34.7 0 1.4G 1.2G 0|0 0|32 79b 19k 34 09:52:37
|
localhost:27017 *0 *0 *0 *0 0 1|0 33.7 34.7 0 1.4G 1.2G 0|0 0|32 79b 19k 34 09:52:38
|
localhost:27017 *0 *0 *0 *0 0 2|0 33.7 34.7 0 1.4G 1.2G 0|0 0|32 133b 19k 34 09:52:39
|
insert query update delete getmore command % dirty % used flushes vsize res qr|qw ar|aw netIn netOut conn time
|
localhost:27017 1325248 *0 *0 *0 0 137|0 70.1 71.1 0 1.4G 1.2G 0|0 0|27 669m 32k 34 09:52:40
|
localhost:27017 1694372 *0 *0 *0 0 170|0 82.2 83.1 0 1.4G 1.2G 0|0 0|31 856m 35k 34 09:52:41
|
localhost:27017 1666300 *0 *0 *0 0 170|0 94.2 95.1 0 1.4G 1.2G 0|1 0|30 842m 35k 34 09:52:42
|
localhost:27017 1634836 *0 *0 *0 0 159|0 82.1 83.0 0 1.4G 1.2G 0|1 0|29 826m 34k 34 09:52:43
|
localhost:27017 1672456 *0 *0 *0 0 169|0 73.2 74.1 0 1.4G 1.2G 0|1 0|29 845m 34k 34 09:52:44
|
localhost:27017 1675736 *0 *0 *0 0 170|0 82.0 82.8 0 1.4G 1.2G 0|0 0|30 847m 35k 34 09:52:45
|
localhost:27017 1576848 *0 *0 *0 0 159|0 91.8 93.6 0 1.4G 1.2G 0|0 0|30 797m 34k 34 09:52:46
|
localhost:27017 1585000 *0 *0 *0 0 161|0 86.1 87.0 0 1.4G 1.2G 0|2 0|28 801m 34k 34 09:52:47
|
localhost:27017 1510916 *0 *0 *0 0 153|0 69.8 70.7 0 1.4G 1.2G 0|1 0|29 764m 33k 34 09:52:48
|
localhost:27017 1675080 *0 *0 *0 0 168|0 81.1 82.8 0 1.4G 1.2G 0|0 0|29 847m 34k 34 09:52:49
|
insert query update delete getmore command % dirty % used flushes vsize res qr|qw ar|aw netIn netOut conn time
|
localhost:27017 1631108 *0 *0 *0 0 163|0 86.8 87.7 0 1.4G 1.2G 0|0 0|32 824m 34k 34 09:52:50
|
localhost:27017 1512148 *0 *0 *0 0 152|0 72.7 73.5 0 1.4G 1.2G 0|0 0|31 764m 33k 34 09:52:51
|
localhost:27017 1550116 *0 *0 *0 0 157|0 75.7 76.5 0 1.4G 1.2G 0|0 0|32 784m 33k 34 09:52:52
|
localhost:27017 1594156 *0 *0 *0 0 161|0 89.0 89.7 0 1.4G 1.2G 0|0 0|32 805m 34k 34 09:52:53
|
localhost:27017 *0 *0 *0 *0 0 2|0 23.9 24.5 0 1.4G 1.2G 0|0 0|32 133b 19k 34 09:52:54
|
localhost:27017 *0 *0 *0 *0 0 1|0 23.9 24.5 0 1.4G 1.2G 0|0 0|32 79b 19k 34 09:52:55
|
localhost:27017 1189956 *0 *0 *0 0 120|0 86.6 87.3 0 1.4G 1.2G 0|0 0|30 601m 30k 34 09:52:56
|
localhost:27017 1636408 *0 *0 *0 0 165|0 92.7 93.3 0 1.4G 1.2G 0|1 0|30 827m 34k 34 09:52:57
|
|
|
Author: Michael Cahill <michael.cahill@mongodb.com> (michaelcahill)
Message: Import wiredtiger-wiredtiger-mongodb-3.2-rc4-56-g7a4f325.tar.gz from wiredtiger branch mongodb-3.2
ref: 8326df6..7a4f325
e731ef8 WT-2251 Free addresses when discarding deleted page references.
0e93d60 SERVER-21691 Avoid insert stalls
264ec21 WT-2249 Keep eviction stuck until cache usage is under 100%.
dca1411 WT-2250 Minor fix: use SET instead of INCRV for stat.
Branch: master
https://github.com/mongodb/mongo/commit/86e7b69a6c52c926d28a60d816faefa6db81eb96
|
|
We're vastly better now, and this is indeed an extreme workload, so that sounds like a plan. I attached an FTDC plot FYI. The perf data was a two-second sample taken during one of the reduced-throughput periods.
|
|
geert.bosch, I suggest we open a 3.3-required ticket for the remaining stalls: my focus today was RC5.
In my testing I saw lower numbers overall than you, but with less variation. The perf output isn't very enlightening unless it can focus on those individual seconds: there could be various factors in play, so let's start with the FTDC data and take it from there.
|
|
This seems to take care of the worst stalls and the longer periods of (near) inactivity, but there are still 1-2 second stalls:
insert query update delete getmore command flushes vsize res locked db qr|qw ar|aw netIn netOut conn time
1095228 *0 *0 *0 0 112|0 0 36.3 KB 36.0 KB Database:0.0% 0|0 0|29 553m 28k 35 23:29:54
|
542124 *0 *0 *0 0 57|0 0 36.3 KB 36.0 KB Metadata:0.0% 0|0 0|32 274m 23k 35 23:29:55
|
*0 *0 *0 *0 0 1|0 0 36.3 KB 36.0 KB Metadata:0.0% 0|0 0|32 79b 17k 35 23:29:56
|
957720 *0 *0 *0 0 95|0 0 36.3 KB 36.0 KB Collection:0.0% 0|0 0|29 483m 26k 35 23:29:57
|
1107056 *0 *0 *0 0 112|0 0 36.3 KB 36.0 KB Metadata:0.0% 0|0 0|31 559m 28k 35 23:29:58
|
1038656 *0 *0 *0 0 103|0 0 36.3 KB 36.0 KB Metadata:0.0% 0|0 0|31 524m 27k 35 23:29:59
|
insert query update delete getmore command flushes vsize res locked db qr|qw ar|aw netIn netOut conn time
|
1092360 *0 *0 *0 0 115|0 0 36.3 KB 36.0 KB Database:0.0% 0|0 0|30 552m 28k 35 23:30:00
|
1088476 *0 *0 *0 0 108|0 0 36.3 KB 36.0 KB Collection:0.0% 0|0 0|30 549m 28k 35 23:30:01
|
1100572 *0 *0 *0 0 114|0 0 36.3 KB 36.0 KB Metadata:0.0% 0|0 0|31 556m 28k 35 23:30:02
|
1082972 *0 *0 *0 0 110|0 0 36.3 KB 36.0 KB Metadata:0.0% 0|0 0|30 547m 28k 35 23:30:03
|
675520 *0 *0 *0 0 66|0 0 36.3 KB 36.1 KB Metadata:0.0% 0|0 0|32 341m 24k 35 23:30:04
|
164604 *0 *0 *0 0 17|0 0 36.3 KB 36.1 KB Metadata:0.0% 0|0 0|32 83m 19k 35 23:30:05
|
800 *0 *0 *0 0 1|0 0 36.3 KB 36.1 KB Metadata:0.0% 0|0 0|32 404k 17k 35 23:30:06
|
974796 *0 *0 *0 0 99|0 0 36.3 KB 36.1 KB Database:0.0% 0|0 0|30 492m 27k 35 23:30:07
|
1091244 *0 *0 *0 0 110|0 0 36.3 KB 36.1 KB Collection:0.0% 0|0 0|30 551m 28k 35 23:30:08
|
perf top output:
- 18.28% mongod mongod [.] __wt_row_modify
|
- __wt_row_modify
|
- 99.75% __wt_btcur_insert
|
- __curfile_insert
|
+ 50.84% mongo::WiredTigerIndexUnique::_insert(__wt_cursor*, mongo::BSONObj const&, mongo::RecordId const&, bool)
|
+ 49.16% mongo::WiredTigerRecordStore::insertRecords(mongo::OperationContext*, std::vector<mongo::Record, std::alloc
|
- 4.83% mongod mongod [.] __wt_cache_eviction_check.part.7.constprop.16
|
- __wt_cache_eviction_check.part.7.constprop.16
|
+ 98.93% __curfile_insert
|
+ 1.07% __wt_btcur_insert
|
+ 3.60% mongod mongod [.] __wt_search_insert
|
+ 3.22% mongod mongod [.] __wt_row_search
|
+ 2.42% mongod mongod [.] __wt_hazard_set
|
+ 1.90% mongod mongod [.] __rec_txn_read
|
+ 1.56% mongod libc-2.19.so [.] __memcpy_sse2_unaligned
|
|
|
Author: Alex Gorrod <alexander.gorrod@mongodb.com> (agorrod)
Message: Merge pull request #2349 from wiredtiger/SERVER-21691
SERVER-21691 Avoid insert stalls
Branch: develop
https://github.com/wiredtiger/wiredtiger/commit/0e93d60d0d86795b6d14187080bdcd5e8502c140
|
|
Author: Michael Cahill <michael.cahill@mongodb.com> (michaelcahill)
Message: SERVER-21691 Don't retry eviction of a page if transaction state hasn't changed.
This check was moved out of __wt_page_can_evict, which meant it only applied to LRU eviction. Move it back so that we don't repeatedly try forced eviction that has no chance of succeeding.
Branch: develop
https://github.com/wiredtiger/wiredtiger/commit/cf62c714ced9cd518baf1f52c4295650b53e69d2
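A minimal sketch of the idea in that commit message, with hypothetical names (not the actual WiredTiger code): remember the transaction-state generation at which forced eviction last failed for a page, and skip further attempts until that state has actually changed, since retrying sooner cannot succeed.

#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-page bookkeeping, for illustration only. */
struct page_evict_state {
	uint64_t last_fail_txn_gen;	/* txn generation when eviction last failed */
};

/*
 * Only retry forced eviction if the global transaction state has moved
 * since the last failed attempt; otherwise the same running transactions
 * still pin the page's updates and eviction would fail again.
 */
static bool
should_retry_forced_eviction(const struct page_evict_state *ps,
    uint64_t current_txn_gen)
{
	return (current_txn_gen != ps->last_fail_txn_gen);
}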
|
|
Author: Michael Cahill <michael.cahill@mongodb.com> (michaelcahill)
Message: SERVER-21691 Avoid blocking in-memory splits.
Only trylock the page's reconciliation lock (so in-memory splits aren't blocked by slow splits of siblings), don't give up as soon as a checkpoint starts.
Branch: develop
https://github.com/wiredtiger/wiredtiger/commit/8cb4ecce11762679a07f632c1ff3d7a7732ef023
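The trylock pattern the commit describes can be sketched with a plain pthread mutex (WiredTiger uses its own lock primitives, so this is only an illustration): if the page's reconciliation lock is already held, for example by a slow split of a sibling, the in-memory split simply backs off instead of blocking behind it.

#include <pthread.h>
#include <stdbool.h>

/*
 * Attempt an in-memory split without blocking.  Sketch only: proceed
 * only if the page's reconciliation lock can be taken immediately,
 * otherwise give up and let the caller retry later.
 */
static bool
try_inmem_split(pthread_mutex_t *rec_lock)
{
	if (pthread_mutex_trylock(rec_lock) != 0)
		return (false);		/* lock busy: skip, don't block */

	/* ... perform the in-memory split here ... */

	(void)pthread_mutex_unlock(rec_lock);
	return (true);
}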
|
|
I can reproduce this: low throughput (often 0..64 ops/sec) lasts for about 30 seconds before bouncing back up.
- 16.99% mongod [.] __rec_txn_read
|
- __rec_txn_read
|
- 99.69% __rec_row_leaf_insert
|
__rec_row_leaf
|
- __wt_reconcile
|
- 74.17% __wt_cache_op
|
__checkpoint_worker.isra.9
|
__checkpoint_apply
|
__txn_checkpoint
|
__wt_txn_checkpoint
|
__session_checkpoint
|
__ckpt_server
|
start_thread
|
+ 25.83% __wt_evict
|
- 14.65% mongod [.] __wt_row_modify
|
- __wt_row_modify
|
+ 89.73% __wt_btcur_insert
|
+ 10.17% __split_multi_inmem
|
- 11.79% mongod [.] __wt_cksum_hw
|
- __wt_cksum_hw
|
+ 86.39% __rec_split_write
|
+ 12.83% __wt_block_write_off
|
+ 0.78% __wt_log_write
|
- 9.35% mongod [.] snappy::internal::CompressFragment(char const*, unsigned long, char*, unsigned short*, int)
|
- snappy::internal::CompressFragment(char const*, unsigned long, char*, unsigned short*, int)
|
+ 99.95% snappy::Compress(snappy::Source*, snappy::Sink*)
|
+ 3.51% mongod [.] __wt_cache_eviction_check.part.7.constprop.16
|
+ 3.10% mongod [.] __wt_search_insert
|
+ 2.89% mongod [.] tc_free
|
+ 2.86% mongod [.] __wt_row_search
|
+ 2.57% mongod [.] __wt_tree_walk
|
+ 1.95% mongod [.] __free_skip_list
|
+ 1.93% mongod [.] __wt_free_update_list
|
+ 1.90% mongod [.] __wt_hazard_set
|
+ 1.48% mongod [.] __rec_cell_build_leaf_key
|
+ 1.38% mongod [.] __wt_btcur_insert
|
+ 1.20% mongod [.] __rec_row_leaf_insert
|
+ 1.16% mongod [.] tcmalloc::CentralFreeList::FetchFromOneSpans(int, void**, void**)
|
+ 1.15% mongod [.] tc_calloc
|
+ 1.15% mongod [.] __wt_struct_pack
|
+ 1.01% mongod [.] __wt_page_in_func
|
+ 1.01% mongod [.] __wt_struct_size
|
+ 0.99% mongod [.] tcmalloc::CentralFreeList::ReleaseToSpans(void*)
|
+ 0.71% mongod [.] tcmalloc::ThreadCache::ReleaseToCentralCache(tcmalloc::ThreadCache::FreeList*, unsigned long, int)
|
+ 0.63% mongod [.] operator new(unsigned long)
|
+ 0.52% mongod [.] __wt_calloc
|
+ 0.50% mongod [.] operator delete(void*)
|
|
|
If we go back to -rc4, it drops out to zero much more frequently but always bounces back. With current master it falls down and stays down.
Ramon can't reproduce this on his laptop, but Geert can on his desktop (no parameters needed for mongod, just a default standalone).
|
|
Footnote: with --wiredTigerCacheSizeGB=1 this doesn't happen.
There's an occasional drop for a second or two from 1.5M/s to 1M/s as dirty% drops, but it's much, much better.
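For context, the cache size mongod is given via --wiredTigerCacheSizeGB is passed down as WiredTiger connection configuration. A minimal standalone example of opening a connection with a 1GB cache (the path and error handling are illustrative):

#include <stdio.h>
#include <wiredtiger.h>

int
main(void)
{
	WT_CONNECTION *conn;
	int ret;

	/* Open (or create) a database with a 1GB cache. */
	if ((ret = wiredtiger_open(
	    "/tmp/wt-test", NULL, "create,cache_size=1GB", &conn)) != 0) {
		fprintf(stderr, "wiredtiger_open: %s\n", wiredtiger_strerror(ret));
		return (1);
	}
	return (conn->close(conn, NULL));
}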
|
|
Thanks Keith. Martin also reported he tried the patch for WT-2249 and it didn't help.
|
|
I doubt it's connected to WT-2249; that happens when the cache is tiny and we can't find anything to evict (we do a lot of testing with tiny cache sizes to stress eviction).
|
|

- initial stalls A-B look a lot like SERVER-21652
- something rather different seems to be happening starting at C
- looks like we are trying very hard to evict stuff (see pages walked for eviction), yet according to the stats the cache is not completely full? (a sketch of how that can happen follows below)
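One way "pages walked for eviction" can be high while the cache is not completely full, sketched with hypothetical structures (not WiredTiger code): the eviction walk visits many candidate pages but can only reclaim the ones that are actually evictable at that moment, so pages pinned by running transactions or held by readers are walked over and over without freeing memory.

#include <stdbool.h>
#include <stddef.h>

/* Hypothetical page descriptor, for illustration only. */
struct page {
	struct page *next;
	bool pinned_by_txn;	/* updates still visible to a running txn */
	bool reader_holds_ref;	/* a reader currently references the page */
	size_t memory_footprint;
};

/* Walk candidates; count only what could actually be evicted right now. */
static size_t
evict_walk(const struct page *head, size_t *pages_walked)
{
	size_t reclaimable = 0;

	for (const struct page *p = head; p != NULL; p = p->next) {
		++*pages_walked;	/* counted even when nothing is freed */
		if (p->pinned_by_txn || p->reader_holds_ref)
			continue;	/* walked, but not evictable */
		reclaimable += p->memory_footprint;
	}
	return (reclaimable);
}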
|
|
keith.bostic I think about 15s for the first few minutes, starting every 100s or so (IIRC, checkpoint starts should be aligned on 60s intervals?), then it seems to all go haywire and become much less predictable. The time spacing in the mongostat data above is 1s intervals.
|
|
Worse, after a few minutes, insert performance drops persistently to about 10% of what it was.
Perf top:
20.50% mongod [.] __wt_row_modify
|
13.81% mongod [.] __rec_txn_read
|
4.55% mongod [.] __wt_tree_walk
|
3.55% mongod [.] __wt_row_search
|
3.44% libc-2.19.so [.] __memcpy_sse2_unaligned
|
54.42% mongod [.] __wt_page_in_func
|
9.32% mongod [.] __wt_row_modify
|
4.96% mongod [.] __rec_txn_read
|
4.50% mongod [.] __wt_tree_walk
|
1.86% mongod [.] __wt_cache_eviction_worker
|
Callgraph:
- 24.08% mongod mongod [.] __wt_row_modify
|
- __wt_row_modify
|
- 96.25% __split_multi_inmem
|
__wt_split_rewrite
|
- __wt_evict
|
- 99.45% __wt_page_in_func
|
__wt_row_search
|
- __wt_btcur_insert
|
+ 100.00% __curfile_insert
|
+ 0.55% __curfile_leave
|
+ 3.64% __wt_btcur_insert
|
- 14.14% mongod mongod [.] __rec_txn_read
|
- __rec_txn_read
|
- 99.61% __rec_row_leaf_insert
|
- __wt_reconcile
|
- 73.21% __wt_evict
|
- 99.34% __wt_page_in_func
|
__wt_row_search
|
- __wt_btcur_insert
|
+ 100.00% __curfile_insert
|
+ 0.66% __curfile_leave
|
+ 26.79% __wt_cache_op
|
- 12.24% mongod mongod [.] __wt_tree_walk
|
- __wt_tree_walk
|
- 99.87% __evict_server
|
start_thread
|
+ 3.57% mongod mongod [.] __wt_row_search
|
- 3.02% mongod libpthread-2.19.so [.] pthread_mutex_unlock
|
- pthread_mutex_unlock
|
- 95.78% __split_multi_inmem
|
__wt_split_rewrite
|
- __wt_evict
|
- 99.40% __wt_page_in_func
|
__wt_row_search
|
- __wt_btcur_insert
|
+ 100.00% __curfile_insert
|
+ 0.60% __curfile_leave
|
+ 3.59% __wt_btcur_insert
|
+ 2.78% mongod mongod [.] tcmalloc::CentralFreeList::RemoveR
|
- 2.78% mongod libpthread-2.19.so [.] pthread_mutex_lock
|
- pthread_mutex_lock
|
- 94.88% __split_multi_inmem
|
__wt_split_rewrite
|
- __wt_evict
|
- 99.36% __wt_page_in_func
|
__wt_row_search
|
- __wt_btcur_insert
|
+ 100.00% __curfile_insert
|
+ 0.64% __curfile_leave
|
+ 4.46% __wt_btcur_insert
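Reading the call graph above, __wt_evict is reached from __wt_page_in_func on the insert path, i.e. the application threads are doing eviction and split work themselves instead of inserting. A rough sketch of that pattern (names and thresholds are hypothetical, not WiredTiger's implementation): a thread that finds the cache over its trigger must help evict before it can bring another page in, so eviction cost lands directly on insert latency.

#include <stdbool.h>
#include <stdint.h>

/* Hypothetical cache snapshot, for illustration only. */
struct cache_state {
	uint64_t bytes_inuse;
	uint64_t bytes_max;
	unsigned trigger_pct;	/* e.g. 95: app threads must help evict */
};

static bool
cache_over_trigger(const struct cache_state *c)
{
	/* Compare usage against trigger_pct of the maximum cache size. */
	return (c->bytes_inuse * 100 > c->bytes_max * c->trigger_pct);
}

/*
 * Sketch of an application thread helping eviction before it is allowed
 * to page more data in, matching __wt_evict appearing under
 * __wt_page_in_func in the profile above.
 */
static void
page_in_with_eviction_help(struct cache_state *c,
    void (*evict_one_page)(struct cache_state *))
{
	while (cache_over_trigger(c))
		evict_one_page(c);	/* do eviction work on this thread */

	/* ... now read the needed page into the cache and continue ... */
}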
|
|
|
Yes, likely. How long are the stalls lasting, and they're not aligned with checkpoints?
|
insert query update delete getmore command % dirty % used flushes vsize res qr|qw ar|aw netIn netOut conn time
localhost:27017 1418372 *0 *0 *0 0 139|0 71.9 95.7 0 18.1G 17.9G 0|0 0|31 717m 32k 34 09:28:55
|
localhost:27017 1002588 *0 *0 *0 0 105|0 69.2 93.4 0 18.1G 17.9G 0|0 0|32 507m 28k 34 09:28:56
|
localhost:27017 1399548 *0 *0 *0 0 134|0 70.2 93.3 0 18.1G 17.9G 0|3 0|28 707m 31k 34 09:28:57
|
localhost:27017 1470036 *0 *0 *0 0 154|0 70.4 94.4 0 18.1G 17.9G 0|3 0|27 743m 33k 34 09:28:58
|
localhost:27017 1452236 *0 *0 *0 0 151|0 70.7 94.9 0 18.1G 17.9G 0|3 0|30 734m 33k 34 09:28:59
|
localhost:27017 1519036 *0 *0 *0 0 142|0 71.3 95.9 0 18.1G 17.9G 0|2 0|29 768m 32k 34 09:29:00
|
insert query update delete getmore command % dirty % used flushes vsize res qr|qw ar|aw netIn netOut conn time
|
localhost:27017 1408736 *0 *0 *0 0 145|0 70.9 95.3 0 18.1G 17.9G 0|0 0|32 712m 32k 34 09:29:01
|
localhost:27017 1507972 *0 *0 *0 0 148|0 70.9 95.5 0 18.1G 17.9G 0|3 0|28 762m 33k 34 09:29:02
|
localhost:27017 1484356 *0 *0 *0 0 155|0 72.2 95.4 0 18.1G 17.9G 0|0 0|30 750m 33k 34 09:29:03
|
localhost:27017 1455028 *0 *0 *0 0 142|0 72.6 95.9 0 18.1G 17.9G 0|0 0|32 735m 32k 34 09:29:04
|
localhost:27017 1439364 *0 *0 *0 0 154|0 72.9 95.5 0 18.1G 17.9G 0|5 0|25 727m 33k 34 09:29:05
|
localhost:27017 1499044 *0 *0 *0 0 147|0 72.9 95.8 0 18.1G 17.9G 0|2 0|28 758m 33k 34 09:29:06
|
localhost:27017 1408292 *0 *0 *0 0 148|0 72.5 95.6 0 18.1G 17.9G 0|3 0|26 712m 33k 34 09:29:07
|
localhost:27017 1447300 *0 *0 *0 0 143|0 71.8 95.8 0 18.1G 17.9G 0|2 0|27 731m 32k 34 09:29:08
|
localhost:27017 1476508 *0 *0 *0 0 146|0 72.6 95.6 0 18.1G 17.9G 0|0 0|30 746m 32k 34 09:29:09
|
localhost:27017 269780 *0 *0 *0 0 32|0 67.5 89.6 0 18.1G 17.9G 0|0 0|32 136m 21k 34 09:29:10
|
insert query update delete getmore command % dirty % used flushes vsize res qr|qw ar|aw netIn netOut conn time
|
localhost:27017 *0 *0 *0 *0 0 1|0 60.4 82.9 0 18.1G 17.9G 0|0 0|32 79b 19k 34 09:29:11
|
localhost:27017 2208 *0 *0 *0 0 2|0 53.6 76.9 0 18.1G 17.9G 0|0 0|32 1m 19k 34 09:29:12
|
localhost:27017 398484 *0 *0 *0 0 31|0 51.4 76.8 0 18.1G 17.9G 0|0 0|32 201m 21k 34 09:29:13
|
localhost:27017 150772 *0 *0 *0 0 16|0 49.2 77.4 0 18.1G 17.9G 0|0 0|32 76m 20k 34 09:29:14
|
localhost:27017 300 *0 *0 *0 0 1|0 45.6 77.4 0 18.1G 17.9G 0|0 0|32 152k 19k 34 09:29:15
|
localhost:27017 277428 *0 *0 *0 0 29|0 43.7 78.2 0 18.1G 17.9G 0|0 0|32 140m 21k 34 09:29:16
|
localhost:27017 233608 *0 *0 *0 0 25|0 41.7 79.0 0 18.1G 17.9G 0|0 0|32 118m 21k 34 09:29:17
|
localhost:27017 800 *0 *0 *0 0 2|0 38.2 79.0 0 18.1G 17.9G 0|0 0|32 404k 19k 34 09:29:18
|
localhost:27017 273300 *0 *0 *0 0 29|0 36.4 79.7 0 18.1G 17.9G 0|1 0|30 138m 21k 34 09:29:19
|
localhost:27017 254772 *0 *0 *0 0 26|0 32.2 75.2 0 18.1G 17.9G 0|0 0|32 129m 21k 34 09:29:20
|
insert query update delete getmore command % dirty % used flushes vsize res qr|qw ar|aw netIn netOut conn time
|
localhost:27017 *0 *0 *0 *0 0 1|0 27.5 71.4 0 18.1G 17.9G 0|0 0|32 79b 19k 34 09:29:21
|
localhost:27017 62920 *0 *0 *0 0 7|0 24.7 71.5 0 18.1G 17.9G 0|0 0|32 32m 19k 34 09:29:22
|
localhost:27017 351744 *0 *0 *0 0 37|0 23.8 72.7 0 18.1G 17.9G 0|0 0|32 178m 22k 34 09:29:23
|
localhost:27017 *0 *0 *0 *0 0 1|0 20.8 72.6 0 18.1G 17.9G 0|0 0|32 79b 19k 34 09:29:24
|
localhost:27017 88000 *0 *0 *0 0 11|0 19.3 72.6 0 18.1G 17.9G 0|0 0|32 44m 20k 34 09:29:25
|
localhost:27017 64 *0 *0 *0 0 1|0 16.3 72.6 0 18.1G 17.9G 0|0 0|32 79b 19k 34 09:29:26
|
localhost:27017 664 *0 *0 *0 0 1|0 13.5 72.6 0 18.1G 17.9G 0|0 0|32 303k 19k 34 09:29:27
|
localhost:27017 464296 *0 *0 *0 0 49|0 13.3 74.0 0 18.1G 17.9G 0|0 0|32 234m 23k 34 09:29:28
|
localhost:27017 *0 *0 *0 *0 0 1|0 10.5 74.0 0 18.1G 17.9G 0|0 0|32 79b 19k 34 09:29:29
|
localhost:27017 645932 *0 *0 *0 0 65|0 9.1 73.1 0 18.1G 17.9G 0|1 0|30 327m 25k 34 09:29:30
|
insert query update delete getmore command % dirty % used flushes vsize res qr|qw ar|aw netIn netOut conn time
|
localhost:27017 1444176 *0 *0 *0 0 156|0 11.1 75.0 0 18.1G 17.9G 0|1 0|29 729m 33k 34 09:29:31
|
localhost:27017 1454072 *0 *0 *0 0 147|0 16.0 80.0 0 18.1G 17.9G 0|1 0|28 735m 32k 34 09:29:32
|
localhost:27017 1526200 *0 *0 *0 0 151|0 21.1 82.1 0 18.1G 17.9G 0|2 0|29 771m 33k 34 09:29:33
|
localhost:27017 1498280 *0 *0 *0 0 146|0 26.1 79.1 0 18.1G 17.9G 0|1 0|26 757m 32k 34 09:29:34
|
localhost:27017 1486728 *0 *0 *0 0 158|0 31.2 79.8 0 18.1G 17.9G 0|0 0|29 751m 33k 34 09:29:35
|
localhost:27017 1585264 *0 *0 *0 0 153|0 36.6 77.6 0 18.1G 17.9G 0|0 0|30 801m 33k 34 09:29:36
|
localhost:27017 1590892 *0 *0 *0 0 161|0 41.9 80.0 0 18.1G 17.9G 0|1 0|31 804m 34k 34 09:29:37
|
localhost:27017 1551688 *0 *0 *0 0 156|0 47.0 78.7 0 18.1G 17.9G 0|2 0|30 784m 33k 34 09:29:38
|
localhost:27017 1642536 *0 *0 *0 0 162|0 52.6 80.0 0 18.1G 17.9G 0|0 0|31 830m 34k 34 09:29:39
|
localhost:27017 1566220 *0 *0 *0 0 164|0 57.8 79.1 0 18.1G 17.9G 0|0 0|30 791m 34k 34 09:29:40
|
|
insert query update delete getmore command % dirty % used flushes vsize res qr|qw ar|aw netIn netOut conn time
|
localhost:27017 1484436 *0 *0 *0 0 148|0 71.7 94.6 0 18.3G 18.1G 0|2 0|30 750m 33k 34 09:30:42
|
localhost:27017 1332080 *0 *0 *0 0 133|0 71.7 94.7 0 18.3G 18.1G 0|0 0|30 673m 31k 34 09:30:43
|
localhost:27017 1315220 *0 *0 *0 0 137|0 72.2 95.3 0 18.3G 18.1G 0|1 0|28 664m 31k 34 09:30:44
|
localhost:27017 1433156 *0 *0 *0 0 147|0 72.2 95.6 0 18.3G 18.1G 0|0 0|30 724m 32k 34 09:30:45
|
localhost:27017 1361200 *0 *0 *0 0 131|0 72.1 95.3 0 18.3G 18.1G 0|3 0|29 688m 31k 34 09:30:46
|
localhost:27017 1265816 *0 *0 *0 0 127|0 72.2 94.5 0 18.3G 18.1G 0|0 0|30 640m 31k 34 09:30:47
|
localhost:27017 1443544 *0 *0 *0 0 153|0 73.4 95.1 0 18.3G 18.1G 0|0 0|32 730m 33k 34 09:30:48
|
localhost:27017 1356840 *0 *0 *0 0 135|0 72.8 95.8 0 18.3G 18.1G 0|5 0|23 685m 31k 34 09:30:49
|
localhost:27017 1207564 *0 *0 *0 0 119|0 71.9 94.4 0 18.3G 18.1G 0|1 0|29 610m 30k 34 09:30:50
|
localhost:27017 1148444 *0 *0 *0 0 120|0 70.9 93.2 0 18.3G 18.1G 0|0 0|29 580m 30k 34 09:30:51
|
insert query update delete getmore command % dirty % used flushes vsize res qr|qw ar|aw netIn netOut conn time
|
localhost:27017 1506208 *0 *0 *0 0 147|0 71.1 95.2 0 18.3G 18.1G 0|0 0|30 761m 32k 34 09:30:52
|
localhost:27017 1408084 *0 *0 *0 0 146|0 72.6 96.0 0 18.3G 18.1G 0|0 0|31 711m 32k 34 09:30:53
|
localhost:27017 1409372 *0 *0 *0 0 139|0 72.9 95.4 0 18.3G 18.1G 0|0 0|30 712m 32k 34 09:30:54
|
localhost:27017 1408872 *0 *0 *0 0 142|0 72.9 95.6 0 18.3G 18.1G 0|0 0|31 712m 32k 34 09:30:55
|
localhost:27017 1271672 *0 *0 *0 0 135|0 72.6 95.4 0 18.3G 18.1G 0|4 0|28 643m 31k 34 09:30:56
|
localhost:27017 1382820 *0 *0 *0 0 139|0 73.5 95.4 0 18.3G 18.1G 0|0 0|29 699m 32k 34 09:30:57
|
localhost:27017 1212352 *0 *0 *0 0 118|0 71.8 95.3 0 18.3G 18.1G 0|0 0|32 613m 30k 34 09:30:58
|
localhost:27017 1426420 *0 *0 *0 0 143|0 71.5 95.8 0 18.3G 18.1G 0|0 0|31 721m 32k 34 09:30:59
|
localhost:27017 1380580 *0 *0 *0 0 138|0 73.1 95.3 0 18.3G 18.1G 0|0 0|32 698m 32k 34 09:31:00
|
localhost:27017 129508 *0 *0 *0 0 10|0 68.4 90.7 0 18.3G 18.1G 0|0 0|32 66m 19k 34 09:31:01
|
insert query update delete getmore command % dirty % used flushes vsize res qr|qw ar|aw netIn netOut conn time
|
localhost:27017 *0 *0 *0 *0 0 1|0 61.6 83.7 0 18.3G 18.1G 0|0 0|32 79b 19k 34 09:31:02
|
localhost:27017 *0 *0 *0 *0 0 2|0 55.1 78.6 0 18.3G 18.1G 0|0 0|32 133b 19k 34 09:31:03
|
localhost:27017 64 *0 *0 *0 0 1|0 51.5 78.3 0 18.3G 18.1G 0|0 0|32 79b 19k 34 09:31:04
|
localhost:27017 438336 *0 *0 *0 0 53|0 51.4 79.7 0 18.3G 18.1G 0|0 0|32 221m 24k 34 09:31:05
|
localhost:27017 *0 *0 *0 *0 0 1|0 48.4 79.7 0 18.3G 18.1G 0|0 0|32 79b 19k 34 09:31:06
|
localhost:27017 18400 *0 *0 *0 0 1|0 45.2 79.8 0 18.3G 18.1G 0|0 0|32 10m 19k 34 09:31:07
|
localhost:27017 489368 *0 *0 *0 0 50|0 44.1 76.8 0 18.3G 18.1G 0|0 0|32 247m 23k 34 09:31:08
|
localhost:27017 *0 *0 *0 *0 0 1|0 38.1 69.4 0 18.3G 18.1G 0|0 0|32 79b 19k 34 09:31:09
|
localhost:27017 8096 *0 *0 *0 0 2|0 35.0 69.4 0 18.3G 18.1G 0|0 0|32 4m 19k 34 09:31:10
|
localhost:27017 511464 *0 *0 *0 0 48|0 34.5 71.0 0 18.3G 18.1G 0|0 0|32 258m 23k 34 09:31:11
|
insert query update delete getmore command % dirty % used flushes vsize res qr|qw ar|aw netIn netOut conn time
|
localhost:27017 124864 *0 *0 *0 0 23|0 32.5 71.4 0 18.3G 18.1G 0|0 0|32 63m 21k 34 09:31:12
|
localhost:27017 736 *0 *0 *0 0 2|0 29.5 71.4 0 18.3G 18.1G 0|0 0|32 404k 19k 34 09:31:13
|
localhost:27017 427072 *0 *0 *0 0 40|0 27.9 72.0 0 18.3G 18.1G 0|0 0|32 216m 22k 34 09:31:14
|
localhost:27017 16032 *0 *0 *0 0 3|0 25.7 72.0 0 18.3G 18.1G 0|0 0|32 8m 19k 34 09:31:15
|
localhost:27017 64 *0 *0 *0 0 1|0 22.6 72.0 0 18.3G 18.1G 0|0 0|32 79b 19k 34 09:31:16
|
localhost:27017 245052 *0 *0 *0 0 24|0 20.7 72.5 0 18.3G 18.1G 0|1 0|29 124m 21k 34 09:31:17
|
localhost:27017 389516 *0 *0 *0 0 40|0 19.3 73.1 0 18.3G 18.1G 0|0 0|32 197m 22k 34 09:31:18
|
localhost:27017 36 *0 *0 *0 0 1|0 16.1 73.2 0 18.3G 18.1G 0|0 0|32 51k 19k 34 09:31:19
|
localhost:27017 296176 *0 *0 *0 0 30|0 13.7 73.3 0 18.3G 18.1G 0|3 0|27 149m 21k 34 09:31:20
|
localhost:27017 84224 *0 *0 *0 0 17|0 11.9 73.5 0 18.3G 18.1G 0|0 0|32 43m 20k 34 09:31:21
|
insert query update delete getmore command % dirty % used flushes vsize res qr|qw ar|aw netIn netOut conn time
|
localhost:27017 *0 *0 *0 *0 0 1|0 9.2 73.5 0 18.3G 18.1G 0|0 0|32 79b 19k 34 09:31:22
|
localhost:27017 169140 *0 *0 *0 0 17|0 7.0 72.0 0 18.3G 18.1G 0|1 0|30 85m 20k 34 09:31:23
|
localhost:27017 1520928 *0 *0 *0 0 150|0 11.1 76.1 0 18.3G 18.1G 0|0 0|32 769m 33k 34 09:31:24
|
localhost:27017 1473232 *0 *0 *0 0 148|0 16.0 80.6 0 18.3G 18.1G 0|0 0|32 745m 33k 34 09:31:25
|
localhost:27017 1305796 *0 *0 *0 0 136|0 20.5 83.8 0 18.3G 18.1G 0|0 0|32 659m 31k 34 09:31:26
|
localhost:27017 1341060 *0 *0 *0 0 128|0 24.5 86.9 0 18.3G 18.1G 0|1 0|31 678m 31k 34 09:31:27
|
localhost:27017 1589312 *0 *0 *0 0 161|0 29.8 89.6 0 18.3G 18.1G 0|0 0|32 803m 34k 34 09:31:28
|
localhost:27017 1575536 *0 *0 *0 0 163|0 35.2 88.3 0 18.3G 18.1G 0|0 0|29 796m 34k 34 09:31:29
|
localhost:27017 1526688 *0 *0 *0 0 150|0 40.3 86.7 0 18.3G 18.1G 0|0 0|32 772m 33k 34 09:31:30
|
localhost:27017 1491204 *0 *0 *0 0 153|0 45.5 86.1 0 18.3G 18.1G 0|0 0|28 753m 33k 34 09:31:31
|
insert query update delete getmore command % dirty % used flushes vsize res qr|qw ar|aw netIn netOut conn time
|
localhost:27017 1523180 *0 *0 *0 0 155|0 50.5 83.1 0 18.3G 18.1G 0|2 0|26 770m 33k 34 09:31:32
|
localhost:27017 1455484 *0 *0 *0 0 148|0 55.1 83.7 0 18.3G 18.1G 0|0 0|31 735m 32k 34 09:31:33
|
localhost:27017 1577108 *0 *0 *0 0 157|0 60.4 83.6 0 18.3G 18.1G 0|1 0|30 797m 33k 34 09:31:34
|
localhost:27017 1599044 *0 *0 *0 0 166|0 66.0 82.5 0 18.3G 18.1G 0|4 0|25 808m 34k 34 09:31:35
|
localhost:27017 1626592 *0 *0 *0 0 160|0 71.6 83.4 0 18.3G 18.1G 0|0 0|28 822m 34k 34 09:31:36
|
localhost:27017 1557572 *0 *0 *0 0 152|0 77.0 85.1 0 18.3G 18.1G 0|1 0|28 787m 33k 34 09:31:37
|
localhost:27017 1516272 *0 *0 *0 0 156|0 81.7 83.9 0 18.3G 18.1G 0|3 0|28 766m 33k 34 09:31:38
|
localhost:27017 1475556 *0 *0 *0 0 145|0 83.2 85.4 0 18.3G 18.1G 0|0 0|30 745m 32k 34 09:31:39
|
localhost:27017 1373336 *0 *0 *0 0 142|0 83.7 85.9 0 18.3G 18.1G 0|1 0|30 694m 32k 34 09:31:40
|
localhost:27017 1429136 *0 *0 *0 0 149|0 84.3 86.4 0 18.3G 18.1G 0|2 0|29 722m 33k 34 09:31:41