|
during long insert session, saw a few long pause for around 100sec each
here is capture of one from sysbench output (look at the "int ips", which is the last column)
Thread[Thread-2,5,main]430,914,000 inserts : 4,500 seconds : cum ips=95,757.09 : int ips=799.20
|
Thread[Thread-2,5,main]430,918,000 inserts : 4,505 seconds : cum ips=95,651.60 : int ips=799.20
|
Thread[Thread-2,5,main]430,925,000 inserts : 4,510 seconds : cum ips=95,547.00 : int ips=1,398.60
|
Thread[Thread-2,5,main]430,927,000 inserts : 4,515 seconds : cum ips=95,441.51 : int ips=399.52
|
Thread[Thread-2,5,main]430,928,000 inserts : 4,520 seconds : cum ips=95,336.05 : int ips=199.80
|
Thread[Thread-2,5,main]430,928,000 inserts : 4,525 seconds : cum ips=95,230.60 : int ips=0.00
|
Thread[Thread-2,5,main]430,928,000 inserts : 4,530 seconds : cum ips=95,127.49 : int ips=0.00
|
Thread[Thread-2,5,main]430,929,000 inserts : 4,535 seconds : cum ips=95,022.70 : int ips=199.76
|
Thread[Thread-2,5,main]430,931,000 inserts : 4,540 seconds : cum ips=94,918.37 : int ips=399.52
|
Thread[Thread-2,5,main]430,931,000 inserts : 4,545 seconds : cum ips=94,813.82 : int ips=0.00
|
Thread[Thread-2,5,main]430,931,000 inserts : 4,550 seconds : cum ips=94,709.53 : int ips=0.00
|
Thread[Thread-2,5,main]430,931,000 inserts : 4,555 seconds : cum ips=94,605.44 : int ips=0.00
|
Thread[Thread-2,5,main]430,934,000 inserts : 4,560 seconds : cum ips=94,502.24 : int ips=599.28
|
Thread[Thread-2,5,main]430,934,000 inserts : 4,565 seconds : cum ips=94,398.63 : int ips=0.00
|
Thread[Thread-2,5,main]430,934,000 inserts : 4,570 seconds : cum ips=94,295.23 : int ips=0.00
|
Thread[Thread-2,5,main]430,934,000 inserts : 4,575 seconds : cum ips=94,192.05 : int ips=0.00
|
Thread[Thread-2,5,main]430,934,000 inserts : 4,580 seconds : cum ips=94,089.10 : int ips=0.00
|
Thread[Thread-2,5,main]430,934,000 inserts : 4,585 seconds : cum ips=93,986.37 : int ips=0.00
|
Thread[Thread-2,5,main]430,934,000 inserts : 4,590 seconds : cum ips=93,883.89 : int ips=0.00
|
Thread[Thread-2,5,main]430,934,000 inserts : 4,595 seconds : cum ips=93,781.61 : int ips=0.00
|
Thread[Thread-2,5,main]430,934,000 inserts : 4,600 seconds : cum ips=93,679.57 : int ips=0.00
|
Thread[Thread-2,5,main]430,934,000 inserts : 4,605 seconds : cum ips=93,577.76 : int ips=0.00
|
Thread[Thread-2,5,main]430,934,000 inserts : 4,610 seconds : cum ips=93,476.16 : int ips=0.00
|
Thread[Thread-2,5,main]430,934,000 inserts : 4,615 seconds : cum ips=93,376.79 : int ips=0.00
|
Thread[Thread-2,5,main]431,061,000 inserts : 4,620 seconds : cum ips=93,302.52 : int ips=25,223.44
|
Thread[Thread-2,5,main]431,061,000 inserts : 4,625 seconds : cum ips=93,201.55 : int ips=0.00
|
Thread[Thread-2,5,main]431,186,000 inserts : 4,630 seconds : cum ips=93,127.80 : int ips=24,975.02
|
Thread[Thread-2,5,main]431,536,000 inserts : 4,635 seconds : cum ips=93,102.75 : int ips=69,930.07
|
Thread[Thread-2,5,main]432,127,000 inserts : 4,640 seconds : cum ips=93,129.11 : int ips=117,401.67
|
Thread[Thread-2,5,main]432,648,000 inserts : 4,645 seconds : cum ips=93,140.91 : int ips=104,075.11
|
Thread[Thread-2,5,main]433,422,000 inserts : 4,650 seconds : cum ips=93,208.77 : int ips=157,253.15
|
Thread[Thread-2,5,main]433,977,000 inserts : 4,655 seconds : cum ips=93,227.76 : int ips=110,866.96
|
observation
- check the CPU flame graph, majority time spend in __ckpt_server & __evict_server, hot functions are (from perf.data.pause_capture1)
+ 40.56% 12239 mongod mongod [.] __wt_tree_walk
|
+ 13.59% 4097 mongod mongod [.] __evict_server
|
+ 6.09% 1859 mongod mongod [.] __wt_hazard_clear
|
+ 3.36% 1952 mongod mongod [.] __wt_row_search
|
+ 2.70% 817 mongod mongod [.] __rec_row_leaf
|
+ 2.62% 803 mongod mongod [.] __rec_row_leaf_insert
|
+ 2.06% 659 mongod mongod [.] __wt_hazard_set
|
+ 1.69% 564 mongod mongod [.] snappy::internal::CompressFragment(char const*, unsigned long, char*, unsigned short*, int)
|
+ 0.92% 352 mongod mongod [.] __wt_page_in_func
|
+ 0.88% 367 mongod mongod [.] __wt_row_modify
|
+ 0.84% 508 mongod libc-2.12.so [.] msort_with_tmp
|
+ 0.63% 297 mongod libc-2.12.so [.] memcpy
|
+ 0.61% 279 mongod mongod [.] tcmalloc::CentralFreeList::FetchFromOneSpans(int, void**, void**)
|
+ 0.60% 189 mongod mongod [.] __wt_page_out
|
+ 0.59% 344 mongod [kernel.kallsyms] [k] update_curr
|
+ 0.51% 665 mongod [kernel.kallsyms] [k] native_write_msr_safe
|
+ 0.50% 154 mongod mongod [.] __rec_cell_build_leaf_key
|
- mongo::PortMessageServer::handleIncomingMsg was pushed to only 0.9%, on normal run, it could take more than 75% of the time.
attachment and details.
attachments:
flame graph
- perf.data.pause_capture1.svg : 10sec perf count during pause
- perf.data.pause_capture2.svg : 20sec perf count during pause
perf.data
- perf.data.pause_capture1.gz
- perf.data.pause_capture2.gz
- perf.report.pause_capture2.txt : report in plain text with expanded stack trace.
|