Core Server / SERVER-53899

TaskExecutor CPU fills up instantly

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Major - P3
    • None
    • Affects Version/s: None
    • Component/s: Sharding
    • Labels: None
    • ALL

      6 mongos, each with taskExecutorPoolSize=4 and 8 cores

      3 shards, each PSSSSH (1 primary, 4 secondaries, 1 hidden)

      YCSB is used to load-test the cluster:

      ./bin/ycsb run mongodb -P workloads/custom -s -threads 48 -p mongodb.url=xxx

       

      YCSB workload:

      recordcount=100000
      operationcount=100000000
      workload=site.ycsb.workloads.CoreWorkload
      readallfields=true
      readproportion=0.7
      updateproportion=0.1
      insertproportion=0.2
      scanproportion=0
      maxscanlength=1000
      readmodifywriteproportion=0
      insertorder=order
      writeallfields=false
      mongodb.readPreference=secondary
      mongodb.maxconnections=200
      requestdistribution=uniform
      fieldlength=30
      fieldcount=10
      hotspotdatafraction=0.2
      hotspotopnfraction=0.8
      maxexecutiontime=600
      table=pressure


      We have a sharded cluster with 6 mongos and 3 shards.

      Each mongos is limited to 8 cores by cgroup, and taskExecutorPoolSize is 4.

      Each shard has 1 primary, 4 secondaries, and 1 hidden member.

       

      We use YCSB to load-test it with 48 threads.

      Mostly everything goes fine, but once or twice during a 10-minute test there is a steep drop in both CPU and opcounters on the mongos.

      When everything is fine, the TaskExecutor thread CPU looks like:
      S 35.0 0.0 8:35.77 TaskExe.rPool-1
      S 35.0 0.0 8:52.79 TaskExe.rPool-2
      S 35.0 0.0 8:32.40 TaskExe.rPool-3
      S 30.0 0.0 9:18.88 TaskExe.rPool-0

       

      When the steep drop happens, one TaskExecutor thread's CPU fills up:

      R 99.9 0.0 8:54.62 TaskExe.rPool-2
      S 10.0 0.0 9:19.35 TaskExe.rPool-0
      S 10.0 0.0 8:36.19 TaskExe.rPool-1
      R 5.0 0.0 8:32.80 TaskExe.rPool-3
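Spotting the saturated thread in a listing like the one above can be automated. The sketch below assumes the columns are state, %CPU, %MEM, TIME+ and thread name, as in a trimmed `top -H` view. That column layout, the 90% threshold, and the helper names are assumptions, not something the report states.

```python
import re

# Thread listing from the report, captured during the steep drop.
SAMPLE = """\
R 99.9 0.0 8:54.62 TaskExe.rPool-2
S 10.0 0.0 9:19.35 TaskExe.rPool-0
S 10.0 0.0 8:36.19 TaskExe.rPool-1
R 5.0 0.0 8:32.80 TaskExe.rPool-3
"""

# Assumed column order: state, %CPU, %MEM, TIME+, thread name.
LINE_RE = re.compile(
    r"^\s*(?P<state>[A-Z])\s+(?P<cpu>\d+(?:\.\d+)?)\s+"
    r"(?P<mem>\d+(?:\.\d+)?)\s+(?P<time>\S+)\s+(?P<name>\S+)\s*$"
)


def parse_threads(text):
    """Return one dict per line that matches the assumed layout."""
    threads = []
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            row = match.groupdict()
            row["cpu"] = float(row["cpu"])
            threads.append(row)
    return threads


def saturated(threads, threshold=90.0):
    """Names of threads at or above the (arbitrary) CPU threshold."""
    return [t["name"] for t in threads if t["cpu"] >= threshold]


print(saturated(parse_threads(SAMPLE)))
```

On the sample above, only TaskExe.rPool-2 crosses the threshold, which matches the single-thread saturation described in the report.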

       

      pstack result for the TaskExecutor thread with full CPU:

      Thread 85 (Thread 0x7f01c6203700 (LWP 129527)):
      #0 0x00007f01cedb86d0 in sha256_block_data_order_avx2 ()
      #1 0x00007f01cedb9935 in SHA256_Update ()
      #2 0x00007f01ced58817 in HMAC_Init_ex ()
      #3 0x00007f01cec04e0f in mongo::SHA256BlockTraits::computeHmac( xxx )
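One possible reading of this stack is that the thread is doing many HMAC-SHA-256 computations in a row, as happens when a SCRAM-SHA-256 salted password is derived during authentication of a new connection. This is an assumption and is not confirmed anywhere in the report. The sketch below only illustrates why such a derivation is CPU-heavy: it is a chain of HMAC calls, one per iteration. The password, salt, and iteration count of 15000 are illustrative placeholders.

```python
import hashlib
import hmac


def salted_password(password, salt, iterations):
    """PBKDF2-HMAC-SHA-256 for a single 32-byte block, written as an
    explicit loop so each HMAC call is visible."""
    # First block: HMAC(password, salt || INT(1)).
    u = hmac.new(password, salt + b"\x00\x00\x00\x01", hashlib.sha256).digest()
    acc = int.from_bytes(u, "big")
    # Each further iteration is one more HMAC over the previous output,
    # XORed into the accumulator.
    for _ in range(iterations - 1):
        u = hmac.new(password, u, hashlib.sha256).digest()
        acc ^= int.from_bytes(u, "big")
    return acc.to_bytes(32, "big")


# Illustrative inputs only; none of these values come from the report.
PASSWORD = b"example-password"
SALT = b"example-salt"
ITERATIONS = 15000  # assumed iteration count

derived = salted_password(PASSWORD, SALT, ITERATIONS)
# Cross-check against the standard library implementation.
reference = hashlib.pbkdf2_hmac("sha256", PASSWORD, SALT, ITERATIONS)
print("matches hashlib.pbkdf2_hmac:", derived == reference)
```

If this reading is right, every derivation costs thousands of HMAC calls on the thread that runs it, which would fit a single TaskExecutor thread pinned near 100% CPU.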

       

      perf top result while the CPU was full: about 7.32% in OPENSSL_cleanse.

       

      Slow-query log entries also appear on the mongos, while the secondaries' logs contain none. The reason is that the connection pool of the TaskExecutor whose CPU is full has many requests waiting to be sent. The connection pool stats log is:

      Updating controller for host:port with State: { requests: 19, ready: 0, pending: 2, active: 1, isExpired: false }

      The requests count continues to grow, and pending is always 2. I think this is just a result of the TaskExecutor CPU being full.
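To track how the queue grows over time, the State fields can be pulled out of each such log line. This is a minimal sketch based only on the line quoted above. The `backlog` heuristic (requests minus ready and pending connections) is a made-up indicator for illustration, not a metric defined by MongoDB.

```python
import re

# Log line quoted in the report (host:port left as-is).
LOG = (
    "Updating controller for host:port with State: "
    "{ requests: 19, ready: 0, pending: 2, active: 1, isExpired: false }"
)

STATE_RE = re.compile(r"State: \{(?P<body>[^}]*)\}")


def parse_state(line):
    """Extract the State fields as a dict, or None if the line has none."""
    match = STATE_RE.search(line)
    if match is None:
        return None
    state = {}
    for item in match.group("body").split(","):
        key, _, value = item.partition(":")
        key, value = key.strip(), value.strip()
        if value in ("true", "false"):
            state[key] = value == "true"
        else:
            state[key] = int(value)
    return state


def backlog(state):
    """Hypothetical heuristic: requests with no ready or pending
    connection to serve them."""
    return max(0, state["requests"] - state["ready"] - state["pending"])


state = parse_state(LOG)
print(state, "backlog:", backlog(state))
```

Running this over successive log lines would show whether requests keeps climbing while pending stays at 2, as described.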

       

      There are 2 things worth mentioning:

      • When we use PSH for every shard, everything goes fine: the mongos CPU fills up, but there is no steep drop.
      • With taskExecutorPoolSize=8, YCSB with 48 threads mostly goes well (a steep drop still happens sometimes), but 96 threads still has problems.

       

      How can I solve this problem?

       

        1. image-2021-01-20-17-44-34-407.png (135 kB)
        2. mongos.diagnostic.data.tar.gz (4.43 MB)
        3. conns.png (335 kB)
        4. image-2021-01-21-00-16-26-376.png (122 kB)
        5. image-2021-01-22-12-50-18-538.png (117 kB)

            Assignee: Bruce Lucas (Inactive), bruce.lucas@mongodb.com
            Reporter: Xin Wang, wangxin201492@gmail.com
            Votes: 1
            Watchers: 13
