  1. Core Server
  2. SERVER-27751

Mongod does not scale beyond 30 threads

    • Service Arch
    • ALL

      We have been testing our new 22-core Power machine.
      I would like to seek your advice on some problems I encountered.

      It is a two-socket system with 11 cores in each socket.
      We run mongod on one socket (11 cores, 88 threads with SMT8) and YCSB on the other socket (11 YCSB instances drive the workload, which is 50% read and 50% update).
      Our highest throughput is 328179.69 ops/sec while keeping the update latency under 500us. However, about 55% of the CPU on the mongod socket is not utilized.
      Looking at the ftrace output, performance appears to be gated by locks, since we saw the mongod threads sleeping during futex operations.
      So I tried using multiple collections, and then multiple databases, one for each YCSB instance, instead of having all 11 YCSB instances use one collection. There was no improvement in performance; rather, it decreased a bit:
      Multiple collections: 290699.56 ops/sec
      Multiple databases: 305280.25 ops/sec
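
To put "decreased a bit" in numbers, the relative change can be computed from the three throughput figures. This is a minimal sketch, assuming the 328179.69 ops/sec peak is the single-collection baseline (the report implies this but does not state it outright). The variable and function names are illustrative only.

```python
# Throughput figures (ops/sec) copied from the report.
# Assumption: the peak figure is the single-collection run.
baseline = 328179.69
variants = {
    "multiple collections": 290699.56,
    "multiple databases": 305280.25,
}


def percent_change(new, old):
    """Relative change of `new` versus `old`, in percent."""
    return (new - old) / old * 100.0


changes = {name: percent_change(value, baseline) for name, value in variants.items()}

for name, delta in changes.items():
    print(f"{name}: {delta:+.1f}% vs. single collection")
```

Under that assumption, the drop is roughly 11% for multiple collections and roughly 7% for multiple databases.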

      mongotop did report that using multiple collections or databases significantly reduced the query time, but I am wondering why that is not reflected in the throughput. I suspect there is a lock at the mongod level that using multiple collections or databases cannot break up. If that is the case, what data should I collect on the MongoDB side to confirm it?

      Mongotop for multiple collections

      ns total read write 2017-01-17T12:31:23-05:00
      ycsb.usertable2 4589ms 1395ms 3193ms
      ycsb.usertable7 3947ms 1222ms 2724ms
      ycsb.usertable3 3945ms 1209ms 2736ms
      ycsb.usertable1 3942ms 1190ms 2751ms
      ycsb.usertable9 3934ms 1198ms 2735ms
      ycsb.usertable8 3925ms 1232ms 2692ms
      ycsb.usertable5 3918ms 1180ms 2738ms
      ycsb.usertable4 3892ms 1195ms 2697ms
      ycsb.usertable11 3864ms 1195ms 2669ms
      ycsb.usertable10 3852ms 1177ms 2675ms

      ns total read write 2017-01-17T12:31:24-05:00
      ycsb.usertable9 4042ms 1268ms 2774ms
      ycsb.usertable2 4037ms 1212ms 2825ms
      ycsb.usertable11 4005ms 1215ms 2789ms
      ycsb.usertable4 3973ms 1237ms 2736ms
      ycsb.usertable7 3953ms 1213ms 2740ms
      ycsb.usertable6 3951ms 1204ms 2747ms
      ycsb.usertable10 3950ms 1219ms 2730ms
      ycsb.usertable8 3945ms 1223ms 2722ms
      ycsb.usertable5 3944ms 1197ms 2747ms
      ycsb.usertable3 3941ms 1182ms 2758ms

      Mongotop for multiple databases
      ns total read write 2017-01-17T13:15:11-05:00
      ycsb11.usertable11 3794ms 1168ms 2626ms
      ycsb2.usertable2 3790ms 1137ms 2652ms
      ycsb10.usertable10 3787ms 1124ms 2663ms
      ycsb9.usertable9 3782ms 1152ms 2629ms
      ycsb1.usertable1 3765ms 1129ms 2635ms
      ycsb5.usertable5 3757ms 1132ms 2625ms
      ycsb8.usertable8 3742ms 1136ms 2605ms
      ycsb4.usertable4 3736ms 1137ms 2598ms
      ycsb7.usertable7 3703ms 1145ms 2557ms
      ycsb3.usertable3 3689ms 1131ms 2558ms

      ns total read write 2017-01-17T13:15:12-05:00
      ycsb10.usertable10 3879ms 1165ms 2713ms
      ycsb2.usertable2 3877ms 1190ms 2686ms
      ycsb8.usertable8 3866ms 1155ms 2710ms
      ycsb9.usertable9 3858ms 1182ms 2675ms
      ycsb4.usertable4 3840ms 1151ms 2689ms
      ycsb11.usertable11 3835ms 1156ms 2678ms
      ycsb5.usertable5 3821ms 1172ms 2648ms
      ycsb7.usertable7 3786ms 1150ms 2635ms
      ycsb1.usertable1 3785ms 1159ms 2626ms
      ycsb3.usertable3 3785ms 1151ms 2633ms

      Mongotop for 1 collection
      ns total read write 2017-01-17T12:27:35-05:00
      ycsb.usertable 16375ms 5393ms 10981ms
      admin.system.roles 0ms 0ms 0ms
      admin.system.version 0ms 0ms 0ms
      local.startup_log 0ms 0ms 0ms
      local.system.replset 0ms 0ms 0ms

      ns total read write 2017-01-17T12:27:36-05:00
      ycsb.usertable 16267ms 5379ms 10887ms
      admin.system.roles 0ms 0ms 0ms
      admin.system.version 0ms 0ms 0ms
      local.startup_log 0ms 0ms 0ms
      local.system.replset 0ms 0ms 0ms

      ns total read write 2017-01-17T12:27:37-05:00
      ycsb.usertable 16319ms 5380ms 10938ms
      admin.system.roles 0ms 0ms 0ms
      admin.system.version 0ms 0ms 0ms
      local.startup_log 0ms 0ms 0ms
      local.system.replset 0ms 0ms 0ms

      ns total read write 2017-01-17T12:27:38-05:00
      ycsb.usertable 16691ms 5469ms 11221ms
      admin.system.roles 0ms 0ms 0ms
      admin.system.version 0ms 0ms 0ms
      local.startup_log 0ms 0ms 0ms
      local.system.replset 0ms 0ms 0ms
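
The per-namespace times in the multi-collection and multi-database runs are not directly comparable with the single-collection figure, because the work is spread across many namespaces. A like-for-like comparison needs the rows of each sample summed. The following is a minimal sketch, assuming each sample is pasted as plain text in the `ns total read write` layout shown above; `sum_sample` is a hypothetical helper, not part of mongotop. Note that mongotop lists only 10 of the 11 collections in each sample here, so the multi-collection sum is a lower bound.

```python
def sum_sample(lines):
    """Sum the total/read/write columns (in ms) over one mongotop sample."""
    totals = {"total": 0, "read": 0, "write": 0}
    for line in lines:
        fields = line.split()
        # Data rows have exactly 4 fields; the header row has 5 (it carries
        # the timestamp) and blank lines have 0, so both are skipped.
        if len(fields) != 4:
            continue
        for key, value in zip(("total", "read", "write"), fields[1:]):
            if not value.endswith("ms"):
                raise ValueError(f"unexpected time value: {value!r}")
            totals[key] += int(value[:-2])
    return totals


# First multi-collection sample from the report (12:31:23).
multi = """
ns total read write 2017-01-17T12:31:23-05:00
ycsb.usertable2 4589ms 1395ms 3193ms
ycsb.usertable7 3947ms 1222ms 2724ms
ycsb.usertable3 3945ms 1209ms 2736ms
ycsb.usertable1 3942ms 1190ms 2751ms
ycsb.usertable9 3934ms 1198ms 2735ms
ycsb.usertable8 3925ms 1232ms 2692ms
ycsb.usertable5 3918ms 1180ms 2738ms
ycsb.usertable4 3892ms 1195ms 2697ms
ycsb.usertable11 3864ms 1195ms 2669ms
ycsb.usertable10 3852ms 1177ms 2675ms
""".splitlines()

# First single-collection sample from the report (12:27:35).
single = """
ns total read write 2017-01-17T12:27:35-05:00
ycsb.usertable 16375ms 5393ms 10981ms
admin.system.roles 0ms 0ms 0ms
""".splitlines()

multi_totals = sum_sample(multi)
single_totals = sum_sample(single)
print("multiple collections:", multi_totals)
print("single collection:   ", single_totals)
```

Summing this way gives one aggregate figure per sample for each configuration, which can then be set against the throughput numbers for the same run.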

            Assignee: backlog-server-servicearch [DO NOT USE] Backlog - Service Architecture
            Reporter: calvins@us.ibm.com Calvin Sze
            Votes: 3
            Watchers: 20