Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-13663

Perf regression for workload that updates 1 row with 1 or many clients

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major - P3
    • Resolution: Unresolved
    • Affects Version/s: 2.6.0
    • Fix Version/s: 3.5 Desired
    • Component/s: Performance, Write Ops
    • Labels:
      None
    • Operating System:
      ALL
    • Steps To Reproduce:
      Hide

      I used my fork of jmongosysbench - https://github.com/mdcallag/sysbench-mongodb as my changes have yet to be pushed upstream.

      This is the only option I set for mongod:
      journalCommitInterval=300

      Then I load 400M rows into 1 collection (in the blog post I used 8 collections, but the repro case here only needs 1):
      bash run.sh 1 400000000 1 5 yes no $( date +%s )

      Then I run the test first for one client thread:
      t=10; dop=1; bash run.sh 1 1 $dop $t no yes $( date +%s )

      And then for 32 client threads:
      t=10; dop=32; bash run.sh 1 1 $dop $t no yes $( date +%s )

      Show
      I used my fork of jmongosysbench - https://github.com/mdcallag/sysbench-mongodb as my changes have yet to be pushed upstream. This is the only option I set for mongod: journalCommitInterval=300 Then I load 400M rows into 1 collection (in the blog post I used 8 collections, but the repro case here only needs 1): bash run.sh 1 400000000 1 5 yes no $( date +%s ) Then I run the test first for one client thread: t=10; dop=1; bash run.sh 1 1 $dop $t no yes $( date +%s ) And then for 32 client threads: t=10; dop=32; bash run.sh 1 1 $dop $t no yes $( date +%s )

      Description

      My results are at http://smalldatum.blogspot.com/2014/04/biebermarks.html. Using the same hardware for the test and 2.4.9 versus 2.6 (rc2 or release) I get more updates/second with 2.4.9 versus 2.6 – 1.22X more at 32 concurrent clients and 1.11X more updates/second with 1 client.

      1. 249.pdf
        21 kB
        Mark Callaghan
      2. 260.pdf
        21 kB
        Mark Callaghan
      3. run.sh
        5 kB
        Mark Callaghan

        Issue Links

          Activity

          Hide
          mdcallag Mark Callaghan added a comment -

          Comparing 2.4.9 with 2.6.0 release using "top" with 1 client thread

          • 2.4.9 does 1.15X more updates/second
          • 2.4.9 java client uses 1.13X more CPU
          • 2.4.9 mongod uses 0.92X (less) the CPU compared to 2.6.0

          So I see a similar pattern at 1 thread as at 32 threads, but maybe the regression is worse with 32 threads. So I will focus on profiling at 1 thread first to see what uses more CPU.

          Show
          mdcallag Mark Callaghan added a comment - Comparing 2.4.9 with 2.6.0 release using "top" with 1 client thread 2.4.9 does 1.15X more updates/second 2.4.9 java client uses 1.13X more CPU 2.4.9 mongod uses 0.92X (less) the CPU compared to 2.6.0 So I see a similar pattern at 1 thread as at 32 threads, but maybe the regression is worse with 32 threads. So I will focus on profiling at 1 thread first to see what uses more CPU.
          Hide
          mdcallag Mark Callaghan added a comment - - edited

          Will provide more details on Monday, but using same database with 8 collections and concurrent point queries I see a larger regression in 2.6. Will repeat test with 1 thread (client). At 8 threads I get 94k QPS with 2.4.9 and 62k QPS with 2.6.0 release. Per "top" the Java client uses 1.5X more CPU with 2.4.9 than with 2.6.0 as expected. The context switch rate is about 1.4X more with 2.4.9 which is as expected and suggests this isn't a mutex contention issue. But mongod for 2.4.9 does not use 1.5X more CPU, instead 2.6.0 is using 1.2X more CPU even though 2.4.9 is doing 1.5X more QPS.

          For this test, regression at 8 & 16 threads was worse than at 32 and 40 threads. Test host has 40 cores with hyperthread enabled and peak QPS was at 32 threads.

          Top-n sources of CPU per "perf" on Linux...

          for 2.4.9

               6.19%   mongod  mongod               [.] (anonymous namespace)::cpp_alloc(unsigned long, bool)                                                                                                                                                                                                                                                                                                                                                                                                                            
               2.99%   mongod  libpthread-2.13.so   [.] pthread_mutex_lock                                                                                                                                                                                                                                                                                                                                                                                                                                                               
               2.58%   mongod  mongod               [.] _ZN12_GLOBAL__N_121do_free_with_callbackEPvPFvS0_E.constprop.41                                                                                                                                                                                                                                                                                                                                                                                                                  
               2.32%   mongod  mongod               [.] mongo::KeyV1::woCompare(mongo::KeyV1 const&, mongo::Ordering const&) const                                                                                                                                                                                                                                                                                                                                                                                                       
               2.04%   mongod  mongod               [.] mongo::ps::Rolling::access(unsigned long, short, bool)                                                                                                                                                                                                                                                                                                                                                                                                                           
               1.58%   mongod  libc-2.13.so         [.] __strlen_sse42                                                                                                                                                                                                                                                                                                                                                                                                                                                                   
               1.52%   mongod  libstdc++.so.6.0.16  [.] std::basic_string<char, std::char_traits<char>, std::allocator<char> >::~basic_string()                                                                                                                                                                                                                                                                                                                                                                                          
               1.49%   mongod  mongod               [.] boost::shared_ptr<mongo::Projection> boost::make_shared<mongo::Projection>()                                                                                                                                                                                                                                                                                                                                                                                                     
               1.48%   mongod  mongod               [.] mongo::BtreeBucket<mongo::BtreeData_V1>::find(mongo::IndexDetails const&, mongo::KeyV1 const&, mongo::DiskLoc const&, mongo::Ordering const&, int&, bool) const                                                                                                                                                                                                                                                                                                                  
               1.48%   mongod  mongod               [.] boost::detail::shared_count::~shared_count()                                                                                                                                                                                                                                                                                                                                                                                                                                     
               1.44%   mongod  mongod               [.] mongo::BucketBasics<mongo::BtreeData_V1>::KeyNode::KeyNode(mongo::BucketBasics<mongo::BtreeData_V1> const&, mongo::__KeyNode<mongo::DiskLoc56Bit> const&)                                                                                                                                                                                                                                                                                                                        
               1.30%   mongod  libc-2.13.so         [.] __memcpy_ssse3                                                                                                                                                                                                                                                                                                                                                                                                                                                                   
               1.27%   mongod  mongod               [.] tc_malloc                                                                                                                                                                                                                                                                                                                                                                                                                                                                        
               1.22%   mongod  [ip_tables]          [k] ipt_do_table                                                                                                                                                                                                                                                                                                                                                                                                                                                                     
               1.20%   mongod  libstdc++.so.6.0.16  [.] _ZNSs4_Rep10_M_disposeERKSaIcE.part.2                                                                                                                                                                                                                                                                                                                                                                                                                                            
               1.03%   mongod  mongod               [.] mongo::Projection::init(mongo::BSONObj const&)     

          And for 2.6.0 release

          6.01%   mongod  mongod               [.] operator new(unsigned long) 
          2.95%   mongod  libpthread-2.13.so   [.] pthread_mutex_lock
          2.67%   mongod  mongod               [.] mongo::KeyV1::toBson() const
          2.43%   mongod  mongod               [.] operator delete(void*)
          1.68%   mongod  libc-2.13.so         [.] __strlen_sse42 
          1.51%   mongod  libstdc++.so.6.0.16  [.] _ZNSs4_Rep10_M_disposeERKSaIcE.part.2
          1.33%   mongod  mongod               [.] boost::intrusive_ptr<mongo::BSONObj::Holder>::~intrusive_ptr()
          1.32%   mongod  libc-2.13.so         [.] __memcpy_ssse3
          1.24%   mongod  mongod               [.] mongo::BucketBasics<mongo::BtreeData_V1>::KeyNode::KeyNode(mongo::BucketBasics<mongo::BtreeData_V1> const&, mongo::__KeyNode<mongo::DiskLoc56Bit> const&)
          1.16%   mongod  mongod               [.] tc_malloc
          1.09%   mongod  libc-2.13.so         [.] vfprintf
          1.03%   mongod  mongod               [.] mongo::BSONElement::size() const
          0.99%   mongod  libstdc++.so.6.0.16  [.] std::basic_string<char, std::char_traits<char>, std::allocator<char> >::~basic_string()
          0.98%   mongod  libc-2.13.so         [.] __memcmp_sse4_1
          0.94%   mongod  libstdc++.so.6.0.16  [.] std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(std::string const&)
          0.85%   mongod  mongod               [.] mongo::QueryPlanner::plan(mongo::CanonicalQuery const&, mongo::QueryPlannerParams const&, std::vector<mongo::QuerySolution*, std::allocator<mongo::QuerySolution*> >*) 
          0.84%   mongod  mongod               [.] mongo::newRunQuery(mongo::Message&, mongo::QueryMessage&, mongo::CurOp&, mongo::Message&)
          0.81%   mongod  mongod               [.] mongo::ParsedProjection::make(mongo::BSONObj const&, mongo::MatchExpression const*, mongo::ParsedProjection**)
          0.76%   mongod  mongod               [.] tc_free                               

          Show
          mdcallag Mark Callaghan added a comment - - edited Will provide more details on Monday, but using same database with 8 collections and concurrent point queries I see a larger regression in 2.6. Will repeat test with 1 thread (client). At 8 threads I get 94k QPS with 2.4.9 and 62k QPS with 2.6.0 release. Per "top" the Java client uses 1.5X more CPU with 2.4.9 than with 2.6.0 as expected. The context switch rate is about 1.4X more with 2.4.9 which is as expected and suggests this isn't a mutex contention issue. But mongod for 2.4.9 does not use 1.5X more CPU, instead 2.6.0 is using 1.2X more CPU even though 2.4.9 is doing 1.5X more QPS. For this test, regression at 8 & 16 threads was worse than at 32 and 40 threads. Test host has 40 cores with hyperthread enabled and peak QPS was at 32 threads. Top-n sources of CPU per "perf" on Linux... for 2.4.9 6.19% mongod mongod [.] (anonymous namespace)::cpp_alloc(unsigned long, bool) 2.99% mongod libpthread-2.13.so [.] pthread_mutex_lock 2.58% mongod mongod [.] _ZN12_GLOBAL__N_121do_free_with_callbackEPvPFvS0_E.constprop.41 2.32% mongod mongod [.] mongo::KeyV1::woCompare(mongo::KeyV1 const&, mongo::Ordering const&) const 2.04% mongod mongod [.] mongo::ps::Rolling::access(unsigned long, short, bool) 1.58% mongod libc-2.13.so [.] __strlen_sse42 1.52% mongod libstdc++.so.6.0.16 [.] std::basic_string<char, std::char_traits<char>, std::allocator<char> >::~basic_string() 1.49% mongod mongod [.] boost::shared_ptr<mongo::Projection> boost::make_shared<mongo::Projection>() 1.48% mongod mongod [.] mongo::BtreeBucket<mongo::BtreeData_V1>::find(mongo::IndexDetails const&, mongo::KeyV1 const&, mongo::DiskLoc const&, mongo::Ordering const&, int&, bool) const 1.48% mongod mongod [.] boost::detail::shared_count::~shared_count() 1.44% mongod mongod [.] mongo::BucketBasics<mongo::BtreeData_V1>::KeyNode::KeyNode(mongo::BucketBasics<mongo::BtreeData_V1> const&, mongo::__KeyNode<mongo::DiskLoc56Bit> const&) 1.30% mongod libc-2.13.so [.] __memcpy_ssse3 1.27% mongod mongod [.] tc_malloc 1.22% mongod [ip_tables] [k] ipt_do_table 1.20% mongod libstdc++.so.6.0.16 [.] _ZNSs4_Rep10_M_disposeERKSaIcE.part.2 1.03% mongod mongod [.] mongo::Projection::init(mongo::BSONObj const&) And for 2.6.0 release 6.01% mongod mongod [.] operator new(unsigned long) 2.95% mongod libpthread-2.13.so [.] pthread_mutex_lock 2.67% mongod mongod [.] mongo::KeyV1::toBson() const 2.43% mongod mongod [.] operator delete(void*) 1.68% mongod libc-2.13.so [.] __strlen_sse42 1.51% mongod libstdc++.so.6.0.16 [.] _ZNSs4_Rep10_M_disposeERKSaIcE.part.2 1.33% mongod mongod [.] boost::intrusive_ptr<mongo::BSONObj::Holder>::~intrusive_ptr() 1.32% mongod libc-2.13.so [.] __memcpy_ssse3 1.24% mongod mongod [.] mongo::BucketBasics<mongo::BtreeData_V1>::KeyNode::KeyNode(mongo::BucketBasics<mongo::BtreeData_V1> const&, mongo::__KeyNode<mongo::DiskLoc56Bit> const&) 1.16% mongod mongod [.] tc_malloc 1.09% mongod libc-2.13.so [.] vfprintf 1.03% mongod mongod [.] mongo::BSONElement::size() const 0.99% mongod libstdc++.so.6.0.16 [.] std::basic_string<char, std::char_traits<char>, std::allocator<char> >::~basic_string() 0.98% mongod libc-2.13.so [.] __memcmp_sse4_1 0.94% mongod libstdc++.so.6.0.16 [.] std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(std::string const&) 0.85% mongod mongod [.] mongo::QueryPlanner::plan(mongo::CanonicalQuery const&, mongo::QueryPlannerParams const&, std::vector<mongo::QuerySolution*, std::allocator<mongo::QuerySolution*> >*) 0.84% mongod mongod [.] mongo::newRunQuery(mongo::Message&, mongo::QueryMessage&, mongo::CurOp&, mongo::Message&) 0.81% mongod mongod [.] mongo::ParsedProjection::make(mongo::BSONObj const&, mongo::MatchExpression const*, mongo::ParsedProjection**) 0.76% mongod mongod [.] tc_free
          Hide
          eliot Eliot Horowitz added a comment -

          Seems a big part of the issue is new write commands.
          If you use the old version of the java driver (r2.11.4) with 2.6 I see much better performance.
          Need to dig into why.

          Show
          eliot Eliot Horowitz added a comment - Seems a big part of the issue is new write commands. If you use the old version of the java driver (r2.11.4) with 2.6 I see much better performance. Need to dig into why.
          Hide
          mdcallag Mark Callaghan added a comment -

          I used mongo-java-driver-2.11.4.jar for all of my tests. If it helps I will repeat with another version.

          Show
          mdcallag Mark Callaghan added a comment - I used mongo-java-driver-2.11.4.jar for all of my tests. If it helps I will repeat with another version.
          Hide
          eliot Eliot Horowitz added a comment -

          I'm seeing an issue with both java driver versions, but much worse with the newest.

          Show
          eliot Eliot Horowitz added a comment - I'm seeing an issue with both java driver versions, but much worse with the newest.
          Hide
          mdcallag Mark Callaghan added a comment -

          Figured out how to make google perftools work, so here is a CPU profile from 1 client thread doing random queries over 1M row table. All data is in the OS filesystem cache – no disk reads. The client ran for 5 minutes and the average QPS is 15662 for 2.4.9 and 10286 for 2.6.0.

          Looks like the new overhead for 2.6 is from the optimizer and maybe the parser.

          Show
          mdcallag Mark Callaghan added a comment - Figured out how to make google perftools work, so here is a CPU profile from 1 client thread doing random queries over 1M row table. All data is in the OS filesystem cache – no disk reads. The client ran for 5 minutes and the average QPS is 15662 for 2.4.9 and 10286 for 2.6.0. Looks like the new overhead for 2.6 is from the optimizer and maybe the parser.
          Hide
          hari.khalsa@10gen.com Hari Khalsa (Inactive) added a comment -

          Thanks for the ticket, Mark Callaghan.

          There is a fast path for find-by-_id that is being used in 2.4.x that is not being used in 2.6.x. I believe this is responsible for some of the regression you're seeing. I've created SERVER-13685 to track this.

          Show
          hari.khalsa@10gen.com Hari Khalsa (Inactive) added a comment - Thanks for the ticket, Mark Callaghan . There is a fast path for find-by-_id that is being used in 2.4.x that is not being used in 2.6.x. I believe this is responsible for some of the regression you're seeing. I've created SERVER-13685 to track this.
          Hide
          ramon.fernandez Ramon Fernandez added a comment -

          Rui Zhang, can we re-test with a recent release (3.0.3 is the latest stable) and see if this is still an issue?

          Thanks,
          Ramón.

          Show
          ramon.fernandez Ramon Fernandez added a comment - Rui Zhang , can we re-test with a recent release (3.0.3 is the latest stable) and see if this is still an issue? Thanks, Ramón.

            People

            • Votes:
              0 Vote for this issue
              Watchers:
              16 Start watching this issue

              Dates

              • Created:
                Updated: