[SERVER-18242] mongos lock contention in mongo::Shard::findIfExists Created: 28/Apr/15  Updated: 19/Sep/15  Resolved: 01/May/15

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: 3.1.1, 3.1.2
Fix Version/s: 3.1.3

Type: Bug
Priority: Critical - P2
Reporter: Rui Zhang (Inactive)
Assignee: Daniel Alabi
Resolution: Done
Votes: 0
Labels: 32qa, FT
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
Backwards Compatibility: Fully Compatible
Participants:

 Description   

While testing the scalability of mongos with multiple shards, we found a mongos performance regression caused by threads sleeping on a lock.

The workload is YCSB:

  • single doc insert
  • shard key {_id: hashed}
  • 7 shards in total, each a 3-member replica set
  • writer with 64 threads
  • only the mongos binary was changed; 3.1.1 and 3.1.2 show lower throughput than 3.0.2 (about 6x lower on 3.1.2)

                  3.0.2    3.1.2
    throughput    43451     7065

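The off-CPU perf analysis shows threads blocked in futex waits, mostly under mongo::StaticShardInfo::reload() called from mongo::Shard::findIfExists: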

## with master 3.1.2 (c8e2c0546b30621f78cd436d96714cc064bbb8a7)
 
-36.26%-- futex_wait_queue_me
    futex_wait
    do_futex
    sys_futex
    system_call_fastpath
    |
    |--100.00%-- __lll_lock_wait
    |     |
    |     |--85.34%-- mongo::StaticShardInfo::reload()
    |     |    |
    |     |    |--100.00%-- mongo::Shard::findIfExists(std::string const&)
    |     |    |     mongo::DBClientShardResolver::chooseWriteHost(std::string const&, mongo::ConnectionString*) const
    |     |    |     mongo::BatchWriteExec::executeBatch(mongo::BatchedCommandRequest const&, mongo::BatchedCommandResponse*
    |     |    |     mongo::ClusterWriter::write(mongo::BatchedCommandRequest const&, mongo::BatchedCommandResponse*)
    |     |    |     |
    |     |    |      --100.00%-- mongo::(anonymous namespace)::ClusterWriteCmd::run(mongo::OperationContext*, std::string c
    |     |    |           mongo::Command::execCommandClientBasic(mongo::OperationContext*, mongo::Command*, mongo::Cli
    |     |    |           mongo::Command::runAgainstRegistered(char const*, mongo::BSONObj&, mongo::BSONObjBuilder&, i
    |     |    |           mongo::Strategy::clientCommandOp(mongo::Request&)
    |     |    |           mongo::Request::process(int)
    |     |    |           mongo::ShardedMessageHandler::process(mongo::Message&, mongo::AbstractMessagingPort*, mongo:
    |     |    |           mongo::PortMessageServer::handleIncomingMsg(void*)
    |     |    |           start_thread
    |     |     --0.00%-- [...]
    |     |
    |     |--14.42%-- mongo::Shard::findIfExists(std::string const&)
    |     |    mongo::DBClientShardResolver::chooseWriteHost(std::string const&, mongo::ConnectionString*) const
    |     |    mongo::BatchWriteExec::executeBatch(mongo::BatchedCommandRequest const&, mongo::BatchedCommandResponse*)
    |     |    mongo::ClusterWriter::write(mongo::BatchedCommandRequest const&, mongo::BatchedCommandResponse*)
    |     |    |
    |     |     --100.00%-- mongo::(anonymous namespace)::ClusterWriteCmd::run(mongo::OperationContext*, std::string const&, mong
    |     |          mongo::Command::execCommandClientBasic(mongo::OperationContext*, mongo::Command*, mongo::ClientBasic&,
    |     |          mongo::Command::runAgainstRegistered(char const*, mongo::BSONObj&, mongo::BSONObjBuilder&, int)
    |     |          mongo::Strategy::clientCommandOp(mongo::Request&)
    |     |          mongo::Request::process(int)
    |     |          mongo::ShardedMessageHandler::process(mongo::Message&, mongo::AbstractMessagingPort*, mongo::LastError*
    |     |          mongo::PortMessageServer::handleIncomingMsg(void*)
    |     |          start_thread
    |      --0.24%-- [...]
     --0.00%-- [...]



 Comments   
Comment by Githook User [ 01/May/15 ]

Author: Daniel Alabi (alabid) <alabidan@gmail.com>

Message: SERVER-18242 Only reload shard cache if shard not found in cache
Branch: master
https://github.com/mongodb/mongo/commit/665e233f222d37df462c96b3912e84fc6e2ca0e0
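
The idea in the commit message (reload the shard cache only when the shard is not found in it) can be sketched roughly as below. This is not the code from the commit: ShardEntry, ShardCache, loadTwoShards and exampleUsage are hypothetical names invented for illustration, and the loader stands in for whatever mongos actually queries to refresh its shard list. The sketch only shows the pattern of a cheap lookup on the hot path with the reload kept for cache misses.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical stand-in for a shard descriptor; not the real mongo::Shard.
struct ShardEntry {
    std::string name;
    std::string connectionString;
};

// Hypothetical cache; names are illustrative and do not mirror the mongos source.
class ShardCache {
public:
    using Entries = std::map<std::string, ShardEntry>;
    using Loader = std::function<Entries()>;

    explicit ShardCache(Loader loader) : _loader(std::move(loader)) {}

    // Fast path: a cache hit returns without reloading.
    // Slow path: only a miss triggers a reload, followed by one more lookup.
    std::shared_ptr<const ShardEntry> findIfExists(const std::string& name) {
        if (auto hit = lookup(name)) {
            return hit;
        }
        reload();
        return lookup(name);
    }

    int reloadCount() const {
        std::lock_guard<std::mutex> lk(_mutex);
        return _reloads;
    }

private:
    std::shared_ptr<const ShardEntry> lookup(const std::string& name) const {
        std::lock_guard<std::mutex> lk(_mutex);
        auto it = _entries.find(name);
        if (it == _entries.end()) {
            return nullptr;
        }
        return it->second;
    }

    void reload() {
        // Build the new table before taking the lock so readers wait only for the swap.
        std::map<std::string, std::shared_ptr<const ShardEntry>> next;
        for (const auto& kv : _loader()) {
            next[kv.first] = std::make_shared<ShardEntry>(kv.second);
        }
        std::lock_guard<std::mutex> lk(_mutex);
        _entries.swap(next);
        ++_reloads;
    }

    Loader _loader;
    mutable std::mutex _mutex;
    std::map<std::string, std::shared_ptr<const ShardEntry>> _entries;
    int _reloads = 0;
};

// Hypothetical loader with made-up shard names and hosts.
inline ShardCache::Entries loadTwoShards() {
    ShardCache::Entries entries;
    entries["shard0000"] = ShardEntry{"shard0000", "rs0/host0:27017"};
    entries["shard0001"] = ShardEntry{"shard0001", "rs1/host1:27017"};
    return entries;
}

// Two lookups against a cold cache: the first misses and reloads, the second hits.
inline int exampleUsage() {
    ShardCache cache(loadTwoShards);
    assert(cache.findIfExists("shard0000") != nullptr);
    assert(cache.findIfExists("shard0001") != nullptr);
    return cache.reloadCount();
}
```

In this sketch, a lookup for a name that does not exist still reloads on every call, and concurrent misses may each trigger their own reload; a real implementation would likely need to bound both.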

Comment by Andy Schwerin [ 28/Apr/15 ]

Tentatively slotted to 3.1.4.

Generated at Thu Feb 08 03:47:03 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.