Type: Bug
Resolution: Done
Priority: Minor - P4
Fix Version/s: None
Affects Version/s: 3.2.12
Component/s: Networking, Sharding
Labels:
- 3.2
- ismaster
- mongos
- timeout
Environment:
CentOS 6
MongoDB version 3.2.12

Operating System:
ALL
Steps To Reproduce:

Create a 3.2.12 sharded cluster with a single mongos (6 CPUs), SCCC config servers, and 10 shards, each a 3-node replica set.
Using sysbench-mongodb, create six test collections, sbtest<1-6>, with 40,000,000 documents each, and shard each on {_id: "hashed"}.
After balancing completes, run four instances of sysbench-mongodb against the mongos. I run sysbench-mongodb with its defaults except NUM_WRITER_THREADS, which I set to 256.
After a few minutes, ASIO timeouts start appearing in the mongos log.

Confidence Status:
None
Work Order:
3
CAR Domain/s:
None

Aha! Reference:
None
Tracking Level:
None
Risk Status:
None
Exec Notes:
None
Goal Name(s):
None
Goal Link:
None

I am testing 3.2.12 on a 10-node sharded cluster (using sysbench-mongodb) and I am seeing odd behavior. With the default mongos settings, I get random ASIO timeouts for the { isMaster: 1 } command from different connection pools.

I ASIO     [NetworkInterfaceASIO-TaskExecutorPool-2-0] Failed to connect to (node) - ExceededTimeLimit: Operation timed out
D ASIO     [NetworkInterfaceASIO-TaskExecutorPool-2-0] Failed to execute command: RemoteCommand 23628777 -- target:(node) db:admin cmd:{ isMaster: 1 } reason: ExceededTimeLimit: Operation timed out
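To see whether the timeouts cluster on particular pools, the log lines can be tallied per task executor pool. The following is a minimal sketch, not a supported tool: the regular expression is inferred only from the two sample lines above, and the helper name `count_timeouts_by_pool` and the `SAMPLE_LOG` list are hypothetical names introduced for illustration.

```python
import re
from collections import Counter

# Pattern inferred from the two sample log lines only (an assumption);
# other mongos versions or log formats may need a different expression.
ASIO_TIMEOUT = re.compile(
    r"\[NetworkInterfaceASIO-TaskExecutorPool-(?P<pool>\d+)-\d+\] "
    r"Failed to (?P<kind>connect|execute)\b.*ExceededTimeLimit"
)


def count_timeouts_by_pool(lines):
    """Count ExceededTimeLimit lines keyed by (pool index, failure kind).

    The kind is "connect" for connection failures and "execute" for
    failed remote commands, so the two line types are not conflated.
    """
    counts = Counter()
    for line in lines:
        match = ASIO_TIMEOUT.search(line)
        if match:
            counts[(int(match.group("pool")), match.group("kind"))] += 1
    return counts


# The two log lines quoted in this report, used as sample input.
SAMPLE_LOG = [
    "I ASIO     [NetworkInterfaceASIO-TaskExecutorPool-2-0] Failed to connect "
    "to (node) - ExceededTimeLimit: Operation timed out",
    "D ASIO     [NetworkInterfaceASIO-TaskExecutorPool-2-0] Failed to execute "
    "command: RemoteCommand 23628777 -- target:(node) db:admin "
    "cmd:{ isMaster: 1 } reason: ExceededTimeLimit: Operation timed out",
]

for (pool, kind), n in sorted(count_timeouts_by_pool(SAMPLE_LOG).items()):
    print(f"pool {pool}: {n} '{kind}' timeout(s)")
```

The same function accepts an open file object, so a full mongos log can be passed in line by line. The "execute" lines are logged at debug level, so they only appear when log verbosity is raised.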

When I set "taskExecutorPoolSize"=1, which I believe creates a single connection pool, I do not get the above errors.

My mongos has 6 CPUs, so I assume it creates 6 connection pools by default. A smaller value such as "taskExecutorPoolSize"=2 reduces the timeouts, so it seems that the more connection pools I use, the more timeouts I get during the benchmark.

I am trying to understand what may cause the above behavior.

Thanks in advance,
Antonis

Assignee: Unassigned
Reporter: Antonis Giannopoulos
Participants: Antonis Giannopoulos, Kelsey Schubert
Votes: 1
Watchers: 4

Created: Feb 19 2017 11:21:19 PM UTC
Updated: Feb 21 2017 04:10:42 PM UTC
Resolved: Feb 21 2017 04:10:42 PM UTC
