[SERVER-30526] Pre-populating connections of TaskExecutorPool (mongos) Created: 07/Aug/17  Updated: 08/Feb/23  Resolved: 10/Aug/17

Status: Closed
Project: Core Server
Component/s: Networking
Affects Version/s: 3.4.4
Fix Version/s: None

Type: Improvement Priority: Major - P3
Reporter: 아나 하리 Assignee: Kelsey Schubert
Resolution: Duplicate Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: PNG File Client-TimeoutCountGraph.png     PNG File MongoDB-ConnectionGraph.png    
Issue Links:
Duplicate
duplicates SERVER-26720 Need to prepare initial connection be... Backlog
Participants:

 Description   

After examining the connections between mongos and the mongod servers, I found that these connections are created only when they are needed, even though I set setParameter.ShardingTaskExecutorPoolMinSize in mongos.conf.

mongos does not open connections until it receives requests from clients.
For many of our web services, we deploy new code with a rolling restart of the web servers (sometimes restarting mongos as well). Once a web server has been restarted, user requests come in at the same rate as before the restart. But mongos doesn't have any connections to the mongod servers yet, so it has to open a large number of connections at once.

This leads to many errors (such as com.mongodb.MongoSocketReadTimeoutException and com.mongodb.MongoWaitQueueFullException) on the client side. There's no easy way to warm up the connections between mongos and mongod, because mongos maintains multi-layer connection pools (TaskExecutorPool and SpecificPool).

So I think it would be better if mongos prepared as many connections as "setParameter.ShardingTaskExecutorPoolMinSize" specifies when it starts. Then we could do rolling restarts of the app/web servers and mongos smoothly.
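
To make the requested behavior concrete, here is a minimal Python sketch of the difference between a pool that opens connections lazily and one that fills itself to its minimum size at startup. This is not mongos code and says nothing about how TaskExecutorPool is implemented; the names Pool, prewarm, factory, and idle_count are hypothetical and exist only for this illustration.

```python
import threading


class Pool:
    """Toy connection pool (hypothetical; not the mongos implementation).

    With prewarm=False, connections are created only when acquire() finds
    no idle connection. With prewarm=True, the pool creates min_size
    connections in the constructor, before any request arrives.
    """

    def __init__(self, min_size, factory, prewarm=False):
        self._min_size = min_size
        self._factory = factory      # callable that opens one connection
        self._idle = []              # connections ready to be handed out
        self._lock = threading.Lock()
        self.created = 0             # total connections opened so far
        if prewarm:
            self._fill_to_min()

    def _new(self):
        # Caller must hold self._lock.
        self.created += 1
        return self._factory()

    def _fill_to_min(self):
        with self._lock:
            while len(self._idle) < self._min_size:
                self._idle.append(self._new())

    def acquire(self):
        with self._lock:
            if self._idle:
                return self._idle.pop()
            # No idle connection: pay the connection-setup cost now.
            return self._new()

    def release(self, conn):
        with self._lock:
            self._idle.append(conn)

    def idle_count(self):
        with self._lock:
            return len(self._idle)
```

With prewarm=False, the first burst of requests after a restart each pays the connection-setup cost inside acquire(), which corresponds to the spike described above. With prewarm=True, that cost is paid once at startup, so the first min_size concurrent requests find an idle connection waiting.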

I've attached a mongod connection graph (3-shard cluster; each line is the primary member of one shard) and a client timeout (over 3 seconds) error graph.
You can see from the two graphs that the timeout errors and the mongod connection spikes move together.

I hope this can be fixed ASAP, because it is critical for our service. If it is not fixed, we will need to deploy new service code at night. I think other users who rely on rolling deployments experience the same problem.

Regards,
Matt.



 Comments   
Comment by Kelsey Schubert [ 10/Aug/17 ]

Thanks for the clarification, matt.lee. I'm closing this ticket in favor of SERVER-26720, which is already on our backlog.

Kind regards,
Thomas

Comment by 아나 하리 [ 08/Aug/17 ]

Hi Thomas.

You're right. I had forgotten about that.
The only difference is the MongoDB version (3.2 vs 3.4).

Thanks.

Comment by Kelsey Schubert [ 07/Aug/17 ]

Hi matt.lee,

Would you please clarify how this feature request differs from SERVER-26720?

Thank you,
Thomas

Generated at Thu Feb 08 04:24:08 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.