[SERVER-13825] Close connections for which no thread can be allocated
Created: 02/May/14  Updated: 12/Jun/18  Resolved: 29/Aug/17

Status: Closed
Project: Core Server
Component/s: Internal Code
Affects Version/s: None
Fix Version/s: 3.5.13

Type: Bug
Priority: Major - P3
Reporter: Davide Italiano
Assignee: Andrew Morrow (Inactive)
Resolution: Done
Votes: 0
Labels: 28qa
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
related to SERVER-35542 Configurable service worker thread st... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Sprint: Platform 6 07/17/15, Platforms 2017-08-21, Platforms 2017-09-11
Participants:

 Description   

Under some circumstances, if too many connections are open, the pthread_create() call in mongosMain may fail with EAGAIN.
mongos does not handle the resulting boost::thread_resource_error exception and terminates. Perhaps it could simply refuse the connection and keep running.

Trace example:

2014-05-01T13:17:57.114-0700 [mongosMain] pthread_create failed: errno:11 Resource temporarily unavailable
2014-05-01T13:17:57.114-0700 [mongosMain] can't create new thread, closing connection
2014-05-01T13:17:57.116-0700 [conn973] DBClientCursor::init call() failed
2014-05-01T13:17:57.116-0700 [mongosMain] connection accepted from 127.0.0.1:43764 #980 (167 connections now open)
2014-05-01T13:17:57.116-0700 [mongosMain] pthread_create failed: errno:11 Resource temporarily unavailable
2014-05-01T13:17:57.116-0700 [mongosMain] can't create new thread, closing connection
2014-05-01T13:17:57.118-0700 [conn861] DBClientCursor::init call() failed
2014-05-01T13:17:57.120-0700 [conn927] DBClientCursor::init call() failed
2014-05-01T13:17:57.127-0700 [conn976] ERROR: Uncaught std::exception: boost::thread_resource_error, terminating
2014-05-01T13:17:57.127-0700 [conn976] dbexit: rc:100
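
The behaviour requested above (refuse the connection and keep running) can be sketched as follows. This is a minimal illustration, not the server's code: Connection, Spawner, spawnDetachedWorker and startWorkerOrClose are hypothetical names, and std::thread with std::system_error stands in for boost::thread with boost::thread_resource_error.

```cpp
#include <functional>
#include <iostream>
#include <system_error>
#include <thread>

// Hypothetical stand-in for an accepted client socket.
struct Connection {
    explicit Connection(int id) : id(id), open(true) {}
    int id;
    bool open;
    void close() { open = false; }
};

// Hypothetical hook that starts a worker for a connection. Taking it as a
// parameter lets a caller substitute a spawner that fails on purpose.
using Spawner = std::function<void(Connection&)>;

// Default spawner: the std::thread constructor throws std::system_error
// when the thread cannot be started (for example when pthread_create
// reports EAGAIN).
void spawnDetachedWorker(Connection& conn) {
    const int id = conn.id;  // copy, so the worker never touches conn itself
    std::thread([id] { (void)id; /* serve requests for connection id */ }).detach();
}

// Returns true if a worker was started. On failure the connection is closed
// and false is returned, so the accept loop can carry on with the next client.
bool startWorkerOrClose(Connection& conn, const Spawner& spawn) {
    try {
        spawn(conn);
        return true;
    } catch (const std::system_error& e) {
        std::cerr << "can't create new thread, closing connection " << conn.id
                  << ": " << e.what() << '\n';
        conn.close();
        return false;
    }
}
```

An accept loop would call startWorkerOrClose(conn, spawnDetachedWorker) for each accepted connection and ignore a false return beyond logging it.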



 Comments   
Comment by Githook User [ 29/Aug/17 ]

Author: Andrew Morrow (acmorrow) <acm@mongodb.com>

Message: SERVER-13825 Close connections for which we can't allocate a thread
Branch: master
https://github.com/mongodb/mongo/commit/b47d1bd254d543a8880ac226ea7daa2d4e2b0967

Comment by Andrew Morrow (Inactive) [ 13/Jul/16 ]

I believe, though I am not sure, that this issue still exists after the transport layer landed. I'd like to see it fixed, but I do not think it is a hard requirement for 3.4, so I've marked it as 3.3 Desired.

Comment by Gugan Sellamuthu [ 27/Feb/15 ]

Check ulimit for the mongo user; most of the time this is due to incorrect ulimit settings.
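
As a rough illustration of that check, the limits can also be read from inside a process with getrlimit(). The sketch below assumes a Linux or BSD system where RLIMIT_NPROC (the limit reported by `ulimit -u`) is defined; readLimit and describeLimit are hypothetical helper names.

```cpp
#include <sys/resource.h>

#include <iostream>
#include <string>

// Formats a limit value the way `ulimit` does for the unlimited case.
std::string describeLimit(rlim_t value) {
    return value == RLIM_INFINITY ? std::string("unlimited") : std::to_string(value);
}

// Reads the soft and hard limit for a resource; returns false on error.
bool readLimit(int resource, struct rlimit& out) {
    return getrlimit(resource, &out) == 0;
}

// Prints the limits most relevant to thread creation and open connections.
void printThreadRelatedLimits() {
    struct rlimit lim;
    if (readLimit(RLIMIT_NPROC, lim)) {
        std::cout << "max user processes/threads: soft=" << describeLimit(lim.rlim_cur)
                  << " hard=" << describeLimit(lim.rlim_max) << '\n';
    }
    if (readLimit(RLIMIT_NOFILE, lim)) {
        std::cout << "max open files: soft=" << describeLimit(lim.rlim_cur)
                  << " hard=" << describeLimit(lim.rlim_max) << '\n';
    }
}
```

The values only reflect the account and shell the program runs under, so it would need to be run as the same user that starts the server.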

Comment by Davide Italiano [ 05/May/14 ]

No, looks like I can trigger the same problem with a standalone mongod.

Comment by Eric Milkie [ 05/May/14 ]

Are you sure that is from mongosMain? From your log, it looks like an already-established connection (number 976) attempted to spawn a thread. The top-level message handling code terminates the connection if a DBException hits the top level, but for a std::exception it simply calls dbexit(). We could consider catching this boost exception explicitly and terminating the connection; this would affect both mongod and mongos.
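
A sketch of the handling described in this comment, under stated assumptions: FakeDBException is a stand-in for DBException, std::system_error carrying EAGAIN stands in for boost::thread_resource_error, and Outcome and runTopLevel are hypothetical names. The middle catch clause is the proposed addition; without it, the thread-creation failure would fall through to the generic std::exception clause and end the process.

```cpp
#include <functional>
#include <stdexcept>
#include <system_error>

// Stand-in for the server's DBException (an assumption for this sketch).
struct FakeDBException : std::runtime_error {
    explicit FakeDBException(const char* what) : std::runtime_error(what) {}
};

enum class Outcome { Completed, ConnectionClosed, ProcessExit };

// Runs the per-connection message loop and maps failures to an outcome.
Outcome runTopLevel(const std::function<void()>& handleMessages) {
    try {
        handleMessages();
        return Outcome::Completed;
    } catch (const FakeDBException&) {
        // Existing behaviour: only this connection is terminated.
        return Outcome::ConnectionClosed;
    } catch (const std::system_error& e) {
        // Proposed: treat "cannot allocate a thread" as a per-connection
        // failure rather than a fatal one.
        if (e.code() == std::errc::resource_unavailable_try_again) {
            return Outcome::ConnectionClosed;
        }
        return Outcome::ProcessExit;
    } catch (const std::exception&) {
        // Existing behaviour: any other std::exception leads to dbexit().
        return Outcome::ProcessExit;
    }
}
```

Catching the specific error before the generic clause keeps the fatal path for exceptions that genuinely indicate a bug.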
