[SERVER-18960] confused log: pthread_create failed: errno:11 Resource temporarily unavailable Created: 12/Jun/15  Updated: 17/Jun/15  Resolved: 13/Jun/15

Status: Closed
Project: Core Server
Component/s: Networking
Affects Version/s: 2.6.10
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Jiangcheng Wu Assignee: Unassigned
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Operating System: ALL
Participants:

 Description   

There are some logs as below show up in mongod start progress:

2015-06-13T04:08:01.238+0800 [initandlisten] pthread_create failed: errno:11 Resource temporarily unavailable
2015-06-13T04:08:01.238+0800 [initandlisten] can't create new thread, closing connection
...
2015-06-13T04:08:21.247+0800 [initandlisten] pthread_create failed: errno:11 Resource temporarily unavailable
2015-06-13T04:08:21.248+0800 [initandlisten] can't create new thread, closing connection

And the limits and threads count of mongod instance that I collected as below:

2015-06-13_04:08:03
limits:
[mongod #28309 -- limits]
Limit                     Soft Limit           Hard Limit           Units     
Max cpu time              unlimited            unlimited            seconds   
Max file size             unlimited            unlimited            bytes     
Max data size             unlimited            unlimited            bytes     
Max stack size            8388608              unlimited            bytes     
Max core file size        0                    unlimited            bytes     
Max resident set          unlimited            unlimited            bytes     
Max processes             655350               655350               processes 
Max open files            655350               655350               files     
Max locked memory         65536                65536                bytes     
Max address space         unlimited            unlimited            bytes     
Max file locks            unlimited            unlimited            locks     
Max pending signals       513987               513987               signals   
Max msgqueue size         819200               819200               bytes     
Max nice priority         0                    0                    
Max realtime priority     0                    0                    
Max realtime timeout      unlimited            unlimited            us        
threads_count:
[mongod #28309 -- limits]
97
 
limits:
[mongod #28309 -- limits]
Limit                     Soft Limit           Hard Limit           Units     
Max cpu time              unlimited            unlimited            seconds   
Max file size             unlimited            unlimited            bytes     
Max data size             unlimited            unlimited            bytes     
Max stack size            8388608              unlimited            bytes     
Max core file size        0                    unlimited            bytes     
Max resident set          unlimited            unlimited            bytes     
Max processes             655350               655350               processes 
Max open files            655350               655350               files     
Max locked memory         65536                65536                bytes     
Max address space         unlimited            unlimited            bytes     
Max file locks            unlimited            unlimited            locks     
Max pending signals       513987               513987               signals   
Max msgqueue size         819200               819200               bytes     
Max nice priority         0                    0                    
Max realtime priority     0                    0                    
Max realtime timeout      unlimited            unlimited            us        
threads_count:
[mongod #28309 -- limits]
97

I'm confused that the threads of mongod instance is less than 1000, and less than 655350, why log show that? Is it a bug?
ps. This (RS) member have more than 10k databases and 700k collections( with indexes).
ps. I use grab-mongodb-limits-and-threads.sh to get limits and threads count info above.

grab-mongodb-limits-and-threads.sh

#!/bin/bash
 
return-limits(){
 
     for process in $@; do
          process_pids=`ps -C $process -o pid --no-headers | cut -d " " -f 2`
 
          if [ -z $@ ]; then
             echo "[no $process running]"
          else
             for pid in $process_pids; do
                   echo "[$process #$pid -- limits]"
                   cat /proc/$pid/limits
             done
          fi
     done
}
 
return-tasks(){
 
     for process in $@; do
          process_pids=`ps -C $process -o pid --no-headers | cut -d " " -f 2`
 
          if [ -z $@ ]; then
             echo "[no $process running]"
          else
             for pid in $process_pids; do
                   echo "[$process #$pid -- limits]"
                   ls /proc/$pid/task | wc -l
             done
          fi
     done
}
 
date +%F_%T
echo "limits:"
return-limits mongod
echo "threads_count:"
return-tasks mongod



 Comments   
Comment by Jiangcheng Wu [ 17/Jun/15 ]

Thank you for the note, I will post this to other discussion places to seek more help.

  • ip_local_port_range is:

    cat /proc/sys/net/ipv4/ip_local_port_range 
    32768	61000
    

    Only mongod is running on that machine, and applications will not open more than 500 connections to a SECONDARY member. It does not make sense that the machine opens multiple short-lived connections and ran out of sockets.

  • I did not see out of socket memory in syslog ( syslog seems good) , as our nofile is large enough for mongod.
    I did not collect sockets number of OS at that time, but as log show:

    2015-06-13T04:08:21.247+0800 [initandlisten] connection accepted from 10.10.11.236:46252 #66 (44 connections now open)
    

    mongod seems did not open many connections.
    The can't create new thread log show up just a few seconds before the member to get and apply oplog.

  • Will mongod open open many/huge connections to the other members and not show in log?
  • It's some like http://www.ttlsa.com/mongodb/initandlisten-connection-refused-because-too-many-open-connections-819-of-819/, but we have set numactl in upstart file.

    if [ "x$ENABLE_MONGODB" = "xyes" ]; then exec start-stop-daemon --start --quiet --chuid mongodb --exec /usr/bin/numactl -- --interleave=all /usr/bin/mongod --config /etc/mongodb.conf  --setParameter failIndexKeyTooLong=false --setParameter newCollectionsUsePowerOf2Sizes=false; fi
    

  • Another odd thing. 2015-06-12T11:37:48.343+0800 [IndexRebuilder] ERROR: madvise failed: errno:11 Resource temporarily unavailable show up in the startup progress of a REMOVED member.
Comment by Ramon Fernandez Marina [ 13/Jun/15 ]

wujiangcheng, this error is not a bug in mongod, but an indication that the OS run out of sockets for mongod to open; it can be easily triggered when the machine opens multiple short-lived connections. You may want to check things like the ip_local_port_range and the number of sockets in TIME_WAIT:

sysctl -a | grep local_port_range
netstat -nap | grep TIME_WAIT

Please note that the SERVER project is for reporting bugs or feature suggestions for the MongoDB server. For MongoDB-related support discussion please post on the mongodb-user group or Stack Overflow with the mongodb tag, where your question will reach a larger audience. A question like this involving more discussion would be best posted on the mongodb-user group. See also our Technical Support page for additional support resources.

Regards,
Ramón.

Generated at Thu Feb 08 03:49:23 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.