[SERVER-5003] MongoS Crashes consistently upon connect signal 11 Created: 17/Feb/12  Updated: 11/Jul/16  Resolved: 18/Feb/12

Status: Closed
Project: Core Server
Component/s: Internal Code, Shell
Affects Version/s: 2.0.0
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Theo Assignee: Eric Milkie
Resolution: Done Votes: 0
Labels: crash
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

OS: CentOS release 5.6 (Final)
MEM: 24GB


Operating System: Linux
Participants:

 Description   

1. Start up mongos; the log looks clean.
2. Attempt to connect to the mongos, and it crashes.

Fri Feb 17 12:37:29 ./mongos db version v2.0.0, pdfile version 4.5 starting (--help for usage)
Fri Feb 17 12:37:29 git version: 695c67dff0ffc361b8568a13366f027caa406222
Fri Feb 17 12:37:29 build info: Linux domU-12-31-39-16-30-A2 2.6.21.7-2.fc8xen #1 SMP Fri Feb 15 12:34:28 EST 2008 x86_64 BOOST_LIB_VERSION=1_45
Fri Feb 17 12:37:29 [mongosMain] waiting for connections on port 27000
Fri Feb 17 12:37:29 [websvr] admin web console waiting for connections on port 28000
Fri Feb 17 12:37:29 [Balancer] about to contact config servers and shards
Fri Feb 17 12:37:29 [Balancer] updated set (t01_w_replicaset_1) to: t01_w_replicaset_1/*******01001.domain.domain.com:27017,*******01002.domain.domain.com:27017
Fri Feb 17 12:37:29 [ReplicaSetMonitorWatcher] starting
Fri Feb 17 12:37:29 [Balancer] updated set (t01_w_replicaset_2) to: t01_w_replicaset_2/*******01003.domain.domain.com:27017,*******01004.domain.domain.com:27017
Fri Feb 17 12:37:29 [Balancer] updated set (t01_w_replicaset_3) to: t01_w_replicaset_3/*******01005.domain.domain.com:27017,*******01006.domain.domain.com:27017
Fri Feb 17 12:37:29 [Balancer] config servers and shards contacted successfully
Fri Feb 17 12:37:29 [Balancer] balancer id: ********c01001:27000 started at Feb 17 12:37:29
Fri Feb 17 12:37:29 [Balancer] created new distributed lock for balancer on ********c01001.domain.domain.com:27017 ( lock timeout : 900000, ping interval : 30000, process : 0 )
Fri Feb 17 12:37:29 [LockPinger] creating distributed lock ping thread for ********c01001.domain.domain.com:27017 and process ********c01001:27000:1329511049:1804289383 (sleeping for 30000ms)
Fri Feb 17 12:37:29 [Balancer] distributed lock 'balancer/********c01001:27000:1329511049:1804289383' acquired, ts : 4f3eba891888d108e13b82a8
Fri Feb 17 12:37:29 [Balancer] distributed lock 'balancer/********c01001:27000:1329511049:1804289383' unlocked.
Fri Feb 17 12:37:39 [Balancer] distributed lock 'balancer/********c01001:27000:1329511049:1804289383' acquired, ts : 4f3eba931888d108e13b82a9
Fri Feb 17 12:37:39 [Balancer] distributed lock 'balancer/********c01001:27000:1329511049:1804289383' unlocked.
Fri Feb 17 12:37:49 [Balancer] distributed lock 'balancer/********c01001:27000:1329511049:1804289383' acquired, ts : 4f3eba9d1888d108e13b82aa
Fri Feb 17 12:37:49 [Balancer] distributed lock 'balancer/********c01001:27000:1329511049:1804289383' unlocked.
Fri Feb 17 12:37:59 [Balancer] distributed lock 'balancer/********c01001:27000:1329511049:1804289383' acquired, ts : 4f3ebaa71888d108e13b82ab
Fri Feb 17 12:37:59 [Balancer] distributed lock 'balancer/********c01001:27000:1329511049:1804289383' unlocked.

LOGGING IN FROM THE SHELL

Fri Feb 17 12:39:19 [Balancer] distributed lock 'balancer/********c01001:27000:1329511049:1804289383' unlocked.
Fri Feb 17 12:39:22 [mongosMain] connection accepted from 10.198.242.31:40252 #1
Fri Feb 17 12:39:29 [Balancer] distributed lock 'balancer/********c01001:27000:1329511049:1804289383' acquired, ts : 4f3ebb011888d108e13b82b4
Fri Feb 17 12:39:29 [Balancer] distributed lock 'balancer/********c01001:27000:1329511049:1804289383' unlocked.
Fri Feb 17 12:39:39 [Balancer] distributed lock 'balancer/********c01001:27000:1329511049:1804289383' acquired, ts : 4f3ebb0b1888d108e13b82b5
Fri Feb 17 12:39:39 [Balancer] distributed lock 'balancer/********c01001:27000:1329511049:1804289383' unlocked.
Fri Feb 17 12:39:44 [conn1] authenticate:

{ authenticate: 1.0, user: "admin", nonce: "1b7079499afb5ef9", key: "b8511e225a43081bc9ced38bca26e896" }

Fri Feb 17 12:39:44 [conn1] creating WriteBackListener for: ********01001.domain.domain.com:27017 serverID: 4f3eba891888d108e13b82a7
Fri Feb 17 12:39:44 [conn1] creating WriteBackListener for: ********01002.domain.domain.com:27017 serverID: 4f3eba891888d108e13b82a7
Fri Feb 17 12:39:44 [conn1] creating WriteBackListener for: ********01003.domain.domain.com:27017 serverID: 4f3eba891888d108e13b82a7
Fri Feb 17 12:39:44 [conn1] creating WriteBackListener for: ********01004.domain.domain.com:27017 serverID: 4f3eba891888d108e13b82a7
Received signal 11
Backtrace: 0x4a5655 0x753940 0x3aada6b0f2 0x2adc9d0de33f 0x47734b90
[0x4a5655]
[0x753940]
/lib64/libc.so.6(fgets_unlocked+0x22)[0x3aada6b0f2]
/lib64/libnss_files.so.2[0x2adc9d0de33f]
[0x47734b90]
===
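
To make the backtrace above easier to read, here is a minimal Python sketch that splits each frame line into module, symbol, offset, and address. It assumes the glibc-style frame layout seen in this log (`module(symbol+offset)[address]`, with module and symbol optional). The names `FRAME_RE` and `parse_frames` are illustrative and not part of any MongoDB tooling.

```python
import re

# Matches the frame shapes that appear in the log above, for example:
#   /lib64/libc.so.6(fgets_unlocked+0x22)[0x3aada6b0f2]
#   /lib64/libnss_files.so.2[0x2adc9d0de33f]
#   [0x4a5655]
FRAME_RE = re.compile(
    r"^(?P<module>[^\[\(\s]+)?"
    r"(?:\((?P<symbol>[^+)]*)(?:\+(?P<offset>0x[0-9a-fA-F]+))?\))?"
    r"\[(?P<address>0x[0-9a-fA-F]+)\]$"
)


def parse_frames(lines):
    """Return one dict per line that looks like a backtrace frame."""
    frames = []
    for line in lines:
        match = FRAME_RE.match(line.strip())
        if match:
            frames.append(match.groupdict())
    return frames


# Tail of the mongos log, copied from the report.
LOG_TAIL = """\
Received signal 11
Backtrace: 0x4a5655 0x753940 0x3aada6b0f2 0x2adc9d0de33f 0x47734b90
[0x4a5655]
[0x753940]
/lib64/libc.so.6(fgets_unlocked+0x22)[0x3aada6b0f2]
/lib64/libnss_files.so.2[0x2adc9d0de33f]
[0x47734b90]
"""

frames = parse_frames(LOG_TAIL.splitlines())
for frame in frames:
    print(frame["address"], frame["module"] or "(unresolved)", frame["symbol"] or "")
```

In this log only two frames resolve to a module, `libc.so.6` (in `fgets_unlocked`) and `libnss_files.so.2`; the remaining addresses are unsymbolized and would need the matching mongos binary to interpret.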



 Comments   
Comment by Theo [ 18/Feb/12 ]

Looks like it is related to https://jira.mongodb.org/browse/SERVER-4167

Confirmed that issue is resolved with 2.0.2.

Comment by Eliot Horowitz (Inactive) [ 18/Feb/12 ]

This is with 2.0.0? Can you try 2.0.2?
