[SERVER-2762] backtrace of mongos crash Created: 15/Mar/11  Updated: 12/Jul/16  Resolved: 17/Mar/11

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: 1.8.0-rc1
Fix Version/s: 1.8.1, 1.9.0

Type: Bug
Priority: Major - P3
Reporter: Vince Busam
Assignee: Eliot Horowitz (Inactive)
Resolution: Done
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Ubuntu 9.10, 64bit. 10gen packages.


Operating System: Linux
Participants:

 Description   

Sat Mar 12 15:49:34 [mongosMain] connection accepted from 192.168.222.5:60022 #302810
Sat Mar 12 15:49:37 [mongosMain] connection accepted from 192.168.222.5:60124 #302811
Sat Mar 12 15:49:39 [conn302802] end connection 192.168.222.5:56095
Sat Mar 12 15:49:43 [conn71914] MessagingPort say send() errno:32 Broken pipe 192.168.222.5:36258
Sat Mar 12 15:49:43 [conn71914] DBException in process: socket exception
Sat Mar 12 15:49:43 [conn71914] MessagingPort say send() errno:32 Broken pipe 192.168.222.5:36258
Sat Mar 12 15:49:43 [conn71949] MessagingPort say send() errno:32 Broken pipe 192.168.222.5:49664
Sat Mar 12 15:49:43 [conn71949] DBException in process: socket exception
Sat Mar 12 15:49:44 [conn71914] unclean socket shutdown from: 192.168.222.5:36258
Received signal 11
Backtrace: 0x531975 0x7f487789b530 0x67b3b2 0x581a12 0x6a2430 0x7f4878389a04 0x7f4877947d4d
/usr/bin/mongos(_ZN5mongo17printStackAndExitEi+0x75)[0x531975]
/lib/libc.so.6[0x7f487789b530]
/usr/bin/mongos(_ZN5mongo21ShardedMessageHandler7processERNS_7MessageEPNS_21AbstractMessagingPortE+0x392)[0x67b3b2]
/usr/bin/mongos(_ZN5mongo3pms9threadRunEPNS_13MessagingPortE+0x272)[0x581a12]
/usr/bin/mongos(thread_proxy+0x80)[0x6a2430]
/lib/libpthread.so.0[0x7f4878389a04]
/lib/libc.so.6(clone+0x6d)[0x7f4877947d4d]
===

Received signal 11
Backtrace: 0x531975 0x7f487789b530 0x4e879c 0x4efe0e 0x4f05c6 0x4f1254 0x61b9f4 0x638a16 0x6680bc 0x67b161 0x581a12 0x6a2430 0x7f4878389a04 0x7f4877947d4d
/usr/bin/mongos(_ZN5mongo17printStackAndExitEi+0x75)[0x531975]
/lib/libc.so.6[0x7f487789b530]
/usr/bin/mongos(_ZN5mongo10FieldRangeC1ERKNS_11BSONElementEbb+0x18c)[0x4e879c]
/usr/bin/mongos(_ZN5mongo13FieldRangeSet17processQueryFieldERKNS_11BSONElementEb+0x5e)[0x4efe0e]
/usr/bin/mongos(_ZN5mongo13FieldRangeSetC1EPKcRKNS_7BSONObjEb+0x1c6)[0x4f05c6]
/usr/bin/mongos(_ZN5mongo15FieldRangeOrSetC1EPKcRKNS_7BSONObjEb+0x24)[0x4f1254]
/usr/bin/mongos(_ZN5mongo12ChunkManager17getShardsForQueryERSt3setINS_5ShardESt4lessIS2_ESaIS2_EERKNS_7BSONObjE+0x64)[0x61b9f4]
/usr/bin/mongos(_ZN5mongo13ShardStrategy7queryOpERNS_7RequestE+0x2f6)[0x638a16]
/usr/bin/mongos(_ZN5mongo7Request7processEi+0x29c)[0x6680bc]
/usr/bin/mongos(_ZN5mongo21ShardedMessageHandler7processERNS_7MessageEPNS_21AbstractMessagingPortE+0x141)[0x67b161]
/usr/bin/mongos(_ZN5mongo3pms9threadRunEPNS_13MessagingPortE+0x272)[0x581a12]
/usr/bin/mongos(thread_proxy+0x80)[0x6a2430]
/lib/libpthread.so.0[0x7f4878389a04]
/lib/libc.so.6(clone+0x6d)[0x7f4877947d4d]
===
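
The frames above follow the module(symbol+offset)[address] layout, with Itanium C++ ABI mangled names. The following is a minimal Python sketch for splitting such a line into its parts and recovering the qualified function name. The helpers parse_frame and qualified_name are hypothetical names introduced here for illustration and are not part of mongos or any MongoDB tooling. The name decoder only handles plain nested names and constructors (C1/C2/C3); templates and argument types are ignored, so a full demangler such as c++filt is the better choice for anything beyond a quick read.

```python
import re

# Matches frame lines such as
#   /usr/bin/mongos(_ZN5mongo7Request7processEi+0x29c)[0x6680bc]
#   /lib/libc.so.6[0x7f487789b530]
FRAME_RE = re.compile(
    r"^(?P<module>[^\s(\[]+)"
    r"(?:\((?P<symbol>[^+)]*)(?:\+(?P<offset>0x[0-9a-fA-F]+))?\))?"
    r"\[(?P<address>0x[0-9a-fA-F]+)\]$"
)


def parse_frame(line):
    """Split one backtrace line into module, symbol, offset, address.

    Returns None for lines that are not frames (for example "===").
    """
    m = FRAME_RE.match(line.strip())
    if m is None:
        return None
    return {
        "module": m.group("module"),
        "symbol": m.group("symbol") or None,
        "offset": int(m.group("offset"), 16) if m.group("offset") else None,
        "address": int(m.group("address"), 16),
    }


def qualified_name(symbol):
    """Decode the nested-name prefix of a mangled symbol (assumption:
    only _ZN<len><name>... forms; anything else is returned unchanged)."""
    if not symbol.startswith("_ZN"):
        return symbol
    i, parts = 3, []
    # Each component is a decimal length followed by that many characters.
    while i < len(symbol) and symbol[i].isdigit():
        j = i
        while j < len(symbol) and symbol[j].isdigit():
            j += 1
        length = int(symbol[i:j])
        parts.append(symbol[j:j + length])
        i = j + length
    if not parts:
        return symbol
    # C1/C2/C3 mark constructors, which share the class name.
    if symbol[i:i + 2] in ("C1", "C2", "C3"):
        parts.append(parts[-1])
    return "::".join(parts)


if __name__ == "__main__":
    frame = parse_frame(
        "/usr/bin/mongos(_ZN5mongo10FieldRangeC1ERKNS_11BSONElementEbb+0x18c)[0x4e879c]"
    )
    print(qualified_name(frame["symbol"]), hex(frame["offset"]))
```

Applied to the second trace, this reads the faulting frame as the mongo::FieldRange constructor, reached from mongo::ChunkManager::getShardsForQuery.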



 Comments   
Comment by Eliot Horowitz (Inactive) [ 17/Mar/11 ]

https://github.com/mongodb/mongo/commit/40b9bee42bc3302515028c9053823569eb85467c
https://github.com/mongodb/mongo/commit/802c11fbc082a93a6ebd2beb4aee3012916d75dc

Comment by Vince Busam [ 16/Mar/11 ]

Those were back-to-back in the log file. That was the last log before restarting. The logfile is quite large, so I'll give more excerpts.
Before this crash, the log looks like a lot of this:
Sat Mar 12 15:48:51 [mongosMain] connection accepted from 192.168.222.5:57739 #302804
Sat Mar 12 15:48:56 [mongosMain] connection accepted from 192.168.222.5:58076 #302805
Sat Mar 12 15:49:03 [mongosMain] connection accepted from 192.168.222.5:58565 #302806
Sat Mar 12 15:49:13 [conn302773] end connection 192.168.222.5:49335
Sat Mar 12 15:49:20 [conn302791] end connection 192.168.222.5:54774
Sat Mar 12 15:49:20 [mongosMain] connection accepted from 192.168.222.5:59447 #302807
Sat Mar 12 15:49:20 [conn302807] end connection 192.168.222.5:59447
Sat Mar 12 15:49:29 [mongosMain] connection accepted from 192.168.222.5:59769 #302808
Sat Mar 12 15:49:29 [conn302808] end connection 192.168.222.5:59769
Sat Mar 12 15:49:30 [conn302797] end connection 192.168.222.5:55693
Sat Mar 12 15:49:30 [mongosMain] connection accepted from 192.168.222.5:59819 #302809
Sat Mar 12 15:49:30 [conn302809] end connection 192.168.222.5:59819
Sat Mar 12 15:49:30 [conn302801] end connection 192.168.222.5:56094
Sat Mar 12 15:49:33 [conn302790] end connection 192.168.222.5:54768

Every 5 minutes this:
Sat Mar 12 15:45:11 [LockPinger] dist_lock pinged successfully for: rack6.citizennet.com:1299613398:1804289383

Nothing else for hours before the crash.

Comment by Eliot Horowitz (Inactive) [ 16/Mar/11 ]

To be clear, are those two different stack traces from different times, or did one come right after the other?
Can you attach the entire log?

Generated at Thu Feb 08 03:01:07 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.