[SERVER-58114] mongos crashed after 180 days Created: 28/Jun/21 Updated: 27/Oct/23 Resolved: 04/Jul/21 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | 4.2.11 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | li zhang | Assignee: | Dmitry Agranat |
| Resolution: | Community Answered | Votes: | 0 |
| Labels: | Bug |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Attachments: |
|
| Operating System: | ALL |
| Participants: |
| Description |
|
crash log: CONTROL [signalProcessingThread] got signal 15 (Terminated), will terminate after current cmd ends
Mongo version: MongoDB shell version v4.2.11
Linux version: Linux version 3.10.0-1160.6.1.el7.x86_64 (mockbuild@kbuilder.bsys.centos.org) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-44) (GCC) ) #1 SMP Tue Nov 17 13:59:11 UTC 2020 |
| Comments |
| Comment by Dmitry Agranat [ 04/Jul/21 ] |
|
This is not enough information to determine what external service issued SIGTERM. I suggest looking at syslog, messages, dmesg to try to figure this out. As this does not seem to be an issue related to MongoDB, I will go ahead and close this case. Regards, |
| Comment by li zhang [ 30/Jun/21 ] |
|
Thanks Dima! Maybe mongo received signal 15 from itself. Here are some more details, and I have uploaded new logs. We have 6 config servers and 2 mongos; some of the servers got signal 15.
The other servers are OK. Our app was OK until this error appeared:
2021-06-10 19:08:46:666 Timed out after 30000 ms while waiting for a server that matches com.mongodb.client.internal.MongoClientDelegate$1@28afd302. Client view of cluster state is {type=SHARDED, servers=[{address=10.27.0.16:23001, type=UNKNOWN, state=CONNECTING, exception= {com.mongodb.MongoSocketReadTimeoutException: Timeout while receiving message}, caused by {java.net.SocketTimeoutException: Read timed out}}, {address=10.27.0.15:23001, type=UNKNOWN, state=CONNECTING, exception= {com.mongodb.MongoSocketReadTimeoutException: Timeout while receiving message}, caused by {java.net.SocketTimeoutException: Read timed out}}].
I am sure the mongos process is still alive: we can connect to it, but it has not served requests since 2021-06-10 19:08:04 GMT+8 (the HMAC key's expiresAt is 1623323284).
configs:PRIMARY> db.system.keys.find()
{ "_id" : NumberLong("6905325084227928095"), "purpose" : "HMAC", "key" : BinData(0,"KkNA0g9Nkevn1T6KF4CRCfUNAfU="), "expiresAt" : Timestamp(1615547284, 0) }
{ "_id" : NumberLong("6905325084227928096"), "purpose" : "HMAC", "key" : BinData(0,"4Rx9aW2uxjfKwG3pztFPfw4HqVg="), "expiresAt" : Timestamp(1623323284, 0) }
Thanks, li zhang |
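For anyone checking the timestamps in the comment above, here is a minimal Python sketch that converts the two expiresAt values from the db.system.keys.find() output into GMT+8 and measures the gap between them. The variable and function names are only illustrative, and the sketch assumes the first field of Timestamp(...) is seconds since the Unix epoch. It only checks the arithmetic; it does not show that key expiry caused the outage.

```python
from datetime import datetime, timedelta, timezone

# expiresAt seconds copied from the db.system.keys.find() output in the ticket.
KEY_EXPIRY_SECONDS = [1615547284, 1623323284]

# GMT+8 is the offset the reporter used when quoting the outage time.
GMT_PLUS_8 = timezone(timedelta(hours=8))


def to_local(epoch_seconds):
    """Convert epoch seconds to an aware datetime in GMT+8."""
    return datetime.fromtimestamp(epoch_seconds, tz=GMT_PLUS_8)


first_expiry, last_expiry = (to_local(s) for s in KEY_EXPIRY_SECONDS)

# Whole days between the two key expiry times.
gap_days = (last_expiry - first_expiry).days

print(first_expiry.isoformat())
print(last_expiry.isoformat())
print(gap_days)
```

The later value converts to 2021-06-10 19:08:04 GMT+8, the same time the reporter says mongos stopped serving requests, and the two keys expire 90 days apart.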
| Comment by Dmitry Agranat [ 29/Jun/21 ] |
|
This is not a crash - a SIGTERM (signal 15) means that something is killing the mongod process and it is exiting normally (you can do this manually by using the kill command from the console/command line). The real question is: what is sending the signal to the mongod process? Unfortunately, with the information you have provided, we cannot determine the source from the logs, and it can be quite difficult to do so in general. We suggest checking for any external process monitoring, or watchdog-type processes on this server that might be killing mongod. Thanks, |
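To illustrate the point that signal 15 is delivered from outside the process, here is a small Python sketch that sends SIGTERM to a throwaway child process on a POSIX system, much as the kill command would. The sleeping child is only a stand-in and has no signal handler, so it is ended by the signal directly, whereas the log line in the description shows the server catching the signal and shutting down after the current command.

```python
import signal
import subprocess
import sys

# Start a throwaway child that just sleeps (a stand-in, not a MongoDB process).
child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])

# Same effect as `kill <pid>` from a shell: kill sends SIGTERM by default.
child.send_signal(signal.SIGTERM)
child.wait(timeout=10)

# On POSIX, a return code of -N means the child was ended by signal N.
print(int(signal.SIGTERM), child.returncode)
```

Nothing in the child's own output identifies the sender, which is why the sender has to be looked for elsewhere on the host.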