[SERVER-53540] DBException handling request, closing client connection: ClientDisconnect: operation was interrupted Created: 30/Dec/20  Updated: 01/Jun/22  Resolved: 12/Jan/21

Status: Closed
Project: Core Server
Component/s: Logging
Affects Version/s: 4.2.8
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Sourabh Ghosh Assignee: Edwin Zhou
Resolution: Duplicate Votes: 2
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: PNG File Screen Shot 2021-01-11 at 12.59.32 PM.png     PNG File Screen Shot 2021-01-11 at 12.59.41 PM.png    
Issue Links:
Duplicate
duplicates SERVER-52654 new signing keys not generated by the... Closed
Operating System: ALL
Participants:
Case:

 Description   

Hello Team,
We recently hit an issue on our production system, running MongoDB 4.2.8, in which mongos stopped communicating with the cluster.
When we tried to log in through mongos, the shell got stuck at
Connecting to 127.0.0.1:27017 and did not proceed any further. We checked the mongos logs and found this network error:
NETWORK [conn1524750] DBException handling request, closing client connection: ClientDisconnect: operation was interrupted.
We tried restarting and redeploying mongos, but the issue remained.
Network connectivity itself was fine: telnet worked, and all nodes in the shards and the config servers were intact.
We also checked the config server logs, which reported "ssl handshake received but server is started without ssl support",
although we do not use certificate/SSL-based authentication, only simple keyfile-based authentication.
Later we stepped down the primary on the config server, and everything returned to normal once a new primary was elected.

We found similar reports in which stepping down or restarting the config server fixed the issue. Is there any fix other than upgrading? This seems to be a serious issue, so kindly let us know.
The MongoDB version we are using is 4.2.8.
https://jira.mongodb.org/browse/SERVER-47553
https://jira.mongodb.org/browse/SERVER-48709
https://jira.mongodb.org/browse/SERVER-52654

 

 



 Comments   
Comment by Edwin Zhou [ 12/Jan/21 ]

Hi aayushi.mangal@visiblealpha.com,

Thanks for providing the output.

{
  "_id" : NumberLong("6844829550740766751"),
  "purpose" : "HMAC",
  "expiresAt" : ISODate("2020-12-29T10:34:30Z")
}

The expiration date matches the timestamp when the incident began, which confirms that this is SERVER-52654. I'll be closing the issue as a duplicate.
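
The comparison described here, matching a key's expiry against the time the errors began, can be sketched in plain JavaScript. This is only an illustration under stated assumptions: `keysExpiredJustBefore` and the 60-second window are hypothetical choices, not part of MongoDB or the mongo shell, and the incident time is taken from the mongos log line quoted elsewhere in this ticket.

```javascript
// Hypothetical helper (not a MongoDB API): returns the keys whose expiresAt
// falls within windowMs milliseconds before the incident time.
// keys: array of { _id, purpose, expiresAt } where expiresAt is a Date.
function keysExpiredJustBefore(keys, incidentTime, windowMs) {
  return keys.filter((k) => {
    const delta = incidentTime.getTime() - k.expiresAt.getTime();
    return delta >= 0 && delta <= windowMs;
  });
}

// Values copied from this ticket. The _id values are kept as strings because
// they are too large to be represented exactly as JavaScript numbers.
const keys = [
  { _id: "6844829550740766750", purpose: "HMAC", expiresAt: new Date("2020-09-30T10:34:30Z") },
  { _id: "6844829550740766751", purpose: "HMAC", expiresAt: new Date("2020-12-29T10:34:30Z") },
  { _id: "6911641069859897348", purpose: "HMAC", expiresAt: new Date("2021-03-29T11:37:19Z") },
  { _id: "6911641069859897349", purpose: "HMAC", expiresAt: new Date("2021-06-27T11:37:19Z") },
];

// Timestamp of the mongos log line reported in this ticket.
const incident = new Date("2020-12-29T10:34:50.272Z");

// Assumed window of 60 seconds; adjust to taste.
const matches = keysExpiredJustBefore(keys, incident, 60 * 1000);
console.log(matches.map((k) => k._id));
```

With the values from this ticket, only the key expiring at 2020-12-29T10:34:30Z falls inside the window; the logged error is about 20 seconds later.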

MongoDB 4.2.12 has not yet been released, but I expect it to be released later this month.

Best,
Edwin

Comment by Aayushi Mangal [ 12/Jan/21 ]

Thank you, Edwin, for confirming.

Please find the requested details below:

mongos> db.getSiblingDB("admin").system.keys.find().map(k => { return { _id: k._id, purpose: k.purpose, expiresAt: new Date(k.expiresAt.getTime()*1000) }})
[
 {
 "_id" : NumberLong("6844829550740766750"),
 "purpose" : "HMAC",
 "expiresAt" : ISODate("2020-09-30T10:34:30Z")
 },
 {
 "_id" : NumberLong("6844829550740766751"),
 "purpose" : "HMAC",
 "expiresAt" : ISODate("2020-12-29T10:34:30Z")
 },
 {
 "_id" : NumberLong("6911641069859897348"),
 "purpose" : "HMAC",
 "expiresAt" : ISODate("2021-03-29T11:37:19Z")
 },
 {
 "_id" : NumberLong("6911641069859897349"),
 "purpose" : "HMAC",
 "expiresAt" : ISODate("2021-06-27T11:37:19Z")
 }
]

 

I would also like to confirm: since 4.2.12 is indicated as the fixed version and we need to upgrade to it, could you please point me to that release? I could not find it.

Comment by Edwin Zhou [ 11/Jan/21 ]

Hi aayushi.mangal@visiblealpha.com,

Thank you for uploading the logs and data that I requested. I agree with your colleague's suspicion that you're hitting SERVER-52654. To confirm, we can compare the expiry dates of your HMAC keys and the timestamp of when the issue begins.

Would you send the output of

db.getSiblingDB("admin").system.keys.find().map(k => { return { _id: k._id, purpose: k.purpose, expiresAt: new Date(k.expiresAt.getTime()*1000) }})

for the mongos that hit this issue as detailed in the user summary box for SERVER-52654?

Thanks,
Edwin

Comment by Aayushi Mangal [ 08/Jan/21 ]

Hi Edwin,

The mongos log has also been uploaded to your portal.

2020-12-29T10:34:50.272+0000 I NETWORK [conn8593003] DBException handling request, closing client connection: ClientDisconnect: operation was interrupted
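
To pull the exact timestamp out of a line like the one above, a small parser along these lines may help. It is a rough sketch: `parseMongoLogLine` is a hypothetical helper, and the regular expression assumes the layout seen in this ticket (timestamp with a numeric UTC offset, severity, component, bracketed context, message).

```javascript
// Hypothetical helper (not a MongoDB API): splits a log line of the form
// "<timestamp> <severity> <component> [<context>] <message>" into its parts.
// Returns null when the line does not match the assumed layout.
function parseMongoLogLine(line) {
  const pattern =
    /^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3})([+-]\d{2}):?(\d{2})\s+(\S)\s+(\S+)\s+\[([^\]]+)\]\s+(.*)$/;
  const m = pattern.exec(line);
  if (!m) {
    return null;
  }
  return {
    // Rebuild the offset as +HH:mm so the string is a standard ISO 8601 date-time.
    timestamp: new Date(`${m[1]}${m[2]}:${m[3]}`),
    severity: m[4],
    component: m[5],
    context: m[6],
    message: m[7],
  };
}

const line =
  "2020-12-29T10:34:50.272+0000 I NETWORK [conn8593003] DBException handling request, " +
  "closing client connection: ClientDisconnect: operation was interrupted";

const entry = parseMongoLogLine(line);
console.log(entry.timestamp.toISOString(), entry.component, entry.context);
```

The resulting `timestamp` is an ordinary `Date`, so it can be compared directly against other dates, such as key expiry times.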

Comment by Edwin Zhou [ 07/Jan/21 ]

Hi aayushi.mangal@visiblealpha.com,

Thank you, I have received the files in the upload portal. Do you have the exact timestamp when the network error is thrown?

Best,
Edwin

Comment by Aayushi Mangal [ 07/Jan/21 ]

Hi Edwin,

Please confirm whether you have received the files; I have uploaded them to your portal.

Comment by Edwin Zhou [ 06/Jan/21 ]

Hi aayushi.mangal@visiblealpha.com,

I'm unable to find your attached diagnostics in the attachments field. I've created a secure upload portal for you where you can upload your diagnostic data. Files uploaded to this portal are visible only to MongoDB employees and are routinely deleted after some time.

Thanks,
Edwin

Comment by Aayushi Mangal [ 06/Jan/21 ]

Hi Edwin,

Thank you for your response.

Please find attached the diagnostic details from one mongos, one config server, and one shard server in our cluster.

We have 8 shards with 3 members each, 3 config servers, and multiple mongos instances in our cluster. Also, please make sure the shared directory is kept private to the MongoDB team only.

Let us know in case we are missing anything.

Comment by Edwin Zhou [ 05/Jan/21 ]

Hi sourabh.ghosh@visiblealpha.com,

Would you please archive (tar or zip) the $dbpath/diagnostic.data directory (the contents are described here) and attach it to this ticket?

Best,
Edwin
