[SERVER-49001] Mongos router hangs Created: 22/Jun/20  Updated: 20/Aug/21  Resolved: 20/Aug/21

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: 03shady . Assignee: Edwin Zhou
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Operating System: ALL
Participants:

 Description   

Hello,

We have a sharded cluster in our test environment, and we sometimes run into a problem with the mongo router (mongos).

Developers are unable to connect to the database. When I check on the server side, I am not able to run sh.status() or show dbs; the command hangs. After a mongos restart everything works fine.

I checked, and both replica sets, rs0 and rs1, are reachable and work fine.

Please see some logs from mongos below:

 

 
2020-06-22T07:00:00.003+0000 I CONNPOOL [TaskExecutorPool-0] Connecting to xxx-test-mongodb04:27017
2020-06-22T07:00:01.002+0000 I CONNPOOL [TaskExecutorPool-0] Connecting to xxx-test-mongodb01:27017
2020-06-22T07:05:00.113+0000 I CONNPOOL [TaskExecutorPool-0] Dropping all pooled connections to xxx-test-mongodb01:27017 due to ShutdownInProgress: Pool for xxx-test-mongodb01:27017 has expired.
2020-06-22T07:05:00.113+0000 I CONNPOOL [TaskExecutorPool-0] Dropping all pooled connections to xxx-test-mongodb04:27017 due to ShutdownInProgress: Pool for xxx-test-mongodb04:27017 has expired.
2020-06-22T07:32:34.724+0000 I NETWORK [listener] connection accepted from 10.xx5.xx.218:59230 #1528 (6 connections now open)
2020-06-22T07:32:34.739+0000 I NETWORK [conn1528] received client metadata from 1x.15x.93.2xx:59230 conn1528: { application: { name: "robo3t" }, driver: { name: "MongoDB Internal Client", version: "4.0.5-17-gd808df2233" }, os: { type: "Windows", name: "Microsoft Windows 8", architecture: "x86_64", version: "6.2 (build 9200)" } }
2020-06-22T07:32:34.772+0000 I CONNPOOL [ShardRegistry] Connecting to xxx-test-mongodb01:27017
2020-06-22T07:34:09.375+0000 I - [conn1528] operation was interrupted because a client disconnected
2020-06-22T07:36:33.346+0000 I NETWORK [listener] connection accepted from 192.xxx.0.9:34634 #1530 (7 connections now open)
2020-06-22T07:36:33.346+0000 I NETWORK [conn1530] received client metadata from 192.16x.0.xx:34634 conn1530: { application: { name: "MongoDB Shell" }, driver: { name: "MongoDB Internal Client", version: "4.2.0" }, os: { type: "Linux", name: "Oracle Linux Server release 7.7", architecture: "x86_64", version: "Kernel 4.1.12-124.33.4.el7uek.x86_64" } }
2020-06-22T07:36:35.571+0000 I CONNPOOL [ShardRegistry] Connecting to xxx-test-mongodb04:27017
2020-06-22T07:36:36.614+0000 I NETWORK [listener] connection accepted from 192.168.x.x:34636 #1532 (8 connections now open)
2020-06-22T07:36:36.614+0000 I NETWORK [conn1532] received client metadata from 192.xxx.0.9:34636 conn1532: { application: { name: "MongoDB Shell" }, driver: { name: "MongoDB Internal Client", version: "4.2.0" }, os: { type: "Linux", name: "Oracle Linux Server release 7.7", architecture: "x86_64", version: "Kernel 4.1.12-124.33.4.el7uek.x86_64" } }
2020-06-22T07:36:36.615+0000 I SH_REFR [ConfigServerCatalogCacheLoader-1769] Refresh for database admin from version {} to version { uuid: UUID("2ac64310-c396-4d51-933a-d7171c3c6ab9"), lastMod: 0 } took 0 ms

2020-06-22T08:25:06.029+0000 I NETWORK [listener] connection accepted from 192.168.0.9:34735 #1546 (17 connections now open)
2020-06-22T08:25:06.030+0000 I NETWORK [conn1546] received client metadata from 192.168.0.9:34735 conn1546: { application: { name: "MongoDB Shell" }, driver: { name: "MongoDB Internal Client", version: "4.2.0" }, os: { type: "Linux", name: "Oracle Linux Server release 7.7", architecture: "x86_64", version: "Kernel 4.1.12-124.33.4.el7uek.x86_64" } }
2020-06-22T08:25:17.330+0000 I CONNPOOL [ShardRegistry] Connecting to xxx-test-mongodb01:27017
 
  

These lines are from tail -f, captured while I try to run show dbs:

 
 

2020-06-22T08:25:17.330+0000 I CONNPOOL [ShardRegistry] Connecting to xxx-test-mongodb01:27017
 2020-06-22T08:30:17.349+0000 I CONNPOOL [ShardRegistry] Dropping all pooled connections to xxx-test-mongodb01:27017 due to ShutdownInProgress: Pool for xxx-test-mongodb01:27017 has expired.
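
In the excerpts above, each "Dropping all pooled connections" line appears roughly five minutes after the matching "Connecting to" line for the same host. A minimal Python sketch for measuring that gap from a mongos log follows. It assumes the plain-text log layout seen in this ticket (timestamp, severity, component, bracketed context, message). The function name pool_lifetimes is hypothetical and not part of any MongoDB tool. It only reports timings and does not diagnose the hang.

```python
import re
from datetime import datetime

# Matches CONNPOOL lines in the plain-text layout shown in this ticket, e.g.
# "2020-06-22T08:25:17.330+0000 I CONNPOOL [ShardRegistry] Connecting to host:27017"
LINE_RE = re.compile(
    r"^\s*(?P<ts>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3}[+-]\d{4})"
    r"\s+\S+\s+CONNPOOL\s+\[(?P<ctx>[^\]]+)\]\s+(?P<msg>.*)$"
)
CONNECT_RE = re.compile(r"^Connecting to (?P<host>\S+)")
DROP_RE = re.compile(r"^Dropping all pooled connections to (?P<host>\S+)")


def pool_lifetimes(lines):
    """Pair each 'Connecting to' line with the next 'Dropping all pooled
    connections' line for the same (context, host).

    Returns a list of (context, host, seconds_between) tuples.
    """
    pending = {}   # (context, host) -> timestamp of the latest connect line
    results = []
    for line in lines:
        m = LINE_RE.match(line)
        if m is None:
            continue  # not a CONNPOOL line in the expected layout
        ts = datetime.strptime(m.group("ts"), "%Y-%m-%dT%H:%M:%S.%f%z")
        ctx, msg = m.group("ctx"), m.group("msg")

        connect = CONNECT_RE.match(msg)
        if connect:
            pending[(ctx, connect.group("host"))] = ts
            continue

        drop = DROP_RE.match(msg)
        if drop:
            start = pending.pop((ctx, drop.group("host")), None)
            if start is not None:
                elapsed = (ts - start).total_seconds()
                results.append((ctx, drop.group("host"), elapsed))
    return results
```

To use it, pass an iterable of log lines, for example the lines of an opened mongos.log file (the path depends on the deployment). A connect line with no later drop line is ignored.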
 
  


 

[mongodb@mongodb-router]$ telnet xxx-test-mongodb01 27017
 Trying xxx.168.3.xx...
 Connected to xxx-test-mongodb01.
 Escape character is '^]'.
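
The telnet check above confirms only that the shard's TCP port accepts a connection from the router host. It says nothing about the state of the connection pool inside mongos. A minimal Python sketch of the same reachability check follows, for anyone who wants to script it across several hosts. The function name can_connect and the 5-second default timeout are assumptions made for illustration.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    This is a plain TCP check, equivalent in spirit to the telnet test;
    it does not speak the MongoDB wire protocol.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

For example, can_connect("xxx-test-mongodb01", 27017) uses the redacted host name from this ticket as a placeholder; replace it with the real shard host when running it from the mongos machine.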
 
  

Could you please help with this problem?

 



 Comments   
Comment by Edwin Zhou [ 20/Aug/21 ]

Hi shady03@gmail.com,

We haven’t heard back from you for some time, so I’m going to close this ticket. If this is still an issue for you, please provide additional information and we will reopen the ticket.

Best,
Edwin

Comment by Edwin Zhou [ 12/Aug/21 ]

Hi shady03@gmail.com,

We still need additional information to diagnose the problem. If this is still an issue for you, would you please archive (tar or zip) the mongod.log and mongos.log files and the $dbpath/diagnostic.data directory (the contents are described here) covering this behavior and upload them to this support uploader location?

Best,
Edwin

Comment by Carl Champain (Inactive) [ 23/Jun/20 ]

Hi shady03@gmail.com,

Thank you for the report.

Would you please archive (tar or zip) the mongod.log and mongos.log files and the $dbpath/diagnostic.data directory (the contents are described here) covering this behavior and upload them to this support uploader location?

Files uploaded to this portal are visible only to MongoDB employees and are routinely deleted after some time.

Kind regards,
Carl
