Type: Bug
Resolution: Duplicate
Priority: Major - P3
Fix Version/s: None
Affects Version/s: 4.2.5, 4.2.9
Component/s: None
Labels: None

Operating System: ALL
Case:
CAR Domain/s: None

Aha! Reference: None
Tracking Level: None
Risk Status: None
Exec Notes: None
Goal Name(s): None
Goal Link: None

Hello,

This issue is happening to us in several PRODUCTION environments and it's very serious.

From time to time, the mongos service hangs: applications are unable to connect to ANY of the mongos servers, and each connection attempt waits and eventually times out.

System.TimeoutException: A timeout occured after 30000ms selecting a server using CompositeServerSelector{ Selectors = MongoDB.Driver.MongoClient+AreSessionsSupportedServerSelector, LatencyLimitingServerSelector{ AllowedLatencyRange = 00:00:00.0150000 } }. Client view of cluster state is { ClusterId : "1", ConnectionMode : "Automatic", Type : "Unknown", State : "Disconnected", Servers : [{ ServerId: "{ ClusterId : 1, EndPoint : "10.120.32.68:27017" }", EndPoint: "10.120.32.68:27017", ReasonChanged: "Heartbeat", State: "Disconnected", ServerVersion: , TopologyVersion: , Type: "Unknown", HeartbeatException: "MongoDB.Driver.MongoConnectionException: An exception occurred while opening a connection to the server. ---> MongoDB.Driver.MongoConnectionException: An exception occurred while receiving a message from the server. ---> System.TimeoutException: The operation has timed out.

I connected to the mongos host via SSH and tried logging in to mongos from there, but the result was the same.

In the mongos logs, the following entries repeat over and over from the moment the problem started:

2020-12-12T08:06:03.901Z I - [conn1257891] operation was interrupted because a client disconnected 
2020-12-12T08:06:03.901Z I NETWORK [conn1257891] DBException handling request, closing client connection: ClientDisconnect: operation was interrupted 
2020-12-12T08:06:03.905Z I NETWORK [conn1302432] received client metadata from 10.248.127.193:18473 conn1302432: { driver: { name: "mongo-csharp-driver", version: "2.11.3.0" }, os: { type: "Linux", name: "Linux 4.15.0-64-generic #73-Ubuntu SMP T
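
To quantify how often these disconnects repeat, the log lines can be parsed and counted per minute. The following is a minimal sketch, assuming the plain-text layout seen in the excerpt above (timestamp, severity, component, bracketed context, message); the function names and the `SAMPLE` list are illustrative, not part of any MongoDB tool.

```python
import re
from collections import Counter

# Assumed layout, taken from the excerpt above:
# <timestamp> <severity> <component> [<context>] <message>
LOG_LINE = re.compile(
    r"^(?P<ts>\S+)\s+(?P<severity>\S+)\s+(?P<component>\S+)\s+"
    r"\[(?P<context>[^\]]+)\]\s+(?P<message>.*)$"
)


def parse_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = LOG_LINE.match(line.strip())
    return match.groupdict() if match else None


def count_client_disconnects(lines):
    """Count messages mentioning ClientDisconnect, grouped by minute."""
    counts = Counter()
    for line in lines:
        entry = parse_line(line)
        if entry and "ClientDisconnect" in entry["message"]:
            # The first 16 characters of the timestamp are YYYY-MM-DDTHH:MM.
            counts[entry["ts"][:16]] += 1
    return counts


# Two lines copied from the mongos log excerpt in this report.
SAMPLE = [
    "2020-12-12T08:06:03.901Z I - [conn1257891] operation was interrupted "
    "because a client disconnected",
    "2020-12-12T08:06:03.901Z I NETWORK [conn1257891] DBException handling "
    "request, closing client connection: ClientDisconnect: operation was interrupted",
]

disconnects_per_minute = count_client_disconnects(SAMPLE)
```

Running the same function over the full mongos log from the attached archive would show whether the disconnect rate jumps at the time the hang begins.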

The issue is resolved completely when I log in to the primary config server and run the rs.stepDown() command. Once the config primary changes, everything goes back to normal and connections are accepted again.
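
For reference, the same workaround could be scripted rather than typed into the shell. This is only a sketch: it assumes a PyMongo `MongoClient` connected directly to the current config server primary, and it relies on `replSetStepDown`, the server command behind the `rs.stepDown()` shell helper. The function name and the 60-second default are illustrative choices.

```python
def step_down_config_primary(client, step_down_secs=60):
    """Ask the node that `client` is connected to to step down as primary.

    `client` is assumed to be a pymongo.MongoClient connected directly to the
    current config server primary (an assumption of this sketch).
    """
    # replSetStepDown makes the primary step down and stay ineligible for
    # re-election for `step_down_secs` seconds.
    return client.admin.command("replSetStepDown", step_down_secs)
```

This only automates the manual step described above; it does not address the underlying cause of the hang.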

These are the logs that appear in the cfg primary server at the same time:

2020-12-12T08:06:53.800Z I SHARDING [PeriodicShardedIndexConsistencyChecker] Checking consistency of sharded collection indexes across the cluster 
2020-12-12T08:06:53.837Z I SHARDING [PeriodicShardedIndexConsistencyChecker] Found 0 collections with inconsistent indexes 
2020-12-12T08:07:15.995Z I NETWORK [listener] connection accepted from 10.124.128.43:43410 #320308 (26 connections now open)

This issue first occurred for us in version 4.2.5. I thought it was similar to https://jira.mongodb.org/browse/SERVER-47553, so I upgraded to version 4.2.9, but it keeps happening, and in completely different clusters, which indicates that it is not specific to one server or OS.

I've defined this issue as Blocker - P1 since it is affecting multiple PROD environments.
The logs from the mongos and the config primary server are attached.

mongod_mongos_logs.zip
Dec 12 2020 09:47:06 AM UTC
415 kB
Ezra Levi
Screen Shot 2020-12-22 at 3.35.01 PM.png
Dec 22 2020 09:25:12 PM UTC
532 kB
Edwin Zhou
Screen Shot 2020-12-22 at 3.37.33 PM.png
Dec 22 2020 09:25:12 PM UTC
522 kB
Edwin Zhou
Screen Shot 2020-12-22 at 4.21.37 PM.png
Dec 22 2020 09:25:12 PM UTC
52 kB
Edwin Zhou

Duplicates: SERVER-52654 new signing keys not generated by the monitoring-keys-for-HMAC thread (Closed)

Assignee: Edwin Zhou
Reporter: Ezra Levi
Participants: Edwin Zhou, Eric Sedor, Ezra Levi
Votes: 0
Watchers: 11

Created: Dec 12 2020 09:49:47 AM UTC
Updated: Feb 04 2021 06:39:00 PM UTC
Resolved: Jan 26 2021 04:31:13 PM UTC
