Type: Task
Resolution: Incomplete
Priority: Major - P3
Affects Version/s: 3.4.0
Component/s: Sharding
We have a setup of 3 shards, with each shard having 3 replicas.
We also recently migrated from Ubuntu (Ubuntu 16.04.5 LTS) to CentOS (CentOS Linux release 7.6.1810).
Our mongo version is:
MongoDB shell version v3.4.0
git version: f4240c60f005be757399042dc12f6addbc3170c1
OpenSSL version: OpenSSL 1.0.1e-fips 11 Feb 2013
Problem:
Under a normal workload the shards perform well. There is a special case, though: when we get a batch of updates/inserts, the shards freeze completely. All diagnostic logging is frozen as well. As soon as the updates/inserts complete (which takes a while), the shards come back with normal response times. Any find queries that were queued up also run only after that, even though they have no correlation to the database on which the updates were being done.
If I look at mongo.log, almost nothing is written during the time the shard is frozen. As soon as it is freed up, I see the info log entries for all the queries that ran, with each query taking about 10 times longer than average.
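To put numbers on the freeze window, the log itself can be scanned for silent periods and for the slow entries that show up afterwards. The following is only a minimal sketch, assuming the plain-text 3.4 log layout (an ISO-8601 timestamp at the start of each line and the operation duration as a trailing `<n>ms`). The function names and thresholds are hypothetical, not part of any MongoDB tooling.

```python
import re
import sys
from datetime import datetime

# Assumed line shape (illustrative only):
#   2019-01-01T10:15:30.123+0000 I COMMAND  [conn12] command ... 1234ms
TS_RE = re.compile(r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3})([+-]\d{4}|Z)")
DURATION_RE = re.compile(r"\s(\d+)ms\s*$")


def parse_timestamp(line):
    """Return the line's leading timestamp as an aware datetime, or None."""
    m = TS_RE.match(line)
    if not m:
        return None
    offset = "+0000" if m.group(2) == "Z" else m.group(2)
    return datetime.strptime(m.group(1) + offset, "%Y-%m-%dT%H:%M:%S.%f%z")


def find_log_gaps(lines, min_gap_seconds=5.0):
    """Return (gap_start, gap_end, seconds) for each silent period found."""
    gaps = []
    prev = None
    for line in lines:
        ts = parse_timestamp(line)
        if ts is None:
            continue  # skip lines that do not start with a timestamp
        if prev is not None:
            delta = (ts - prev).total_seconds()
            if delta >= min_gap_seconds:
                gaps.append((prev, ts, delta))
        prev = ts
    return gaps


def find_slow_ops(lines, min_ms=1000):
    """Return (duration_ms, line) for lines whose trailing duration >= min_ms."""
    slow = []
    for line in lines:
        m = DURATION_RE.search(line)
        if m and int(m.group(1)) >= min_ms:
            slow.append((int(m.group(1)), line.rstrip("\n")))
    return slow


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python log_gaps.py /path/to/logfile
    with open(sys.argv[1], errors="replace") as f:
        log_lines = f.readlines()
    for start, end, seconds in find_log_gaps(log_lines):
        print(f"no log output for {seconds:.1f}s: {start.isoformat()} -> {end.isoformat()}")
    for ms, line in sorted(find_slow_ops(log_lines), reverse=True)[:20]:
        print(f"{ms}ms  {line[:200]}")
```

A gap on its own can also mean the node was simply idle, so the gaps are only meaningful when lined up against the times the batch updates/inserts were running. The gap boundaries could then be compared with OS-level metrics collected outside the mongod process for the same window.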
I believe this is an environmental issue with running mongo on CentOS, since this problem was never seen while running on Ubuntu.
Also worth mentioning: we chose to use the same git version of mongo (same release) on both.
I am not sure which db.setLogLevel() trace would show us where mongo is getting stuck.
I have tried turning on various traces, but the log tracing is frozen for the time when mongo is doing that batch update, so I don't get much information about what is going on during that time.
I would appreciate any help or guidance on whether any other forms of tracing can be used to identify the problem.