- Type: Bug
- Resolution: Community Answered
- Priority: Major - P3
- None
- Affects Version/s: None
- Component/s: Performance, Replication
- Labels: None
- Fully Compatible
- ALL
TL;DR:
Powering off one member of a MongoDB shard causes the CPU usage of the remaining members to rise to 100%.
Background:
I want to deploy a MongoDB cluster on several ESX hosts. The cluster has to survive the shutdown of two components.
Cluster Architecture (MongoDB 4.2):
- 5 config servers
- 3 query servers
- shard01:
  - primary
  - 2 secondary
  - 2 arbiter
- shard02:
  - primary
  - 2 secondary
  - 2 arbiter
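As a rough illustration of the layout above, the replica set configuration for shard01 might look like the document below. This is only a sketch: the host names and the port are hypothetical placeholders, and only the member roles (3 data-bearing members, 2 arbiters) come from the description. The helper `summarizeMembers` is also just illustrative.

```javascript
// Sketch of a replica set config for shard01, in the shape rs.initiate() accepts.
// Host names and port 27018 are hypothetical; only the roles come from the report.
// The primary is not set here: it is elected among the data-bearing members.
const shard01Config = {
  _id: "shard01",
  members: [
    { _id: 0, host: "shard01-a.example.net:27018" },                      // data-bearing
    { _id: 1, host: "shard01-b.example.net:27018" },                      // data-bearing
    { _id: 2, host: "shard01-c.example.net:27018" },                      // data-bearing
    { _id: 3, host: "shard01-arb1.example.net:27018", arbiterOnly: true }, // arbiter
    { _id: 4, host: "shard01-arb2.example.net:27018", arbiterOnly: true }  // arbiter
  ]
};

// Count data-bearing members and arbiters in a config document.
function summarizeMembers(config) {
  const arbiters = config.members.filter(m => m.arbiterOnly === true).length;
  return {
    total: config.members.length,
    dataBearing: config.members.length - arbiters,
    arbiters: arbiters
  };
}

const shard01Summary = summarizeMembers(shard01Config);
```

shard02 would have the same shape with its own `_id` and hosts.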
The problem:
Whenever I test HA by taking one of the members offline, I notice that after several minutes the CPU usage of the remaining members spikes to 100% and stays there until I bring the missing member back.
Tests I have conducted:
- shut down 1 replica -> CPU of the remaining members rises to 100%
- shut down 1 replica and 1 arbiter -> CPU of the remaining members rises to 100%
- shut down 1 arbiter -> remaining members are OK
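To make the three scenarios easier to compare, here is a small sketch that tallies what stays up in each case. It assumes every member has one vote and that "replica" means a data-bearing member; the function and constant names are made up for illustration.

```javascript
// Roles per shard as described in the report: 3 data-bearing members + 2 arbiters.
// Assumption: every member has exactly one vote.
const DATA_BEARING = 3;
const ARBITERS = 2;
const VOTING = DATA_BEARING + ARBITERS;
// A majority of voting members means more than half of them.
const MAJORITY = Math.floor(VOTING / 2) + 1;

// For a given number of stopped members, compute what remains up.
function remaining(downDataBearing, downArbiters) {
  const dataUp = DATA_BEARING - downDataBearing;
  const votesUp = VOTING - downDataBearing - downArbiters;
  return {
    dataUp: dataUp,
    votesUp: votesUp,
    canElectPrimary: votesUp >= MAJORITY,     // enough votes for an election
    dataBearingMajority: dataUp >= MAJORITY   // enough data-bearing members to form a majority
  };
}

const scenarios = {
  oneReplicaDown: remaining(1, 0),
  oneReplicaAndOneArbiterDown: remaining(1, 1),
  oneArbiterDown: remaining(0, 1)
};
```

Under these assumptions, all three scenarios can still elect a primary, but the two scenarios where CPU rose are the ones where only 2 data-bearing members remain, which is below the voting majority of 3. This is arithmetic on the reported topology and may or may not be related to the CPU behaviour.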
Things I have already checked:
- On the affected VMs, mongod is the process consuming most of the CPU (99%).
- I checked mongod for long-running queries with db.currentOp(). Everything looks fine.
- mongod.log does not contain any suspicious entries.
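For reference, the long-running query check can be sketched roughly as below. The helper `longRunningOps` and the 5-second threshold are illustrative choices, not something from the original test; the sample data is hand-written to show the shape of the `inprog` array and is not real output from the affected cluster.

```javascript
// Return active operations that have been running for at least minSecs seconds.
// `inprog` is the array found in the document returned by db.currentOp().
function longRunningOps(inprog, minSecs) {
  return inprog
    .filter(op => op.active === true && (op.secs_running || 0) >= minSecs)
    .map(op => ({ opid: op.opid, op: op.op, ns: op.ns, secs_running: op.secs_running }));
}

// In the mongo shell this could be called as:
//   longRunningOps(db.currentOp().inprog, 5)

// Hand-written sample data (not real output), only to show the shape:
const sampleInprog = [
  { opid: 101, active: true, op: "query", ns: "test.orders", secs_running: 42 },
  { opid: 102, active: true, op: "insert", ns: "test.orders", secs_running: 0 },
  { opid: 103, active: false, op: "none", ns: "" }
];
const slowOps = longRunningOps(sampleInprog, 5);
```

An empty result from the real `db.currentOp().inprog` would match the "everything looks fine" observation above.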
Bottom line:
I posted the problem on [Stack Overflow|https://stackoverflow.com/questions/59491006/why-when-one-of-mongodb-replica-set-shard-members-goes-offline-the-others-cpus] and was advised to report it as a bug.
Regards,
Aric