[SERVER-36877] Primary slows down when a secondary becomes down Created: 27/Aug/18 Updated: 04/Jul/22 Resolved: 29/Aug/18 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Maxim Noskov | Assignee: | Nick Brewer |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Operating System: | ALL |
| Participants: |
| Description |
|
Hello! We have the following replica set: 1 arbiter, 1 primary, and 1 secondary. Versions: primary: mongodb-org-server 3.6.5; secondary: mongodb-org-server 3.6.6; arbiter: mongodb-org-server 3.6.5. If the secondary goes down (is stopped, for example) and becomes "not reachable/healthy", then after some period of time (1 to 2 hours) the primary slows down severalfold: CPU usage on the primary server increases significantly, and the number of available concurrent transactions decreases and becomes unstable. If the secondary comes back up, the primary returns to its normal state almost immediately. Likewise, if we exclude the secondary from the replica set, the primary returns to normal and shows its usual performance. We usually have 1000 to 30000 updates and 1000 to 6000 reads per second. We use only the primaryPreferred read preference and the C# driver to connect to the DB. What is wrong? How can we fix or avoid this unexpected behavior? Thanks in advance! |
| Comments |
| Comment by Tejas Jadhav [ 04/Jul/22 ] | ||
|
Nick,
Any reason why the primary has to store the oplog data in cache? Isn't the oplog data already present on disk? Why not use that? | ||
| Comment by Nick Brewer [ 29/Aug/18 ] | ||
|
ahrenizm Glad to hear it worked. -Nick | ||
| Comment by Maxim Noskov [ 29/Aug/18 ] | ||
|
Thank you! Setting replication.enableMajorityReadConcern: false helped solve the problem. In the future, we will consider replacing the arbiter with a new data-bearing node. | ||
| Comment by Nick Brewer [ 28/Aug/18 ] | ||
|
ahrenizm A few things I want to note about disabling the "majority" read concern:
With regard to your question:
A voting majority is necessary to select a primary; it cannot be accomplished solely via priority. The safest process would be to retain the existing arbiter, add a new data-bearing member, and then remove the arbiter once you've confirmed that the new member has been added. -Nick | ||
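A minimal mongo-shell sketch of the process described above, run against the primary; the hostnames are placeholders, not taken from the ticket:

// 1. While the arbiter is still in place, add the new data-bearing member.
rs.add("newmember.example.net:27017")

// 2. Confirm the new member is healthy and has reached SECONDARY state.
rs.status().members.forEach(function (m) { print(m.name, m.stateStr, m.health) })

// 3. Only then remove the arbiter.
rs.remove("arbiter.example.net:27017")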
| Comment by Nick Brewer [ 27/Aug/18 ] | ||
|
ahrenizm Another option: with your current PSA setup, you should be able to prevent this behavior by disabling read concern majority in your mongod configuration file(s):
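The configuration snippet itself did not survive this export; based on the reporter's confirmation above, the setting in question is most likely the following replication option in the mongod YAML configuration file. It is a startup setting, so the mongod process would need a restart for it to take effect:

replication:
  enableMajorityReadConcern: false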
-Nick | ||
| Comment by Maxim Noskov [ 27/Aug/18 ] | ||
|
The answer to your question: | ||
| Comment by Nick Brewer [ 27/Aug/18 ] | ||
|
ahrenizm MongoDB 3.6 enables the majority read concern automatically. This majority is calculated against the total number of nodes in your replica set (3); however, it can only be serviced by nodes that contain data (2, in your case). With your secondary down, you have only a single data-bearing node, so your replica set is not able to satisfy the read concern. When your replica set is not able to satisfy the read concern, your primary is required to store oplog data in the cache, which can result in the sort of resource utilization you're seeing. I would suggest:
One question: when you remove the secondary from the replica set, are you keeping the arbiter as a member of it? Edit: To be able to satisfy a possible incoming query with read concern majority, the primary is forced to store all updates that occurred since the last write acknowledged by a majority of data-bearing nodes (written to both the primary and secondary, in this case) in the cache. Under typical operation, the majority commit point moves along as new data is inserted, and older versions of the data may be successfully evicted. However, in this case, the secondary went down, which meant the majority commit point could not progress. This resulted in performance degradation due to increased cache pressure as the primary continued to take writes without being able to evict older versions of the data. -Nick |
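For readers hitting the same symptom, a rough way to observe the stuck majority commit point and the resulting cache pressure from the mongo shell; the fields are the standard rs.status() and serverStatus() ones, and none of the values shown are from this ticket:

// A growing gap between these two optimes while a data-bearing member is down
// means the majority commit point is not advancing.
var ot = rs.status().optimes
printjson(ot.appliedOpTime)
printjson(ot.readConcernMajorityOpTime)

// Cache pressure on the primary shows up in the WiredTiger cache statistics.
var c = db.serverStatus().wiredTiger.cache
print("bytes currently in the cache: " + c["bytes currently in the cache"])
print("tracked dirty bytes in the cache: " + c["tracked dirty bytes in the cache"])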