[SERVER-22730] MongoDB becomes really slow after changes on Replica Set
Created: 18/Feb/16  Updated: 05/Mar/16  Resolved: 05/Mar/16

Status: Closed
Project: Core Server
Component/s: Admin, Replication
Affects Version/s: 3.2.1
Fix Version/s: None

Type: Bug Priority: Critical - P2
Reporter: Ricardo Hilsenrath Assignee: Unassigned
Resolution: Incomplete Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Operating System: ALL
Steps To Reproduce:

I'm not sure, but I think that an unavailable member puts extra load on the primary server, which causes a huge drop in the primary's performance.

Participants:

 Description   

I have the following Replica Set

Server/Member 1 - Primary
Server/Member 2 - Secondary (can become primary)
Server/Member 3 - Secondary (cannot become primary, just using for backup purposes)
Arbiter 1 - not in the replica set; I only use it when I need to remove one of the members from the replica set

After my primary member crashed (bug reported: SERVER-22617), I needed to shut down member 3 to copy all of its files and restore my primary server without an initial sync (the oplog window wasn't long enough; the server was down for 12h+).

I followed these steps:

  • remove member 1 from the replica set (member 2 had been the primary server for the past 12 hours)
  • add arbiter 1 to the replica set
  • shut down member 3
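
The three steps above can be sketched as changes to the replica set configuration document. This is an illustration only, not the reporter's actual setup: the host names, `_id` values, and the helper names `remove_member`, `add_arbiter`, and `has_majority` are hypothetical, and it assumes member 3 is a priority-0 member that still votes. The code only models a document shaped like the output of `rs.conf()`; on a live set the corresponding shell helpers would be `rs.remove()` and `rs.addArb()`.

```python
import copy

def remove_member(config, host):
    """Return a copy of config without the member at `host` (models rs.remove)."""
    new_config = copy.deepcopy(config)
    new_config["members"] = [m for m in new_config["members"] if m["host"] != host]
    new_config["version"] += 1  # a reconfiguration needs a higher version number
    return new_config

def add_arbiter(config, host):
    """Return a copy of config with an arbiter at `host` appended (models rs.addArb)."""
    new_config = copy.deepcopy(config)
    next_id = max(m["_id"] for m in new_config["members"]) + 1
    new_config["members"].append({"_id": next_id, "host": host, "arbiterOnly": True})
    new_config["version"] += 1
    return new_config

def has_majority(config, down_hosts):
    """True if the voting members that are still up form a strict majority."""
    voters = [m for m in config["members"] if m.get("votes", 1) > 0]
    up = [m for m in voters if m["host"] not in down_hosts]
    return len(up) >= len(voters) // 2 + 1

# Hypothetical starting configuration: three data-bearing members.
original = {
    "_id": "rs0",
    "version": 1,
    "members": [
        {"_id": 0, "host": "member1:27017"},
        {"_id": 1, "host": "member2:27017"},
        {"_id": 2, "host": "member3:27017", "priority": 0},  # backup only
    ],
}

without_member1 = remove_member(original, "member1:27017")
with_arbiter = add_arbiter(without_member1, "arbiter1:27017")
member3_down = {"member3:27017"}

print(has_majority(without_member1, member3_down))  # False
print(has_majority(with_arbiter, member3_down))     # True
```

Under these assumptions, adding the arbiter before shutting down member 3 is what keeps a voting majority, so member 2 can stay primary. It does not by itself explain the slowness described below.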

After 1 or 2 minutes, my primary server became REALLY slow. I started member 3 again, but that didn't fix the slowness. I waited 5 minutes after member 3 rejoined the replica set to see whether performance would return to normal, but only restarting the service on member 2 (the primary) fixed the slowness.

If you need any logs or other information, I'll be happy to upload them for you, if possible to a private location.



 Comments   
Comment by Ramon Fernandez Marina [ 05/Mar/16 ]

ricardo_fanatee, without logs or the contents of the diagnostic.data directory there's not much we can do to investigate this issue, so I'm going to close this ticket for the time being. If you manage to either reproduce the problem or gather useful information, please send it via the private upload portal and we'll reopen the ticket to investigate further.

Thanks,
Ramón.

Comment by Ricardo Hilsenrath [ 24/Feb/16 ]

Thomas,

I'm not able to easily reproduce this issue, since I need to use my secondary server as primary in my production environment.

It may take some time to gather all this data from my production servers.

Comment by Kelsey Schubert [ 22/Feb/16 ]

Hi ricardo_fanatee,

Thanks for opening this ticket. To continue to investigate this behavior, can you provide the following information?

  1. the output of db.currentOp() on your primary server when you are observing this issue
  2. the logs of each member of the replica set during this issue
  3. diagnostic.data from the dbpath of the primary
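
For item 1, a captured `db.currentOp()` document can be narrowed down to the operations most likely to matter before uploading or reading it. The snippet below is a minimal sketch under stated assumptions: the helper name `slow_operations`, the 5-second threshold, and the sample data are hypothetical, and it relies only on the `inprog`, `active`, `secs_running`, `opid`, `op`, and `ns` fields of the `db.currentOp()` output.

```python
def slow_operations(current_op, min_seconds=5):
    """Return active operations running for at least min_seconds, slowest first."""
    ops = [
        op for op in current_op.get("inprog", [])
        if op.get("active") and op.get("secs_running", 0) >= min_seconds
    ]
    return sorted(ops, key=lambda op: op["secs_running"], reverse=True)

# Hypothetical sample shaped like the output of db.currentOp().
sample = {
    "inprog": [
        {"opid": 101, "active": True, "secs_running": 42, "op": "query", "ns": "app.users"},
        {"opid": 102, "active": True, "secs_running": 1, "op": "insert", "ns": "app.events"},
        {"opid": 103, "active": False, "op": "none", "ns": ""},
        {"opid": 104, "active": True, "secs_running": 7, "op": "update", "ns": "app.users"},
    ]
}

for op in slow_operations(sample):
    print(op["opid"], op["op"], op["ns"], op["secs_running"])
```

The full, unfiltered output is still the thing to upload; a summary like this only helps to spot long-running operations at a glance.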

I have created a secure upload portal here. As before, the files you upload will only be visible to MongoDB employees investigating this issue.

Kind regards,
Thomas

Comment by Ricardo Hilsenrath [ 18/Feb/16 ]

I just tried turning off a member again, but this time I didn't add an arbiter and didn't remove a server. I just shut down the secondary server and the same slowness happened. Restarting the mongod service on the primary member fixed the issue.

This time I didn't start the server I had shut down before restarting the primary member, so my replica set is now working normally with an unavailable member.

Generated at Thu Feb 08 04:01:16 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.