[SERVER-31903] after repset add member , server become very slow Created: 10/Nov/17  Updated: 26/Jan/18  Resolved: 26/Dec/17

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Question Priority: Major - P3
Reporter: hmy Assignee: Mark Agarunov
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: File 10.3.16.6.tgz, File 10.3.16.61.tgz, File 10.3.16.62.tgz, File 10.3.16.9.tgz, File 10.3.20.144.tgz, File 10.3.20.168.tgz, File 10.3.20.170.tgz, PNG File bottlenecks.png, PNG File event.png, PNG File ftdc.png, PNG File ftdc2.png, PNG File lag2.png, File syslog.2.gz, PNG File wt.png
Participants:

 Description   

I have a MongoDB 3.2.0 replica set running on Ubuntu 14.04.3 LTS.
It has one primary, three secondaries, and one arbiter.
When I add three members, the primary writes more and more entries to the slow query log, and almost all of the slow queries spend their time acquiring locks.
When I remove the new members and restart the primary server, everything is fine again.
The primary's log is attached.
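
One way to check whether slow queries really spend most of their time waiting for locks is to compare the `timeAcquiringMicros` values in a slow-query log line with the total duration at the end of that line. The sketch below assumes the 3.2-style text log layout (a `locks:{ ... }` section and a trailing `NNNms`). The sample line is made up for illustration and is not taken from the attached logs, and `lock_wait_share` is a hypothetical helper name.

```python
import re

# Matches the body of each "timeAcquiringMicros: { r: 123, w: 45 }" group.
ACQUIRE_RE = re.compile(r"timeAcquiringMicros: \{([^}]*)\}")
# Matches the total operation time at the end of the line, e.g. "1520ms".
DURATION_RE = re.compile(r"(\d+)ms\s*$")

# Hypothetical log line, written for this example only.
SAMPLE = (
    "2017-11-10T10:00:00.123+0800 I COMMAND  [conn42] command test.$cmd "
    "command: listCollections { listCollections: 1 } "
    "locks:{ Global: { acquireCount: { r: 2 }, acquireWaitCount: { r: 1 }, "
    "timeAcquiringMicros: { r: 1400000 } } } protocol:op_query 1520ms"
)


def lock_wait_share(line):
    """Return (lock_wait_ms, total_ms) for one log line, or None if no duration."""
    duration = DURATION_RE.search(line)
    if duration is None:
        return None
    total_ms = int(duration.group(1))
    wait_micros = 0
    for group in ACQUIRE_RE.findall(line):
        # Sum every per-mode counter (r, w, R, W) inside the group.
        wait_micros += sum(int(n) for n in re.findall(r":\s*(\d+)", group))
    return wait_micros / 1000.0, total_ms


if __name__ == "__main__":
    print(lock_wait_share(SAMPLE))
```

If the lock wait is close to the total duration on most lines, the operations themselves are cheap and the time goes to lock contention.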



 Comments   
Comment by hmy [ 20/Dec/17 ]

ok, thank you!

Comment by Mark Agarunov [ 01/Dec/17 ]

Hello hmy,

Thank you for the information. I believe this behavior may be caused by a few factors. There appear to be 90 listCollections operations per second, each of which takes a lock and degrades performance. Additionally, mongod appears to be version 3.2.0, which lacks many of the performance improvements and diagnostic enhancements made in later versions. My recommendation would be to upgrade to the latest version of MongoDB 3.2, which is currently 3.2.18, and see if the behavior persists.

Thanks,
Mark
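
For readers who want a rough view of the listCollections rate from the primary's log, the sketch below counts matching lines per one-second bucket. It assumes each log line starts with an ISO 8601 timestamp and contains the text `command: listCollections `. The sample lines are hypothetical, and `commands_per_second` is an illustrative name. Because mongod normally logs only operations that exceed the slow-operation threshold, the counts are a lower bound on the real rate.

```python
from collections import Counter

# Hypothetical log lines, written for this example only.
SAMPLE_LINES = [
    "2017-11-10T10:00:00.100+0800 I COMMAND  [conn1] command test.$cmd "
    "command: listCollections { listCollections: 1 } 120ms",
    "2017-11-10T10:00:00.900+0800 I COMMAND  [conn2] command test.$cmd "
    "command: listCollections { listCollections: 1 } 340ms",
    "2017-11-10T10:00:01.200+0800 I COMMAND  [conn3] command test.$cmd "
    "command: listCollections { listCollections: 1 } 150ms",
    "2017-11-10T10:00:01.300+0800 I COMMAND  [conn4] command test.users "
    "command: find { find: \"users\" } 210ms",
]


def commands_per_second(lines, command="listCollections"):
    """Count log lines that mention `command: <name>` per one-second bucket."""
    per_second = Counter()
    needle = "command: " + command + " "
    for line in lines:
        if needle in line:
            # The first 19 characters of the timestamp are
            # YYYY-MM-DDTHH:MM:SS, i.e. second resolution.
            per_second[line[:19]] += 1
    return per_second


if __name__ == "__main__":
    for second, count in sorted(commands_per_second(SAMPLE_LINES).items()):
        print(second, count)
```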

Comment by hmy [ 13/Nov/17 ]

10.3.16.6 is the primary.

10.3.16.9, 10.3.20.168, and 10.3.16.170 are the new secondary nodes. Attached: 10.3.16.6.tgz, 10.3.16.9.tgz, 10.3.16.61.tgz, 10.3.16.62.tgz, 10.3.20.144.tgz, 10.3.20.168.tgz, 10.3.20.170.tgz

Comment by Mark Agarunov [ 10/Nov/17 ]

Hello hmy,

Thank you for the report. To get a better idea of what may be causing this, could you please provide the following:

  • The complete log files from the secondary that causes this behavior when it is added to the replica set
  • An archive (tar or zip) of the $dbpath/diagnostic.data directory from both the primary and the secondary

This should give some insight into what may be causing this.

Thanks,
Mark
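
The archive requested above can be produced with any tar or zip tool. As a minimal sketch, the Python snippet below packs the `diagnostic.data` directory under a given dbpath into a gzip-compressed tar file. The function name `archive_diagnostic_data` and the example paths are assumptions for illustration; substitute the dbpath your mongod actually uses.

```python
import os
import tarfile


def archive_diagnostic_data(dbpath, out_path):
    """Pack <dbpath>/diagnostic.data into a .tgz file and return its path."""
    src = os.path.join(dbpath, "diagnostic.data")
    if not os.path.isdir(src):
        raise FileNotFoundError(src)
    # Write the archive outside the source directory so it does not
    # end up including itself.
    with tarfile.open(out_path, "w:gz") as tar:
        tar.add(src, arcname="diagnostic.data")
    return out_path


if __name__ == "__main__":
    # Hypothetical paths; adjust to the real dbpath of each node.
    print(archive_diagnostic_data("/var/lib/mongodb", "/tmp/diagnostic.tgz"))
```

Run it once on the primary and once on the secondary, naming each archive after its host so the files can be told apart when attached.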

Comment by hmy [ 10/Nov/17 ]

Even the heartbeats are being logged in the slow query log.

Generated at Thu Feb 08 04:28:33 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.