[SERVER-29245] MongoDB high CPU usage during replicaset initial sync
Created: 17/May/17  Updated: 29/Jan/18  Resolved: 22/Jun/17

Status: Closed
Project: Core Server
Component/s: Replication, WiredTiger
Affects Version/s: 3.0.14, 3.2.13
Fix Version/s: None

Type: Question
Priority: Major - P3
Reporter: Observant
Assignee: Mark Agarunov
Resolution: Incomplete
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: PNG File 001.png    
Participants:

 Description   

We are upgrading our MongoDB setup from version 3.0.14 to 3.2.13.
We run a two-node replica set on EC2 m3.large instances using Amazon Linux.

CPU usage on the primary jumps to ~100% and stays there when a new node (version 3.2.13) is added to the replica set and initial sync starts.

Both versions use WiredTiger as the storage engine.

The same pattern is observed on the existing secondary (v3.0.14) when it is used as the initial sync source. (By the way, that is not easy to set up properly.)

We didn't experience this while upgrading from 2.6.x to 3.0.x.

Is this expected behaviour?
Would it still be an issue when upgrading from 3.2.x to 3.4.x?



 Comments   
Comment by Mark Agarunov [ 22/Jun/17 ]

Hello mongodb@observant.net,

We haven’t heard back from you for some time, so I’m going to mark this ticket as resolved. If this is still an issue for you, please provide additional information and we will reopen the ticket.

Thanks,
Mark

Comment by Kelsey Schubert [ 09/Jun/17 ]

Hi mongodb@observant.net,

We still need additional information to diagnose the problem. If this is still an issue for you, would you please provide the information Mark requested?

Thank you,
Thomas

Comment by Mark Agarunov [ 19/May/17 ]

Hello mongodb@observant.net,

Thank you for the report. To better investigate this behavior, please run the following commands on nodes running MongoDB version 3.0.x or older and provide the ss.log and iostat.log files that are created:

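# Sampling interval in seconds
delay=1
# Log serverStatus (including tcmalloc statistics) to ss.log once per interval
mongo --eval "while(true) {print(JSON.stringify(db.serverStatus({tcmalloc:true}))); sleep(1000*${delay:?})}" >ss.log &
# Log extended, timestamped disk statistics to iostat.log once per interval
iostat -k -t -x ${delay:?} >iostat.log &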

Please leave this running until the issue happens again so that there is a complete log.

For nodes that are running a version of MongoDB newer than 3.0.x, please archive and upload the $dbpath/diagnostic.data directory.
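
The archiving step could look like the sketch below. This is only an illustration: the function name archive_diagnostic_data and the default archive name diagnostic.data.tar.gz are hypothetical, and the dbpath must be replaced with the value configured for your mongod.

```shell
# Hypothetical helper: pack the diagnostic.data directory found under a
# given dbpath into a gzip-compressed tar archive ready for upload.
archive_diagnostic_data() {
  dbpath="$1"                            # mongod dbpath (supplied by the caller)
  out="${2:-diagnostic.data.tar.gz}"     # archive name (assumed default)

  if [ ! -d "$dbpath/diagnostic.data" ]; then
    echo "no diagnostic.data directory under $dbpath" >&2
    return 1
  fi

  # -C switches into the dbpath so the archive holds only diagnostic.data/
  tar -czf "$out" -C "$dbpath" diagnostic.data
}
```

For example, if the dbpath were /var/lib/mongo (an assumption; check storage.dbPath in your configuration), running archive_diagnostic_data /var/lib/mongo would write diagnostic.data.tar.gz to the current directory.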

Thanks,
Mark

Comment by Observant [ 17/May/17 ]

It gets even worse.
MMS just reported the secondary instance as down for 2 minutes.
It looks to be a false positive, though.
