[SERVER-26040] High CPU/IOWAIT in MongoDB 3.2.9 Created: 09/Sep/16  Updated: 12/Dec/16  Resolved: 12/Dec/16

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: 3.2.9
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Emil Burzo Assignee: Kelsey Schubert
Resolution: Done Votes: 2
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: File diagnostic.data.tar     PNG File disk_utilization.png     PNG File hdd1-cpu-week.png     PNG File hdd1-diskstats_utilization-week.png     PNG File ssd1-cpu-week.png     PNG File ssd1-cpuspeed-week.png     PNG File ssd2-cpu-week.png     PNG File ssd2-cpuspeed-week.png    
Issue Links:
Related
related to SERVER-25249 High I/O load in secondary nodes with... Closed
related to SERVER-25700 Very high CPU usage Closed
Operating System: ALL
Participants:

 Description   

After upgrading a replica set to 3.2.9, I am seeing:

  • increased CPU usage on the two hosts that have SSDs
  • increased CPU and IOWAIT on the host that has regular HDDs (in software RAID1, if it makes any difference)

The workload hasn't changed before/after the upgrade (roughly 40-60 inserts / 1-2 queries / 0 updates / 0 deletes per second).
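
(For anyone comparing, those rates can be double-checked from the mongo shell – the opcounters in serverStatus are cumulative, so sampling them twice gives ops/sec; mongostat reports the same numbers directly. A rough sketch:)

var a = db.serverStatus().opcounters;
sleep(10 * 1000);                        // sample again after 10 seconds
var b = db.serverStatus().opcounters;
print("inserts/s: " + (b.insert - a.insert) / 10);
print("queries/s: " + (b.query - a.query) / 10);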

The configuration also hasn't changed; I was already running WiredTiger before the upgrade.

The storage section is configured as follows:

storage:
  dbPath: /var/lib/mongodb
  directoryPerDB: true
  journal:
    enabled: true
  engine: "wiredTiger"
  wiredTiger:
    engineConfig:
      journalCompressor: zlib
    collectionConfig:
      blockCompressor: zlib
    indexConfig:
      prefixCompression: true
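
(A quick sanity check that the running mongod actually picked these options up – getCmdLineOpts and serverStatus are standard admin/shell commands, so this should work on any 3.x build:)

db.adminCommand({ getCmdLineOpts: 1 }).parsed.storage   // parsed storage section, including the wiredTiger block
db.serverStatus().storageEngine.name                    // should report "wiredTiger"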

I've attached diagnostic.data as requested and some munin charts that show the difference before/after.

Seemingly related issues: SERVER-25249 and SERVER-25700.

Let me know if there's anything else I can do to help.



 Comments   
Comment by Kelsey Schubert [ 12/Dec/16 ]

Hi emilburzo,

Your responses indicate that the WT journal flushing is working as designed. As previously discussed, WT journal flushing in MongoDB 3.2 does do more IO than in 3.0. Please open a new ticket if you are experiencing performance issues (e.g. slow queries) as a result of this behavior.

Kind regards,
Thomas

Comment by Emil Burzo [ 28/Sep/16 ]

It's a bare metal server from online.net

  • Dell PowerEdge R210 II
  • Ubuntu Xenial
  • software RAID1 on 2 x Seagate Constellation ES.3 (ST1000NM0033-9ZM173)
  • ext4 filesystem
Comment by Kelsey Schubert [ 25/Sep/16 ]

Hi emilburzo,

Thank you for the update. We have not observed that same impact on hard drives in our testing. To continue to investigate, would you please provide additional details about your environment? Please include a detailed description of your virtualization, filesystem, and storage layer.

Thanks again,
Thomas

Comment by Emil Burzo [ 16/Sep/16 ]

I should have tried point 4 before all the others...

Disabling the journal did it, so I guess this is a duplicate of SERVER-25249.

I do agree with the submitter, though: there should be a more relaxed flush-interval setting for less critical workloads.

Is there any way to accomplish that currently?

Hard drives are +5°C hotter with the current 50ms flush interval.
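
(For reference, the journal can be switched off for a test like this with a one-line change against the config in the description, or with the --nojournal startup flag. It trades crash durability for less flush I/O, so it is a diagnostic step only:)

storage:
  journal:
    enabled: false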

Comment by Emil Burzo [ 15/Sep/16 ]

Hey Thomas,

How are you generating your load to run git bisect? Are you able to provide a reproduction of this workload that we can use to benchmark?

Better steps to reproduce:

I have tested the following on a (mostly) idle machine running Ubuntu 16.04 with software RAID1 on regular 7k rpm consumer drives.

Downloaded the following MongoDB versions:

mongodb-linux-x86_64-ubuntu1604-3.2.9
mongodb-linux-x86_64-ubuntu1404-3.0.12

Unpacked the archive, switched into each version's directory.

Made some ad-hoc data dirs:

mkdir data{1..3}

Started three instances:

bin/mongod --port 17170 --dbpath data1 --replSet iotest -f ~/mongo.conf
bin/mongod --port 17171 --dbpath data2 --replSet iotest -f ~/mongo.conf
bin/mongod --port 17172 --dbpath data3 --replSet iotest -f ~/mongo.conf

Contents of mongo.conf:

storage:
  dbPath: /var/lib/mongodb
  directoryPerDB: true
  journal:
    enabled: true
  engine: "wiredTiger"
  wiredTiger:
    engineConfig:
      journalCompressor: zlib
    collectionConfig:
      blockCompressor: zlib
    indexConfig:
      prefixCompression: true

Initialized the replica set:

bin/mongo localhost:17170
 
rs.initiate()
rs.add("localhost:17171")
rs.add("localhost:17172")

Opened top in another terminal and eyeballed the load average and the "x.x wa" (iowait) field.
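
(To tie the extra I/O to the journal specifically, the WiredTiger log counters in serverStatus can be sampled alongside top – a rough sketch; the exact stat names below may differ slightly between versions:)

var log = db.serverStatus().wiredTiger.log;
print("journal bytes written:   " + log["log bytes written"]);
print("journal sync operations: " + log["log sync operations"]);
print("journal sync time (us):  " + log["log sync time duration (usecs)"]);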

Ran the following insert loop in the mongo shell:

for (i = 0; i < 1000000; i++) {
	// insert a small document with 11 random numeric fields
	db.test.insert({
		field1: Math.random(),
		field2: Math.random(),
		field3: Math.random(),
		field4: Math.random(),
		field5: Math.random(),
		field6: Math.random(),
		field7: Math.random(),
		field8: Math.random(),
		field9: Math.random(),
		field10: Math.random(),
		field11: Math.random()
	});

	// print progress and throttle the load: pause 750 ms every 100 inserts
	if (i % 100 == 0) {
		print(i);
		sleep(750);
	}
}

Left it running until it reached 80,000 inserts; the load average crept up and iowait was higher.

Specific numbers on the (mostly) idle server I tried:

3.2.9:

  • 1.5-2.0 load average
  • >5.0 wa

3.0.12:

  • 0.2-0.6 load average
  • <2.0 wa

Here's a munin graph: http://i.imgur.com/3DhiLyK.png

Hope that helps.

My understanding is that you are not seeing any performance impact between versions, is that correct?

Between 3.2.0 and 3.2.9, no, nothing really major – although I haven't left them running for very long, so it's possible.

But not as major as the difference between 3.0.12 and 3.2.0.

Are you observing this behavior on primaries or secondaries?

I'm observing this behaviour on both primaries and secondaries.

But 2 out of 3 servers are on SSDs, so the issue is not as visible.

Is journaling enabled? Are you journaling to the same disk as your $dbpath?

Yes and yes.

I'm going to try and disable the journal to see if it makes any difference.

Comment by Kelsey Schubert [ 13/Sep/16 ]

Hi emilburzo,

Thank you for opening this ticket and looking into it in more depth over the weekend. We're still investigating this issue; to help, would you please answer the following questions?

  1. How are you generating your load to run git bisect? Are you able to provide a reproduction of this workload that we can use to benchmark?
  2. My understanding is that you are not seeing any performance impact between versions, is that correct?
  3. Are you observing this behavior on primaries or secondaries?
  4. Is journaling enabled? Are you journaling to the same disk as your $dbpath?

Please be aware that we expect there to be a higher I/O load on secondary nodes in 3.2 as a result of how we sync the journal. To avoid falling behind, secondaries will self-throttle. For additional details about this behavior, please see SERVER-25249.
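
(To check whether the secondaries are in fact keeping up despite the extra journal I/O, replication lag can be read from the shell with the standard helpers:)

rs.printSlaveReplicationInfo()   // per-member lag behind the primary
rs.status().members.forEach(function (m) {
    print(m.name + "  " + m.stateStr + "  " + m.optimeDate);
});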

Thank you,
Thomas

Comment by Emil Burzo [ 11/Sep/16 ]

I've spent some time trying to further isolate the issue.

The 3.2.x releases show varying degrees of iowait%, some bigger, some smaller.

The biggest difference is between 3.0.12 (good) and 3.2.0 (bad).

Running with 3.0.12 shows an iowait% of zero, as it was before upgrading.

I've also tried running git bisect to pinpoint the exact commit where the regression was introduced, but I couldn't test properly since I get stack traces when building from commits between r3.0.12 and r3.2.0 – presumably because the code is not in a clean state at those points.

If you can create some stable builds, I can try them out on one of the secondaries in the replica set; the issue is reproducible, and there's a visible difference in iowait% between them.

Or if you have other suggestions I'm happy to help.
