Core Server / SERVER-21838

Replica set with mixed compression types slower than expected

    • Type: Bug
    • Resolution: Done
    • Priority: Major - P3
    • Fix Version/s: None
    • Affects Version/s: 3.2.0
    • Component/s: WiredTiger
    • Labels: None
    • Operating System: ALL

      Create a replica set with the primary using zlib and the secondary using snappy. Check the CPU usage and the performance characteristics vs zlib/zlib and snappy/snappy.

      zlib/snappy appears to be roughly 25% slower than zlib/zlib, with the CPU running at only around 50% when zlib/snappy is configured.

      [Nick J workload]
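
      As a sketch of how the mixed-compression repro might be set up (hostnames, ports, and dbpaths below are placeholders; the compressor flags are the standard mongod WiredTiger options):

        # primary: zlib for both collection data and the journal
        mongod --replSet rs0 --dbpath /data/primary --port 27017 \
            --wiredTigerCollectionBlockCompressor zlib \
            --wiredTigerJournalCompressor zlib

        # secondary (separate host): snappy for both
        mongod --replSet rs0 --dbpath /data/secondary --port 27017 \
            --wiredTigerCollectionBlockCompressor snappy \
            --wiredTigerJournalCompressor snappy

      Repeat with zlib/zlib and snappy/snappy on both nodes for the comparison runs.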


      I'm playing with replica sets on 3.2. I have the following topology:

      1 x i3770 with SSD [primary]
      1 x Intel NUC with SSD [secondary]
      1 x i5960 with SSD [arbiter]

      .NET application using the C# driver (2.2).

      Scenario 1 - No replica set:

      My application running on a separate box connects to the primary (with no replica set configured) and has a total throughput of X.

      Scenario 2 - 1 node replica set (on the primary):

      Same two boxes as above, but mongod on the primary is started with --replSet. Total throughput is X - 15%. This makes sense as there is extra CPU and disk IO required.
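
      For context, a minimal single-node replica set of this kind would typically be started and initiated along these lines (port and dbpath are placeholders):

        mongod --replSet rs0 --dbpath /data/db --port 27017
        mongo --port 27017 --eval 'rs.initiate()'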

      Scenario 3 - 2-node replica set with arbiter:

      This time I've configured a standard replica set. Throughput drops to X - 55%.
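
      A sketch of how such a set could be assembled from the shell, assuming the hypothetical hostnames primary-host, nuc-host, and arbiter-host:

        # initiate on the intended primary, then add the secondary and the arbiter
        mongo --host primary-host --eval 'rs.initiate({_id: "rs0", members: [{_id: 0, host: "primary-host:27017"}]})'
        mongo --host primary-host --eval 'rs.add("nuc-host:27017")'
        mongo --host primary-host --eval 'rs.addArb("arbiter-host:27017")'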

      I can see the NUC (which is very weak) is CPU-bound, and the other two boxes are barely breaking a sweat. My understanding was that replication was asynchronous and that the replication from primary to secondary would not/should not slow down writing to the primary (at least not by such a large amount). As best I can tell I don't have the write concern set to majority (unless that is the default for a cluster).
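
      For what it's worth, the effective write concern can be checked in two places, assuming a typical driver connection string; in 3.2 the default is w:1 unless the URI or the replica set's getLastErrorDefaults overrides it:

        # driver connection string: no w=majority here means the default w:1 applies
        # mongodb://primary-host:27017/?replicaSet=rs0

        # replica-set-wide default write concern, from the shell
        mongo --host primary-host --eval 'printjson(rs.conf().settings.getLastErrorDefaults)'
        # default output: { "w" : 1, "wtimeout" : 0 }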

      I noticed that on my primary I was using zlib compression for both journal and collection (primary has a smaller SSD) and was using snappy for the replica.

      I tried using snappy on both and performance jumped up more than expected, and CPU on the primary popped up to 100%.

      I also tried zlib on both primary and secondary, and this showed better performance than the mixed configuration.
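
      A quick way to confirm which compressors each node was actually started with, and what an individual collection is using (database and collection names below are placeholders):

        # startup options as parsed by the server, including the WiredTiger compressor settings
        mongo --host primary-host --eval 'printjson(db.adminCommand({getCmdLineOpts: 1}).parsed.storage)'

        # per-collection block compressor, e.g. "block_compressor=zlib" in the creation string
        mongo --host primary-host --eval 'print(db.getSiblingDB("test").mycoll.stats().wiredTiger.creationString)'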

      Attachments:

        1. AddNodePerfDrop.png (45 kB)
        2. ARB_metrics.2015-12-28T17-11-40Z-00000 (45 kB)
        3. PRIMARY_metrics.2015-12-28T17-06-59Z-00000 (337 kB)
        4. replica_status.png (19 kB)
        5. SECONDARY_metrics.2015-12-28T17-08-02Z-00000 (384 kB)
        6. SINGLENODE_metrics.2015-12-28T17-36-12Z-00000 (146 kB)
        7. snappy_singlenode.png (32 kB)
        8. snappy_snappy.png (35 kB)
        9. zlib_snappy_cpu_primary.png (61 kB)
        10. zlib_snappy_secondary.metrics.2015-12-10T04-35-13Z-00000 (734 kB)
        11. zlib_snappy.metrics.2015-12-10T04-37-19Z-00000 (473 kB)
        12. zlib_snappy.png (48 kB)
        13. zlib_zlib_cpu_primary.png (39 kB)
        14. zlib_zlib.metrics.2015-12-10T05-26-01Z-00000 (273 kB)
        15. zlib_zlib.png (40 kB)

            Assignee: Unassigned
            Reporter: Nick Judson (nick@innsenroute.com)
            Votes: 0
            Watchers: 6

            Created:
            Updated:
            Resolved: