[SERVER-16131] Log File Blowing up on sharded ycsb run in EC2 (or inserts too slow) Created: 13/Nov/14  Updated: 14/Apr/16  Resolved: 17/Feb/15

Status: Closed
Project: Core Server
Component/s: Logging
Affects Version/s: 2.8.0-rc0
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: David Daly Assignee: David Daly
Resolution: Done Votes: 0
Labels: 28qa
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
related to SERVER-16152 WiredTiger out of disk space results ... Closed
is related to SERVER-16773 Performance degradation due to TCMall... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Steps To Reproduce:

Two-shard configuration, with the ycsb database sharded but not pre-split. Run the load phase for 200M documents. The log file on the shard 0 primary starts to grow almost immediately.

Participants:

 Description   

Running ycsb against a two-shard cluster, each shard a 3-node replica set on EC2 running 2.8.0-rc0 with mmapv1. The load phase inserts 200M documents. The log file grows by about 9 GB/hr, mostly from slow-insert log statements. Running 8 mongos, with ycsb load generation co-located with each mongos.

The mongods run on c3.4xlarge EC2 instances. Data is on a local SSD; the log file goes to a different local SSD.

Increasing slowms to 200 (the default is 100 ms) resolves the problem.
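
For reference, the threshold can be raised at startup with `mongod --slowms 200`, or at runtime with the `profile` command. Below is a minimal Python sketch of the runtime route. It assumes pymongo is installed; the URI and the `ycsb` database name are hypothetical, and the helper names are made up for illustration. Note that sending `profile: 0` also turns the profiler off for that database, and the command would need to be run against each mongod rather than through mongos.

```python
def slowms_command(slowms_ms, profiling_level=0):
    """Build the `profile` command document that sets the slow-op threshold.

    The command name must be the first key; Python 3.7+ dicts keep
    insertion order, so a plain dict is sufficient here.
    """
    if slowms_ms < 0:
        raise ValueError("slowms must be non-negative")
    return {"profile": profiling_level, "slowms": slowms_ms}


def apply_slowms(mongod_uri, slowms_ms, db_name="ycsb"):
    """Send the command to one mongod (hypothetical helper, needs pymongo)."""
    # Imported lazily so the builder above works without pymongo installed.
    from pymongo import MongoClient

    client = MongoClient(mongod_uri)
    try:
        return client[db_name].command(slowms_command(slowms_ms))
    finally:
        client.close()


# Example (hypothetical host name):
# apply_slowms("mongodb://shard0-primary.example:27017", 200)
```

The runtime change does not persist across a mongod restart, so the startup option is the safer choice for a long load phase.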



 Comments   
Comment by David Daly [ 17/Feb/15 ]

This has gone away now. Insert performance has been fixed. It was related to SERVER-16773.

Comment by Daniel Pasette (Inactive) [ 18/Nov/14 ]

David, it's still unclear to me if this is being blamed on a slowdown in mongos or in mongod insert perf.

Comment by David Daly [ 17/Nov/14 ]

Some more data.
My benchmark clients are seeing an average insert latency of ~26 ms with 2.6.5 and ~86 ms with 2.8.0-rc0, with a matching decrease in throughput: 2.8.0-rc0 (mmapv1) gets ~369 inserts/sec per client (2952 overall), while 2.6.5 gets ~1240 inserts/sec per client (9920 overall). So in this case 2.8.0-rc0 is about 3.3x slower than 2.6.5.
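
The figures above can be cross-checked with a few lines of Python. This is only a sanity check on the reported numbers; the client count of 8 is an assumption taken from the 8 mongos with co-located ycsb clients in the description.

```python
CLIENTS = 8  # assumption: one ycsb client per mongos, 8 mongos in total

# Reported averages from the comment above.
LATENCY_MS = {"2.6.5": 26, "2.8.0-rc0": 86}
INSERTS_PER_SEC_PER_CLIENT = {"2.6.5": 1240, "2.8.0-rc0": 369}


def overall_throughput(version, clients=CLIENTS):
    """Aggregate inserts/sec across all clients for one server version."""
    return INSERTS_PER_SEC_PER_CLIENT[version] * clients


# How many times slower 2.8.0-rc0 is, measured two ways.
latency_ratio = LATENCY_MS["2.8.0-rc0"] / LATENCY_MS["2.6.5"]
throughput_ratio = overall_throughput("2.6.5") / overall_throughput("2.8.0-rc0")

print(overall_throughput("2.8.0-rc0"), overall_throughput("2.6.5"))
print(f"latency ratio: {latency_ratio:.2f}, throughput ratio: {throughput_ratio:.2f}")
```

Both ratios land between 3.3 and 3.4, so the latency and throughput numbers agree with each other.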

Comment by David Daly [ 14/Nov/14 ]

The basic problem is that this is a reasonably configured system, and it didn't behave like this on 2.6.5.

Most likely this should be considered a performance issue against inserts.

I'm collecting more data to attach on insert latency.

Comment by Scott Hernandez (Inactive) [ 14/Nov/14 ]

What is the bug? "Excessive" logging for slow operations?

Slow operations are logged and always have been, so this seems neither surprising nor unexpected, nor is it new behavior.
