Type: Bug
Resolution: Done
Priority: Major - P3
Fix Version/s: 3.0.0-rc6
Affects Version/s: 2.8.0-rc4
Component/s: Storage, WiredTiger
Labels:
None

Backwards Compatibility:
Fully Compatible
Operating System:
ALL
Confidence Status:
None
Work Order:
3
CAR Domain/s:
None

Aha! Reference:
None
Tracking Level:
None
Risk Status:
None
Exec Notes:
None
Goal Name(s):
None
Goal Link:
None

I am updating a large collection (~500 GB of snappy-compressed data) in a non-sharded environment. I split the collection into chunks of 32 MB each using splitVector, then run some analysis on each record and update it (adding one small field). At the start everything is fine, with performance comparable to TokuMX, which I used earlier, but then performance degrades rapidly. Plots of some metrics are attached. Strangely, the disk is not fully utilized while the CPU is at 100%.
My database is hosted on an EC2 r3.4xlarge machine. Disks: 2 x 1 TB EBS volumes combined into RAID 0.
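
For readers trying to reproduce the workload, the chunked update described above could look roughly like the sketch below. It is not the reporter's script: the helper names (`chunk_ranges`, `range_filter`, `update_chunk`), the `analysis` field, and the assumption that chunks are split on `_id` are all hypothetical. `collection` is expected to behave like a PyMongo collection (`find` and `update_one`).

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

Bound = Optional[Any]  # None means "open ended" on that side


def chunk_ranges(split_keys: List[Any]) -> List[Tuple[Bound, Bound]]:
    """Turn N sorted split points into N+1 half-open [lower, upper) ranges."""
    bounds = [None] + list(split_keys) + [None]
    return list(zip(bounds[:-1], bounds[1:]))


def range_filter(lower: Bound, upper: Bound, field: str = "_id") -> Dict[str, Any]:
    """Build a query filter selecting documents with lower <= field < upper."""
    cond: Dict[str, Any] = {}
    if lower is not None:
        cond["$gte"] = lower
    if upper is not None:
        cond["$lt"] = upper
    return {field: cond} if cond else {}


def update_chunk(collection: Any, lower: Bound, upper: Bound,
                 analyze: Callable[[Dict[str, Any]], Any]) -> int:
    """Analyze every document in one chunk and add one small field to it.

    The field name "analysis" is a placeholder, not taken from the report.
    Returns the number of documents processed.
    """
    processed = 0
    for doc in collection.find(range_filter(lower, upper)):
        collection.update_one(
            {"_id": doc["_id"]},
            {"$set": {"analysis": analyze(doc)}},
        )
        processed += 1
    return processed
```

The split points would presumably come from the `splitVector` command's `splitKeys` result, with the key value extracted from each returned document before being passed to `chunk_ranges`. Timing each `update_chunk` call would give a per-chunk series comparable to the attached 5-update-time-per-chunk.png plot.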

Attachments:
1-CPU.png (41 kB, Dec 31 2014 07:19:30 AM UTC)
2-disk-read.png (54 kB, Dec 31 2014 07:19:30 AM UTC)
3-disk-write.png (69 kB, Dec 31 2014 07:19:30 AM UTC)
4-disk-idle.png (72 kB, Dec 31 2014 07:19:30 AM UTC)
5-update-time-per-chunk.png (63 kB, Dec 31 2014 07:19:30 AM UTC)
sysmon.py (2 kB, Jan 02 2015 08:02:09 PM UTC)

Assignee: Bruce Lucas (Inactive)
Reporter: Dmitriy Selivanov
Participants: Bruce Lucas, Daniel Pasette, Dmitriy Selivanov, Ramon Fernandez Marina
Votes: 0
Watchers: 6

Created: Dec 31 2014 07:19:30 AM UTC
Updated: Mar 23 2015 02:52:13 PM UTC
Resolved: Mar 23 2015 02:52:13 PM UTC
