[SERVER-18562] YCSB load phase (insert only) pushes 16-core machine to 100% CPU due to high-resolution timers Created: 19/May/15 Updated: 03/Jun/19 Resolved: 03/Jun/19 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Internal Code |
| Affects Version/s: | 3.1.3 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Critical - P2 |
| Reporter: | Eitan Klein | Assignee: | DO NOT USE - Backlog - Platform Team |
| Resolution: | Done | Votes: | 1 |
| Labels: | 32qa | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Operating System: | Windows | ||||||||||||||||||||||||
| Sprint: | Platform 6 07/17/15, Platform 7 08/10/15 | ||||||||||||||||||||||||
| Participants: | |||||||||||||||||||||||||
| Description |
|
MongoDB shell version: 3.1.3-pre-
Environment:
• Single mongod with WiredTiger as the storage engine
Workload:
• YCSB load phase
Issue: During an insert-only workload, it appears that the high-resolution counter responsible for flagging operations that take longer than X (100 ms by default) consumes 60% of the CPU. The impact is so large that it masks the difference between an SSD and a spinning disk on Windows. |
| Comments |
| Comment by Mark Benvenuto [ 30/Jun/15 ] | ||||||||||||||||||||||||||||||||||||||||||
|
I ran the following tests to compare the cost of various time APIs on different platforms. I was not interested in which platform is faster overall; I wanted to understand the relative performance of the timing APIs on each platform. Using Celero (https://github.com/DigitalInBlue/Celero), a microbenchmark framework, I evaluated the following time sources on three platforms, running 10 samples of 1,000,000 calls each in every case.
Note: curTimeMicros64 comes from MongoDB's time_support.cpp
Test Platforms
Results
Summary
On the Azure platforms, we see slower times for all counters, but the relative performance of the time sources is as expected. Overall, 2012 R2 is also better than 2008 R2 in this microbenchmark. The surprising result is that QPC is significantly slower, almost 4x, than the other time sources on EC2. | ||||||||||||||||||||||||||||||||||||||||||
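The kind of measurement described in this comment can be approximated without a framework. The sketch below is not the Celero benchmark used above and does not reproduce its numbers; it only illustrates timing a fixed number of calls to a time source and reporting the average cost per call. The names nsPerCall, readSteadyClock, readSystemClock, readQpc and readTickCount64 are hypothetical helpers introduced for illustration, and the Windows-only sources are guarded so the sketch still compiles elsewhere.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper: invoke `source` `calls` times and return the
// average cost of one call in nanoseconds.
template <typename Source>
double nsPerCall(Source source, std::uint64_t calls) {
    assert(calls > 0);
    std::uint64_t sink = 0;  // accumulate readings so the loop has an observable result
    const auto start = std::chrono::steady_clock::now();
    for (std::uint64_t i = 0; i < calls; ++i) {
        sink += source();
    }
    const auto stop = std::chrono::steady_clock::now();
    volatile std::uint64_t keep = sink;  // force the accumulated value to be stored
    (void)keep;
    const std::chrono::duration<double, std::nano> elapsed = stop - start;
    return elapsed.count() / static_cast<double>(calls);
}

// Portable time sources from the C++ standard library.
std::uint64_t readSteadyClock() {
    return static_cast<std::uint64_t>(
        std::chrono::steady_clock::now().time_since_epoch().count());
}

std::uint64_t readSystemClock() {
    return static_cast<std::uint64_t>(
        std::chrono::system_clock::now().time_since_epoch().count());
}

#ifdef _WIN32
// Windows time sources discussed in this ticket.
std::uint64_t readQpc() {
    LARGE_INTEGER value;
    QueryPerformanceCounter(&value);
    return static_cast<std::uint64_t>(value.QuadPart);
}

std::uint64_t readTickCount64() {
    return GetTickCount64();
}
#endif
```

Calling nsPerCall(readSteadyClock, 1000000) mirrors the 1,000,000-call sample size mentioned above; on Windows, readQpc and readTickCount64 can be passed the same way. A single run like this is noisy, so repeating it several times and comparing the samples is advisable before drawing conclusions.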
| Comment by Andy Schwerin [ 24/Jun/15 ] | ||||||||||||||||||||||||||||||||||||||||||
|
eitan.klein, why is this not a dupe of | ||||||||||||||||||||||||||||||||||||||||||
| Comment by Andy Schwerin [ 24/Jun/15 ] | ||||||||||||||||||||||||||||||||||||||||||
|
We should use QueryPerformanceCounter on systems where rdtsc is fast, as GetTickCount64 has 10ms resolution (actually, probably 1ms). | ||||||||||||||||||||||||||||||||||||||||||
| Comment by Eitan Klein [ 22/Jun/15 ] | ||||||||||||||||||||||||||||||||||||||||||
|
Per our discussion, I think we should use the GetTickCount64() API for the tracing methods. It should be good enough for our monitoring system, and I believe it is significantly faster. |
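A minimal sketch of this proposal, assuming the slow-operation check only needs millisecond granularity: a coarse tick source feeds a threshold comparison using the 100 ms default from the description. SlowOpTimer and coarseNowMillis are hypothetical names, not MongoDB code; on non-Windows builds the sketch falls back to std::chrono::steady_clock so it still compiles.

```cpp
#include <chrono>
#include <cstdint>

#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical coarse millisecond tick source. On Windows this uses
// GetTickCount64(); elsewhere it falls back to steady_clock.
inline std::uint64_t coarseNowMillis() {
#ifdef _WIN32
    return GetTickCount64();
#else
    using namespace std::chrono;
    return static_cast<std::uint64_t>(
        duration_cast<milliseconds>(steady_clock::now().time_since_epoch()).count());
#endif
}

// Hypothetical slow-operation timer: records a start tick and reports
// whether more than `thresholdMillis` have elapsed since then.
class SlowOpTimer {
public:
    explicit SlowOpTimer(std::uint64_t thresholdMillis = 100,
                         std::uint64_t startMillis = coarseNowMillis())
        : _thresholdMillis(thresholdMillis), _startMillis(startMillis) {}

    // Elapsed time relative to an explicit "now", which keeps the logic testable.
    std::uint64_t elapsedMillis(std::uint64_t nowMillis) const {
        return nowMillis - _startMillis;
    }

    bool isSlow(std::uint64_t nowMillis) const {
        return elapsedMillis(nowMillis) > _thresholdMillis;
    }

    // Convenience overload that reads the coarse clock itself.
    bool isSlow() const {
        return isSlow(coarseNowMillis());
    }

private:
    std::uint64_t _thresholdMillis;
    std::uint64_t _startMillis;
};
```

The trade-off is precision: a coarse tick only advances at the system timer interval, so elapsed times near the threshold can be off by roughly one tick. For a 100 ms slow-operation threshold that error may be acceptable, which is the premise of the suggestion above.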