[SERVER-3663] Mongod on windows performance degrades over time Created: 22/Aug/11 Updated: 17/Mar/16 Resolved: 17/Mar/16 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Performance, Stability |
| Affects Version/s: | 1.8.2, 2.0.0-rc0 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Spencer Brody (Inactive) | Assignee: | Unassigned |
| Resolution: | Done | Votes: | 2 |
| Labels: | Windows | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Environment: |
Windows 64 bit |
||
| Attachments: |
|
| Operating System: | Windows |
| Participants: |
| Description |
|
Started mongod on Windows and ran manyConnectionsTest.py from my Mac in 3 different shells. For one such run, performance initially stabilized at roughly 2000 reads/sec and 180 writes/sec. After a period of fluctuating read and write rates, it re-stabilized at 2000 reads/sec with only 128 writes/sec. After another burst of unstable performance, it re-stabilized at roughly 700 reads/sec and 83 writes/sec. After that there was no more large variability in performance; instead the write rate went slowly but steadily down as the read rate slowly went up. I checked in periodically and saw rates of 718 reads/sec w/ 61 writes/sec, 733 reads/sec w/ 46 writes/sec, 755 reads/sec w/ 23 writes/sec, and 763 reads/sec w/ 163 writes/sec. At around this point, performance suddenly dropped to zero and the python processes running manyConnectionsTest froze, printing an error message saying the find_one operation timed out. It was still possible, however, to create new connections to the mongod, and it could process queries from those new connections. |
| Comments |
| Comment by Dwight Merriman [ 07/Jan/12 ] |
|
for non-SRM, if we wrapped everything with a semaphore allowing, say, 100 maximum concurrent actors, this probably goes away. |
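A minimal Python sketch of the semaphore idea in the comment above, for illustration only. It is not mongod code: `ConcurrencyGate`, `handle_request`, and `run_workers` are hypothetical names, and the limit of 100 is just the "say, 100" figure from the comment. Each worker thread must acquire a slot before running its guarded work, so however many threads exist, at most `limit` of them are active at once.

```python
import threading
import time

MAX_CONCURRENT_ACTORS = 100  # the "say, 100" figure from the comment; an assumption


class ConcurrencyGate:
    """Hypothetical helper: caps how many threads run the guarded section at once."""

    def __init__(self, limit):
        self._sem = threading.BoundedSemaphore(limit)
        self._lock = threading.Lock()  # protects the counters below
        self._active = 0
        self.peak = 0  # highest number of simultaneously active threads seen

    def __enter__(self):
        self._sem.acquire()  # blocks while `limit` threads are already inside
        with self._lock:
            self._active += 1
            self.peak = max(self.peak, self._active)
        return self

    def __exit__(self, exc_type, exc, tb):
        with self._lock:
            self._active -= 1
        self._sem.release()
        return False  # do not swallow exceptions


def handle_request(gate, work):
    """Run one unit of work while holding a slot in the gate."""
    with gate:
        work()


def run_workers(gate, n_threads, work):
    """Start n_threads threads that all pass through the gate; return the peak concurrency."""
    threads = [
        threading.Thread(target=handle_request, args=(gate, work))
        for _ in range(n_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return gate.peak
```

For example, `run_workers(ConcurrencyGate(MAX_CONCURRENT_ACTORS), 1000, lambda: time.sleep(0.01))` would start roughly the number of threads seen in this ticket while never letting more than 100 of them into the guarded section at the same time.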
| Comment by Dwight Merriman [ 04/Sep/11 ] |
|
this seems to be caused by thread contention with ~1000 threads and slow writes; mongod then behaves poorly. using SlimReaderWriterLock solves it on windows, so we need to have a build that uses it. we also need to test on a couple of other platforms, as it may happen elsewhere. if the rwlock is done by the OS then it is probably ok, as the scheduler will be aware of it, so it is probably ok on Linux. |
| Comment by Eliot Horowitz (Inactive) [ 23/Aug/11 ] |
|
@dwight - any ideas? if not - can re-assign for a deeper dive |
| Comment by Spencer Brody (Inactive) [ 22/Aug/11 ] |
|
Tried to reproduce on Linux and failed.