[SERVER-5831] server stops processing requests during periodic flush Created: 12/May/12 Updated: 10/Dec/14 Resolved: 22/Nov/13 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Concurrency |
| Affects Version/s: | 2.0.4, 2.1.1 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Jared Rosoff | Assignee: | Andy Schwerin |
| Resolution: | Duplicate | Votes: | 2 |
| Labels: | performance | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Environment: |
Multiple; have run on my Mac and on AWS. |
||
| Attachments: | test program that reproduces the issue (see description); xls of syncdelay test results |
||
| Operating System: | ALL |
| Description |
|
The client is inserting records as quickly as it can. mongostat output:
You can see that inserts drop to zero for the few seconds before the flush happens; then, about a minute later, we see it again.
I've attached the test program that recreates this. |
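A minimal sketch of such a load generator, written for the mongo shell (the actual attached test program is not shown here; the collection name and document shape below are arbitrary choices):

    // Insert as fast as possible from a single shell; watch mongostat in a
    // second terminal. Inserts should drop to ~0 once per syncdelay interval
    // (default 60s) while the background flush runs. Stop with Ctrl-C.
    var coll = db.getSiblingDB("test").flushtest;
    while (true) {
        coll.insert({ ts: new Date(), payload: new Array(1024).join("x") });  // ~1 KB document
    }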
| Comments |
| Comment by Antoine Girbal [ 24/Aug/12 ] |
|
Just for the record, from an email thread… |
| Comment by Jared Rosoff [ 11/Jun/12 ] |
|
Attaching test results showing the impact of sync delay. I only ran each test for 2 minutes; it would probably be good to run each one for longer. The first worksheet of the attached xls summarizes each syncdelay setting for the same workload; subsequent sheets have the raw data. |
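For anyone re-running these tests: syncdelay is the interval between background flushes, 60 seconds by default. It can be set at startup (mongod --syncdelay <seconds>) or at runtime from the mongo shell, as in this sketch:

    db.adminCommand({ setParameter: 1, syncdelay: 120 });  // flush every 120 seconds
    db.adminCommand({ getParameter: 1, syncdelay: 1 });    // read back the current value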
| Comment by Eliot Horowitz (Inactive) [ 14/May/12 ] |
|
Maybe. |
| Comment by Andy Schwerin [ 14/May/12 ] |
|
Presumably we're the party making the disk busy. In the future we might want to at least smooth out this behavior, so that there aren't multi-second windows in which no writes are possible. |
| Comment by Eliot Horowitz (Inactive) [ 14/May/12 ] |
|
Not sure there is much we can do. |
| Comment by Jared Rosoff [ 13/May/12 ] |
|
Journal on a separate device seems to fix it. Here's output on an EC2 machine with the journal & data volumes on their own EBS volumes:

    42700 0 0 0 0 1 0 1.95g 4.66g 1.6g 0 50.1 0 0|0 0|1 23m 1k 2 15:51:19 |
| Comment by Eliot Horowitz (Inactive) [ 13/May/12 ] |
|
Ah. |
| Comment by Jared Rosoff [ 13/May/12 ] |
|
BTW, running with --nojournal makes this go away. Could the journal group commit be blocked on completion of the background flush? Output with mongod --nojournal:

    17500 0 0 0 0 1 0 5.95g 8.33g 1005m 103 0 0|0 0|1 9m 1k 2 21:46:55 |
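One way to probe the group-commit hypothesis from the mongo shell: when journaling is enabled, serverStatus() exposes a dur section with per-interval commit counts and timing. A sketch (the field names are from that dur section; correlating them with the flush timing is the idea, not a confirmed diagnosis):

    // If group commits are blocked behind the background flush, commits per
    // interval should collapse and journal write time should spike while a
    // long flush (backgroundFlushing.last_ms) is in progress.
    var s = db.serverStatus();
    print("commits this interval: " + s.dur.commits);
    printjson(s.dur.timeMs);  // prepLogBuffer, writeToJournal, writeToDataFiles, remapPrivateView
    print("last flush: " + s.backgroundFlushing.last_ms + " ms");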
| Comment by Jared Rosoff [ 13/May/12 ] |
|
Are the db.serverStatus().backgroundFlushing stats useful? Here was the mongostat output during the pause:

    14000 0 0 0 0 1 0 3.95g 10.3g 2.29g 96.7 0 0|0 0|1 7m 1k 3 21:17:03

And here was db.serverStatus().backgroundFlushing immediately after the pause:

    > db.serverStatus().backgroundFlushing |
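For reference, the backgroundFlushing document has this shape (the field names are real; the values below are illustrative placeholders, not the ones from this run):

    {
        "flushes" : 1523,        // flushes since startup
        "total_ms" : 712000,     // cumulative time spent flushing
        "average_ms" : 467.5,    // total_ms / flushes
        "last_ms" : 4300,        // duration of the most recent flush
        "last_finished" : ISODate("2012-05-13T21:17:05Z")
    }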
| Comment by Andy Schwerin [ 13/May/12 ] |
|
Interesting. Two questions: (1) is it possible to tell how long the flushAll() operation takes? (2) does it even make sense to periodically flush like this when durability is enabled? The durability subsystem separately handles flushing the shared view to disk and committing journal writes, which leaves open the question of why the background flush thread is even necessary with durability. |
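On question 1, backgroundFlushing.last_ms already records the duration of each flush. A minimal mongo-shell sketch that prints it as each flush completes (the 1-second poll interval is an arbitrary choice):

    var prev = db.serverStatus().backgroundFlushing.flushes;
    while (true) {
        var bf = db.serverStatus().backgroundFlushing;
        if (bf.flushes > prev) {  // a new flush just finished
            print(bf.last_finished + ": flushAll took " + bf.last_ms + " ms");
            prev = bf.flushes;
        }
        sleep(1000);  // mongo shell built-in; argument is milliseconds
    }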
| Comment by Jared Rosoff [ 13/May/12 ] |
|
Two writer processes:

    19300 0 0 0 0 1 0 1.95g 6.32g 1.26g 105 0 0|1 0|2 10m 1k 3 20:05:01

Three writer processes:

    10100 0 0 0 0 1 0 3.95g 10.3g 1.89g 93.3 0 0|2 0|3 5m 1k 4 20:08:02 |
| Comment by Scott Hernandez (Inactive) [ 13/May/12 ] |
|
Can you run more than one process and see if the behavior changes? |
| Comment by Jared Rosoff [ 13/May/12 ] |
|
ext4. |
| Comment by Eliot Horowitz (Inactive) [ 13/May/12 ] |
|
What Linux filesystem? |