[SERVER-24306] 40-second journaling stall from "log files prepared" to checkpoint Created: 27/May/16 Updated: 23/Nov/16 Resolved: 01/Jun/16 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | WiredTiger |
| Affects Version/s: | 3.2.5, 3.2.6 |
| Fix Version/s: | 3.2.7, 3.3.8 |
| Type: | Bug | Priority: | Critical - P2 |
| Reporter: | Bruce Lucas (Inactive) | Assignee: | Michael Cahill (Inactive) |
| Resolution: | Done | Votes: | 0 |
| Labels: | code-only | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Backwards Compatibility: | Fully Compatible | ||||
| Operating System: | ALL | ||||
| Backport Completed: | |||||
| Participants: | |||||
| Description |
Following stack, captured at C above, seems to be typical of stalled operations:
Attaching diagnostic.data and stack traces (with timestamps) captured every 5 seconds during the run. Note: the above repro was with the journal on SSD, so it reached the point where the "log files used" and "log files prepared" counters were bumped fairly quickly, before the next checkpoint. With the journal on HDD, the problem still reproduces, but the operation rate is much slower, so it takes much longer (several checkpoints) to reach the point where "log files used" and "log files prepared" bump, at which point operations stall until the next checkpoint. |
| Comments |
| Comment by Githook User [ 28/Jul/16 ] |
|
Author: Ramon Fernandez <ramon@mongodb.com>
Message: Import wiredtiger-wiredtiger-2.8.0-592-g848e5f5.tar.gz from wiredtiger branch mongodb-3.2 ref: 8b7110b..848e5f5 This commit replaces a number of previous backports with the original
|
| Comment by Githook User [ 27/Jul/16 ] |
|
Author: Ramon Fernandez <ramon@mongodb.com>
Message: Import wiredtiger-wiredtiger-2.8.0-592-g848e5f5.tar.gz from wiredtiger branch mongodb-3.2 ref: 8b7110b..848e5f5 This commit replaces a number of previous backports with the original
|
| Comment by Githook User [ 26/Jul/16 ] |
|
Author: Michael Cahill (michaelcahill) <michael.cahill@mongodb.com>
Message:
|
| Comment by Githook User [ 26/Jul/16 ] |
|
Author: Michael Cahill (michaelcahill) <michael.cahill@mongodb.com>
Message:
|
| Comment by Michael Cahill (Inactive) [ 01/Jun/16 ] |
|
bruce.lucas, thanks as always for the clear report and simple repro. This should be fixed by the latest drops into master and v3.2 (for 3.2.7). Note that this pause would only happen in single-threaded workloads: when there was a log file switch, log_flush was waiting at the beginning of the new log file for the write point to advance, but it wasn't moving into the new log file until something was written to it. With multiple threads writing, something would be written quickly and the stall wouldn't happen. |
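The condition described in the comment above can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToyLog`, `write_lsn`, `switch_file`, and `flush_ready` are hypothetical names invented for illustration and are not WiredTiger's actual structures or functions. The model assumes positions in the log are `(file, offset)` pairs and that a flush waits until the write point reaches its target.

```python
from dataclasses import dataclass

# Toy model only: ToyLog, write_lsn, switch_file and flush_ready are
# hypothetical names, not WiredTiger's real data structures or API.


@dataclass
class ToyLog:
    file_id: int = 1
    next_offset: int = 0
    write_lsn: tuple = (1, 0)  # (file, offset) reached by completed writes

    def write(self, nbytes: int) -> None:
        """Append a record to the current file and advance the write point."""
        self.next_offset += nbytes
        self.write_lsn = (self.file_id, self.next_offset)

    def switch_file(self) -> tuple:
        """Start a new log file and return the position of its beginning.

        The write point is deliberately left in the old file: in this model
        it only moves into the new file once something is written there.
        """
        self.file_id += 1
        self.next_offset = 0
        return (self.file_id, 0)

    def flush_ready(self, target: tuple) -> bool:
        """A flush waiting on `target` can finish once the write point reaches it."""
        return self.write_lsn >= target


log = ToyLog()
log.write(100)                         # write point is (1, 100)
target = log.switch_file()             # the flush now waits on (2, 0)

# Single-threaded case: the only writer is the one waiting in the flush,
# so nothing moves the write point into the new file.
stalled = not log.flush_ready(target)

# Any other log write (a second thread, or whatever is logged later)
# moves the write point into file 2 and releases the waiter.
log.write(10)
released = log.flush_ready(target)

print(stalled, released)
```

In this model the waiter is stuck for exactly as long as no other write lands in the new file, which matches the observation that multi-threaded workloads did not show the stall.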
| Comment by Githook User [ 01/Jun/16 ] |
|
Author: Michael Cahill (michaelcahill) <michael.cahill@mongodb.com>
Message: Import wiredtiger-wiredtiger-2.8.0-209-g234b68b.tar.gz from wiredtiger branch mongodb-3.2 ref: 88b898e..234b68b
|
| Comment by Githook User [ 01/Jun/16 ] |
|
Author: David Hows (daveh86) <howsdav@gmail.com>
Message: Import wiredtiger-wiredtiger-2.8.0-449-gff108d7.tar.gz from wiredtiger branch mongodb-3.4 ref: 6f9a7a4..ff108d7
|
| Comment by Githook User [ 01/Jun/16 ] |
|
Author: Michael Cahill (michaelcahill) <michael.cahill@mongodb.com>
Message:
(cherry picked from commit b89aaece7b2a58d183a0a2b33e20157ad7f02258) |
| Comment by Githook User [ 01/Jun/16 ] |
|
Author: Michael Cahill (michaelcahill) <michael.cahill@mongodb.com>
Message:
|
| Comment by Githook User [ 31/May/16 ] |
|
Author: Michael Cahill (michaelcahill) <michael.cahill@mongodb.com>
Message:
|
| Comment by Bruce Lucas (Inactive) [ 27/May/16 ] |
|
Does not repro in 3.2.0, 3.2.3, or 3.2.4, but does in 3.2.5 and 3.2.6. |