[SERVER-18014] Dropping a collection can block creating a new collection for an extended time under WiredTiger Created: 13/Apr/15 Updated: 15/Dec/15 Resolved: 27/Apr/15 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | WiredTiger |
| Affects Version/s: | 3.0.1 |
| Fix Version/s: | 3.0.3, 3.1.3 |
| Type: | Bug | Priority: | Critical - P2 |
| Reporter: | Bruce Lucas (Inactive) | Assignee: | Michael Cahill (Inactive) |
| Resolution: | Done | Votes: | 0 |
| Labels: | ET | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Attachments: |
| Issue Links: |
| Backwards Compatibility: | Fully Compatible | ||||||||||||
| Backport Completed: | |||||||||||||
| Participants: | |||||||||||||
| Description |
|
Issue Status as of Apr 29, 2015
ISSUE SUMMARY | USER IMPACT | WORKAROUNDS | AFFECTED VERSIONS | FIX VERSION

Original description

Dropping a collection can take a long time under WiredTiger because it can require freeing a large number of allocated buffers. See

While this is occurring, creating a new collection may be blocked for a substantial time. In this test the collection was created implicitly by inserting a record into a non-existent collection, so the stack traces below show the createCollection happening within insertOne. It is blocked in two different places:
|
| Comments |
| Comment by Githook User [ 18/May/15 ] |
|
Author: {u'username': u'michaelcahill', u'name': u'Michael Cahill', u'email': u'michael.cahill@mongodb.com'}
Message: Don't try to checkpoint dead handles. A handle can be busy when a checkpoint starts (e.g., in the middle of a bulk load), then be dead by the time the checkpoint visits it (e.g., a forced drop happens after the checkpoint starts). refs |
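The race described in this commit message can be illustrated with a minimal Python sketch. It is not WiredTiger code: `Handle`, `checkpoint`, and the `before_visit` hook are hypothetical names, and the hook only simulates a forced drop that lands after the checkpoint has started but before it reaches the handle.

```python
class Handle:
    """Hypothetical stand-in for a data handle; not a WiredTiger type."""

    def __init__(self, name):
        self.name = name
        self.dead = False


def checkpoint(handles, before_visit=None):
    """Return the names of the handles that were checkpointed.

    The handle list is snapshotted when the checkpoint starts, so a handle
    can be marked dead between the snapshot and the visit. The dead flag is
    therefore re-checked at visit time and dead handles are skipped.
    """
    done = []
    for handle in list(handles):      # snapshot taken at checkpoint start
        if before_visit is not None:
            before_visit(handle)      # hook simulating concurrent activity
        if handle.dead:
            continue                  # dropped after the checkpoint started
        done.append(handle.name)
    return done
```

The point of the sketch is only that liveness is tested when the handle is visited, not when the checkpoint's handle list is built.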
| Comment by Githook User [ 18/May/15 ] |
|
Author: {u'username': u'michaelcahill', u'name': u'Michael Cahill', u'email': u'michael.cahill@mongodb.com'}
Message: Add forced drops to test/fops. We have seen failures where a checkpoint interleaves between a bulk load and a forced drop to try to operate on a dead handle. refs |
| Comment by Githook User [ 27/Apr/15 ] |
|
Author: {u'username': u'agorrod', u'name': u'Alex Gorrod', u'email': u'alexg@wiredtiger.com'}
Message: If getting a handle lock only - don't propagate WT_NOTFOUND. It's expected after the background drop changes. |
| Comment by Githook User [ 27/Apr/15 ] |
|
Author: {u'username': u'michaelcahill', u'name': u'Michael Cahill', u'email': u'michael.cahill@mongodb.com'}
Message: Close block manager handles as soon as a handle is marked dead. refs |
| Comment by Githook User [ 27/Apr/15 ] |
|
Author: {u'username': u'michaelcahill', u'name': u'Michael Cahill', u'email': u'michael.cahill@mongodb.com'}
Message: Discard trees from cache in the background. We used to keep handles locked while freeing their pages from cache (either for drops or when sweeping old handles). If an application thread attempted to open a cursor during one of these operations, it was forced to wait until the discard completed. With this change, handles are marked "dead", and readers will no longer use them. The sweep server will later discard dead trees from cache in the background, without holding any locks that application threads should block on. refs |
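The pattern this commit message describes (mark the handle dead quickly, free its pages later without the shared lock) can be sketched in Python as follows. This is an illustration under stated assumptions, not the WiredTiger implementation: `Handle`, `HandleCache`, `drop`, and `sweep` are hypothetical names, and a plain list stands in for cached pages.

```python
import threading
from collections import deque


class Handle:
    """Hypothetical stand-in for a cached tree handle."""

    def __init__(self, name, pages):
        self.name = name
        self.pages = list(pages)  # stands in for the tree's cached pages
        self.dead = False


class HandleCache:
    """Hypothetical handle cache with deferred discard of dropped trees."""

    def __init__(self):
        self._lock = threading.Lock()
        self._live = {}           # name -> live handle
        self._dead = deque()      # handles waiting for the sweeper

    def open(self, name, pages=()):
        """Return the live handle for name, creating it if needed."""
        with self._lock:
            handle = self._live.get(name)
            if handle is None:
                handle = Handle(name, pages)
                self._live[name] = handle
            return handle

    def drop(self, name):
        """Mark the handle dead and unlink it; no pages are freed here."""
        with self._lock:
            handle = self._live.pop(name, None)
            if handle is None:
                return False
            handle.dead = True
            self._dead.append(handle)
            return True

    def sweep(self):
        """Free the pages of dead handles; return how many were freed.

        The lock is held only to dequeue a handle. The expensive freeing
        step runs outside it, so open() and drop() are not blocked.
        """
        freed = 0
        while True:
            with self._lock:
                if not self._dead:
                    return freed
                handle = self._dead.popleft()
            freed += len(handle.pages)
            handle.pages.clear()
```

In this sketch `drop` does constant work under the lock, so a later `open` of the same name returns a fresh handle immediately, while `sweep` would be called from a background thread.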
| Comment by Bruce Lucas (Inactive) [ 23/Apr/15 ] |
|
Thanks michael.cahill, that build does the trick. The drop command doesn't show up at all in the stack trace sample (because it happened too quickly). A short time later the sweep server is seen spending a bunch of time in __wt_conn_btree_sync_and_close (most of that in free, because this was Windows...), and things like listDatabases and createCollection are not blocked. Ship it! |
| Comment by Githook User [ 16/Apr/15 ] |
|
Author: {u'username': u'michaelcahill', u'name': u'Michael Cahill', u'email': u'michael.cahill@mongodb.com'}
Message: Discard trees from cache in the background. We used to keep handles locked while freeing their pages from cache (either for drops or when sweeping old handles). If an application thread attempted to open a cursor during one of these operations, it was forced to wait until the discard completed. With this change, handles are marked "dead", and readers will no longer use them. The sweep server will later discard dead trees from cache in the background, without holding any locks that application threads should block on. refs |