  1. Core Server
  2. SERVER-18014

Dropping a collection can block creating a new collection for an extended time under WiredTiger

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical - P2
    • Resolution: Fixed
    • Affects Version/s: 3.0.1
    • Fix Version/s: 3.0.3, 3.1.3
    • Component/s: WiredTiger
    • Labels:
    • Backwards Compatibility: Fully Compatible
    • Backport Completed:

      Description

      Issue Status as of Apr 29, 2015

      ISSUE SUMMARY
      Because of the way the WiredTiger storage engine handles dropping and creating collections, dropping a collection and freeing the resources it used can block the creation of new collections.

      USER IMPACT
      Workloads that involve large volumes of collection drop operations may experience performance degradation.

      WORKAROUNDS
      None.

      AFFECTED VERSIONS
      3.0.0, 3.0.1, 3.0.2

      FIX VERSION
      The fix is included in the 3.0.3 production release.

      Original description

      Dropping a collection can take a long time under WiredTiger because it can require freeing a large number of allocated buffers. See SERVER-17907 for information on reproducing.

      While this is occurring, creating a new collection may be blocked for a substantial time. The create was accomplished in this test by inserting a record into a non-existent collection, so the stack traces below show the createCollection happening within insertOne. It is blocked in two different places:

      • From A to B it's blocked waiting for the db lock. This occurs synchronously with the drop command itself, which is busy freeing buffers.
      • From B to C it's blocked waiting to update the metadata table. This occurs synchronously with dropAllQueued (not shown in the screenshot below), which is also busy freeing buffers.

      The total timeline below is about 3.5 minutes, so the total time blocked is about 2 minutes.
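
The blocking pattern described above can be illustrated with a small, self-contained Python model. This is only a sketch under stated assumptions, not MongoDB or WiredTiger code: `db_lock`, `drop_collection`, `create_collection`, and `run_demo` are hypothetical names, a single lock stands in for both the db lock and the metadata table update, and a short sleep stands in for the cost of freeing one buffer.

```python
import threading
import time

# Hypothetical stand-in for the lock that both drop and create need.
db_lock = threading.Lock()


def drop_collection(buffers, holding_lock, log, free_cost=0.001):
    """Free every buffer while holding the lock (the behavior reported here)."""
    with db_lock:
        holding_lock.set()          # tell the caller the drop is under way
        while buffers:
            buffers.pop()
            time.sleep(free_cost)   # pretend each free is slow
        log.append("drop finished")


def create_collection(name, catalog, log):
    """Needs the same lock, so it waits for any drop that is in progress."""
    start = time.monotonic()
    with db_lock:
        catalog.add(name)
        log.append("create finished")
    return time.monotonic() - start


def run_demo(n_buffers=100):
    """Start a drop, then time a create that arrives while the drop runs."""
    buffers, catalog, log = list(range(n_buffers)), set(), []
    holding_lock = threading.Event()
    dropper = threading.Thread(
        target=drop_collection, args=(buffers, holding_lock, log)
    )
    dropper.start()
    holding_lock.wait()             # the drop now owns the lock
    waited = create_collection("newcoll", catalog, log)
    dropper.join()
    return waited, catalog, log
```

Calling `run_demo()` returns the time the create spent waiting along with the order of events. In this model the create can only complete after the drop has freed its last buffer, so the wait grows with the number of buffers being freed.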

      1. procdump.html
        443 kB
        Bruce Lucas
      1. create.png
        496 kB
      2. freeing.png
        358 kB
      3. linux-02.png
        165 kB
      4. linux-03.png
        197 kB
      5. patch-56-sync-drop.png
        338 kB

          Activity

          xgen-internal-githook Githook User added a comment -

          Author: Michael Cahill (michaelcahill, michael.cahill@mongodb.com)

          Message: Discard trees from cache in the background.

          We used to keep handles locked while freeing their pages from cache (either for drops or when sweeping old handles). If an application thread attempted to open a cursor during one of these operations, it was forced to wait until the discard completed.

          With this change, handles are marked "dead", and readers will no longer use them. The sweep server will later discard dead trees from cache in the background, without holding any locks that application threads should block on.

          refs SERVER-17907, SERVER-18014
          Branch: tree-discard-background
          https://github.com/wiredtiger/wiredtiger/commit/440cbc76902432eb233b8a8bda1df1265bdd6e46
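
The idea in this commit message (mark the handle dead, then discard its pages later without holding a lock that application threads wait on) can be sketched as a toy Python model. Everything here is an assumption made for illustration: `Handle`, `HandleTable`, and their methods are hypothetical names and do not correspond to WiredTiger's internal API, and `sweep` is called directly rather than from a background thread.

```python
import threading


class Handle:
    """Toy stand-in for a tree handle with cached pages."""

    def __init__(self, name, pages):
        self.name = name
        self.pages = list(pages)
        self.dead = False


class HandleTable:
    def __init__(self):
        self._lock = threading.Lock()   # protects the two containers below
        self._live = {}                 # name -> Handle visible to readers
        self._dead = []                 # handles waiting to be discarded

    def create(self, name, pages=()):
        with self._lock:
            if name in self._live:
                raise KeyError(f"{name} already exists")
            self._live[name] = Handle(name, pages)

    def get(self, name):
        """Readers only ever see live handles."""
        with self._lock:
            return self._live.get(name)

    def drop(self, name):
        """Mark the handle dead and unlink it; no pages are freed here."""
        with self._lock:
            handle = self._live.pop(name)
            handle.dead = True
            self._dead.append(handle)

    def sweep(self):
        """Discard pages of dead handles outside the table lock."""
        with self._lock:
            dead, self._dead = self._dead, []
        freed = 0
        for handle in dead:             # the slow part runs without the lock
            freed += len(handle.pages)
            handle.pages.clear()
        return freed
```

With this shape, `drop` does a constant amount of work under the lock, so a `create` of the same name can proceed immediately afterwards; the cost of freeing pages moves into `sweep`, which holds the lock only long enough to take ownership of the dead list.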

          bruce.lucas Bruce Lucas added a comment -

          Thanks Michael Cahill, that build does the trick. The drop command doesn't show up at all in the stack trace sample (because it happened too quickly). A short time later the sweep server is seen spending a bunch of time in __wt_conn_btree_sync_and_close (most of that in free, because this was Windows...), and things like listDatabases and createCollection are not blocked. Ship it!

          xgen-internal-githook Githook User added a comment -

          Author: Michael Cahill (michaelcahill, michael.cahill@mongodb.com)

          Message: Discard trees from cache in the background.

          We used to keep handles locked while freeing their pages from cache (either for drops or when sweeping old handles). If an application thread attempted to open a cursor during one of these operations, it was forced to wait until the discard completed.

          With this change, handles are marked "dead", and readers will no longer use them. The sweep server will later discard dead trees from cache in the background, without holding any locks that application threads should block on.

          refs SERVER-17907, SERVER-18014
          Branch: develop
          https://github.com/wiredtiger/wiredtiger/commit/440cbc76902432eb233b8a8bda1df1265bdd6e46

          xgen-internal-githook Githook User added a comment -

          Author: Michael Cahill (michaelcahill, michael.cahill@mongodb.com)

          Message: Close block manager handles as soon as a handle is marked dead.

          refs SERVER-18014
          Branch: develop
          https://github.com/wiredtiger/wiredtiger/commit/f1d1d01d0759263b8f18719f63f08696a15e9f91

          xgen-internal-githook Githook User added a comment -

          Author: Alex Gorrod (agorrod, alexg@wiredtiger.com)

          Message: If getting a handle lock only - don't propagate WT_NOTFOUND.

          It's expected after the background drop changes.
          Refs SERVER-18014
          Branch: develop
          https://github.com/wiredtiger/wiredtiger/commit/5991f88fefb1a5989f9b3633b7cd5c0dc1d57854

          xgen-internal-githook Githook User added a comment -

          Author: Michael Cahill (michaelcahill, michael.cahill@mongodb.com)

          Message: Add forced drops to test/fops. We have seen failures where a checkpoint interleaves between a bulk load and a forced drop and then tries to operate on a dead handle.

          refs SERVER-18014
          Branch: develop
          https://github.com/wiredtiger/wiredtiger/commit/80cc29d3531500a529cf044f9419195fe54f9a47

          xgen-internal-githook Githook User added a comment -

          Author: Michael Cahill (michaelcahill, michael.cahill@mongodb.com)

          Message: Don't try to checkpoint dead handles.

          A handle can be busy when a checkpoint starts (e.g., in the middle of a bulk load) and then be dead by the time the checkpoint visits it (e.g., a forced drop happens after the checkpoint starts).

          refs SERVER-18014
          Branch: develop
          https://github.com/wiredtiger/wiredtiger/commit/c207b64a1f7be98ea65a3ae95407c4da1353498a
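
The race this commit message describes can be sketched in a few lines of Python. This is a hypothetical model rather than WiredTiger's checkpoint code: `Handle`, `checkpoint`, and the `before_visit` hook are names invented here, and the hook exists only to simulate a forced drop that lands after the checkpoint has started.

```python
from dataclasses import dataclass


@dataclass
class Handle:
    """Toy handle: only the name and the dead flag matter for this sketch."""
    name: str
    dead: bool = False


def checkpoint(handles, before_visit=None):
    """Snapshot the handle list up front, then re-check each handle on visit."""
    snapshot = list(handles)            # handles known when the checkpoint starts
    written = []
    for handle in snapshot:
        if before_visit is not None:
            before_visit(handle)        # hook for simulating concurrent activity
        if handle.dead:
            continue                    # dropped since the snapshot: skip it
        written.append(handle.name)
    return written
```

For example, with `handles = [Handle("a"), Handle("b")]`, passing `before_visit=lambda h: setattr(handles[1], "dead", True)` marks `b` dead after the snapshot is taken, and the checkpoint then skips it instead of operating on a dead handle.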

