[SERVER-21997] kill_cursors.js deadlocks Created: 21/Dec/15 Updated: 16/Aug/22 Resolved: 22/Dec/15 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | MMAPv1, Querying |
| Affects Version/s: | None |
| Fix Version/s: | 3.2.3, 3.3.0 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Eric Milkie | Assignee: | David Storch |
| Resolution: | Done | Votes: | 0 |
| Labels: | test-only |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: |
|
| Backwards Compatibility: | Fully Compatible |
| Operating System: | ALL |
| Backport Completed: | |
| Sprint: | QuInt E (01/11/16) |
| Participants: | |
| Description |
|
In MMAPv1 tests, the new failpoint in kill_cursors.js can deadlock with the journal flush, because the thread that hits the failpoint spins in a tight while loop while holding a database lock. Example callstacks: |
| Comments |
| Comment by Githook User [ 11/Jan/16 ] |
|
Author: David Storch (dstorch) <david.storch@10gen.com>
Message: (cherry picked from commit cff8decf7ecebb69f82231c994a8b1a52234ba08) |
| Comment by Githook User [ 22/Dec/15 ] |
|
Author: David Storch (dstorch) <david.storch@10gen.com>
Message: |
| Comment by David Storch [ 22/Dec/15 ] |
|
kill_cursors.js includes a test for killing a pinned cursor. Since cursors are generally pinned only for short periods of time (i.e. the time required to complete the getMore operation against that cursor), the test enables a fail point which causes the getMore thread to busy-wait after pinning the cursor. This can cause a deadlock as follows:
|
| Comment by Mark Benvenuto [ 22/Dec/15 ] |
|
It also hangs on my PPC64le machine. The blocking thread is holding a lock that the killCursor, clientCursorManager, and TTLMonitor threads are all waiting on. |
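
The general hazard discussed in this ticket, a thread busy-waiting on a fail point while still holding a lock that other threads need, can be illustrated with a small standalone sketch. This is not the server code and not the committed fix: `db_lock`, `fail_point_on`, `get_more`, and `run` are hypothetical names invented for the illustration, and the acquire timeout stands in for what would otherwise be an indefinite wait.

```python
import threading
import time


def run(hold_lock_while_waiting: bool, timeout: float = 0.2) -> bool:
    """Return True if a second thread could take the lock while the
    (hypothetical) fail point was still enabled."""
    db_lock = threading.Lock()         # hypothetical stand-in for a database lock
    fail_point_on = threading.Event()  # hypothetical stand-in for the fail point
    fail_point_on.set()
    waiting = threading.Event()        # set once the worker starts busy-waiting

    def get_more():
        if hold_lock_while_waiting:
            # Problem pattern: spin on the fail point with the lock held.
            with db_lock:
                waiting.set()
                while fail_point_on.is_set():
                    time.sleep(0.001)
        else:
            # Alternative pattern: do the locked work, release, then spin.
            with db_lock:
                pass  # e.g. pin the cursor
            waiting.set()
            while fail_point_on.is_set():
                time.sleep(0.001)

    worker = threading.Thread(target=get_more)
    worker.start()
    waiting.wait()

    # A second thread (here the main thread) needs the same lock.
    acquired = db_lock.acquire(timeout=timeout)
    if acquired:
        db_lock.release()

    # Disable the fail point so the worker can finish and be joined.
    fail_point_on.clear()
    worker.join()
    return acquired


if __name__ == "__main__":
    print("lock available while spinning with lock held:", run(True))
    print("lock available while spinning after release: ", run(False))
```

In the first call the second thread times out because the spinning thread never gives up the lock; in the second call the lock is free while the worker waits. In a real system there is no timeout, so if the thread that would eventually disable the fail point is itself stuck behind the lock, nothing makes progress.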