[SERVER-21710] Allow pinned ClientCursors to be killed on mongod Created: 01/Dec/15 Updated: 29/Jan/18 Resolved: 10/Jan/18 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Querying |
| Affects Version/s: | None |
| Fix Version/s: | 3.7.1 |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | David Storch | Assignee: | Ian Boros |
| Resolution: | Done | Votes: | 1 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: |
|
| Backwards Compatibility: | Fully Compatible |
| Sprint: | Query 2018-01-01, Query 2018-01-15 |
| Participants: |
| Description |
|
On mongod, a killCursors command or OP_KILL_CURSORS message acting on a pinned cursor will fail to kill the cursor. By contrast, the mongos ClusterCursorManager can kill a live cursor regardless of whether it is pinned. We should change the mongod behavior to be consistent with that of mongos. The killCursors operation should make a best effort to kill all cursors, irrespective of whether the cursor is currently in use. Failing to kill in this edge case opens the possibility of leaked ClientCursors. |
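As an illustration of the requested behavior, here is a minimal Python sketch of a cursor registry that accepts a kill request for a pinned cursor by marking it and reaping it when it is unpinned. This is a toy model only: `CursorManager`, `kill_pending`, and every method name are hypothetical and are not taken from the mongod or mongos source.

```python
import threading


class CursorManager:
    """Toy cursor registry (hypothetical; not the mongod implementation)."""

    def __init__(self):
        self._lock = threading.Lock()
        # cursor id -> {"pinned": bool, "kill_pending": bool}
        self._cursors = {}

    def register(self, cursor_id):
        with self._lock:
            self._cursors[cursor_id] = {"pinned": False, "kill_pending": False}

    def pin(self, cursor_id):
        """Check the cursor out for use, e.g. while a getMore runs."""
        with self._lock:
            cursor = self._cursors.get(cursor_id)
            if cursor is None or cursor["kill_pending"]:
                raise KeyError(f"cursor {cursor_id} not found")
            if cursor["pinned"]:
                raise RuntimeError(f"cursor {cursor_id} already in use")
            cursor["pinned"] = True

    def unpin(self, cursor_id):
        """Return the cursor; reap it if a kill arrived while it was pinned."""
        with self._lock:
            cursor = self._cursors[cursor_id]
            cursor["pinned"] = False
            if cursor["kill_pending"]:
                del self._cursors[cursor_id]

    def kill(self, cursor_id):
        """Best-effort kill: succeeds whether or not the cursor is pinned."""
        with self._lock:
            cursor = self._cursors.get(cursor_id)
            if cursor is None:
                return False
            if cursor["pinned"]:
                # The cursor is in use: mark it so it is reaped on unpin.
                cursor["kill_pending"] = True
            else:
                del self._cursors[cursor_id]
            return True

    def is_registered(self, cursor_id):
        with self._lock:
            return cursor_id in self._cursors
```

In this model a kill on a pinned cursor never fails, so the cursor cannot leak: it either disappears immediately or as soon as the operation using it returns it.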
| Comments |
| Comment by Githook User [ 10/Jan/18 ] | ||||
|
Author: Ian Boros (ian.boros@10gen.com) Message: | ||||
| Comment by David Storch [ 08/Nov/17 ] | ||||
|
Yeah, that's known behavior, thanks for the correction. Changing that should fall into the scope of this ticket as well. | ||||
| Comment by Jeffrey Yemin [ 07/Nov/17 ] | ||||
|
FWIW killSessions doesn't work when there is a pinned change stream cursor associated with the session:
| ||||
| Comment by David Storch [ 07/Nov/17 ] | ||||
|
I see. Clients should not attempt to issue killCursors from a separate thread. Since killing pinned cursors has never been supported, killCursors is a way for the thread that is iterating the cursor to safely abandon it without exhausting the result set. It has never been a way to kill an active cursor from another thread. That's what killOp/killSessions are for. $changeStream cursors, and tailable cursors in general, are certainly more susceptible to this problem since cursors are pinned for longer. But it's never a good idea to use killCursors while a getMore may be in progress on versions 3.6 and older. Clearly it would be desirable to allow applications to behave as you describe against future versions of the server. As I alluded to above, we're hoping to schedule this ticket as part of an effort to make killing queries easier, especially queries against sharded clusters. | ||||
| Comment by Jeffrey Yemin [ 07/Nov/17 ] | ||||
|
david.storch a change stream cursor will typically remain pinned for the majority of its lifetime when the collection that is being watched is mostly idle. Furthermore, because drivers generally iterate cursors in a blocking fashion, for tailable cursors there is typically a loop within the call to cursor.next() which repeatedly calls getMore until at least one document is returned or until an error is reported (e.g. cursorNotFound). So if an application contains a thread that is blocking on a change stream cursor, and another thread that attempts to kill that cursor in order to unblock the first thread, the killCursors command can fail with the following error:
This is different from what's reported for a normal tailable cursor in a similar situation:
but the effect is the same: the cursor remains alive, and an application with a thread blocking on the cursor iteration may never exit. Drivers could work around this by checking, inside the inner loop that calls getMore, whether the application has at least attempted to close the cursor. That would catch most of the problems, but it is currently not specified behavior for all drivers, and it has the bad effect of leaving the change stream cursor open on the server until it times out. | ||||
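The driver-side workaround described in this comment could look roughly like the following Python sketch. It does not use any real driver API: `BlockingCursor`, `CursorClosed`, and the `get_more` callable are hypothetical stand-ins for a driver's cursor type and its getMore round trip.

```python
import threading


class CursorClosed(Exception):
    """Raised when next() notices that the application closed the cursor."""


class BlockingCursor:
    """Toy tailable cursor whose next() loops on getMore (hypothetical)."""

    def __init__(self, get_more):
        # get_more: callable returning a (possibly empty) list of documents.
        self._get_more = get_more
        self._closed = threading.Event()
        self._buffer = []

    def close(self):
        # Safe to call from another thread; only records the intent to close.
        self._closed.set()

    def next(self):
        while not self._buffer:
            # Check the close flag before every getMore so that a blocked
            # iteration exits even if the server-side kill did not succeed.
            if self._closed.is_set():
                raise CursorClosed("cursor was closed by the application")
            self._buffer.extend(self._get_more())
        return self._buffer.pop(0)
```

The flag only unblocks the iterating thread after the getMore in flight returns, and it does nothing for the server-side cursor, which stays open until it times out, as noted above.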
| Comment by David Storch [ 07/Nov/17 ] | ||||
|
jeff.yemin, I'm not sure I follow the relationship between this ticket and change streams, can you elaborate? Note that pinned cursors may be killed with killOp. Also note that we hope to improve cursor-killing behavior for 3.8. Our hope is that in 3.8 users will be able to preemptively kill all cursors belonging to a sharded operation using killSessions. | ||||
| Comment by Jeffrey Yemin [ 06/Nov/17 ] | ||||
|
This appears to be the root cause of | ||||
| Comment by Mathias Stearn [ 07/Feb/17 ] | ||||
|
This will be needed to support abandoning exhaust cursors without closing the connection. |