[SERVER-4005] killOp() doesn't always kill
Created: 03/Oct/11  Updated: 10/Aug/12  Resolved: 17/Dec/11

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug
Priority: Major - P3
Reporter: Kenny Gorman
Assignee: Unassigned
Resolution: Cannot Reproduce
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Operating System: ALL
Participants:

 Description   

When a server is under high load, a DBA may issue killOp() on one or more sessions to try to restore the overall health of the server. In some cases this is needed because the client has timed out on a query but the server is still running it. Thus, killOp() is a key tool in the DBA's hands. For instance, say a bad query is hitting the live DB: the DBA may killOp() that query over and over so that it does not block other queries while the index is created. We have a script that does just that.
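A minimal sketch of what one pass of such a script might look like, not the reporter's actual script. It assumes `db` exposes the shell's `db.currentOp()` and `db.killOp(opid)`; the function name `killLongQueries`, the `thresholdSecs` parameter, and the choice to target only `query` ops are hypothetical.

```javascript
// One pass of a hypothetical kill script.
// `db` is any object exposing the mongo shell's currentOp() and killOp(opid).
// thresholdSecs and the "query only" filter are illustrative assumptions.
function killLongQueries(db, thresholdSecs) {
  var inprog = db.currentOp().inprog;
  var killed = [];
  for (var i = 0; i < inprog.length; i++) {
    var op = inprog[i];
    // secs_running may be absent on ops that have not started running yet
    var secs = op.secs_running || 0;
    if (op.op === "query" && secs >= thresholdSecs) {
      db.killOp(op.opid);
      killed.push(op.opid);
    }
  }
  // Return the opids we asked the server to kill, for logging
  return killed;
}
```

In the shell this could be called repeatedly, for example `killLongQueries(db, 10)`, while the index build runs.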

However, sometimes when killOp() is run it doesn't actually kill the session, especially under high load and/or high queue activity. It appears that the killOp() gets queued along with other requests, and MongoDB gets to it when it can.

The killOp() operation must have a higher priority than the operations being killed or it's not an effective tool.
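One way to capture this behavior might be to issue the kill and then poll `db.currentOp()` to see whether the op is still listed. The sketch below assumes the same `db` interface as the shell. The name `killAndConfirm` and the `attempts` and `wait` parameters are hypothetical, and a `false` result only shows that the op was still listed after the given number of polls, not why.

```javascript
// Hypothetical helper: kill an op, then poll to see whether it goes away.
// `wait` is a caller-supplied pause between polls (an assumption, so the
// function does not depend on any particular sleep implementation).
function killAndConfirm(db, opid, attempts, wait) {
  db.killOp(opid);
  for (var i = 0; i < attempts; i++) {
    var inprog = db.currentOp().inprog;
    var stillListed = false;
    for (var j = 0; j < inprog.length; j++) {
      if (inprog[j].opid === opid) {
        stillListed = true;
        break;
      }
    }
    if (!stillListed) {
      return true; // op no longer appears in currentOp()
    }
    wait();
  }
  return false; // op was still listed after every poll
}
```

In the shell, `wait` could be something like `function () { sleep(500); }`.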

I am sorry I don't have logs or more definitive data surrounding the bug. If it's needed, please let me know and I can build a test case to illustrate the problem.



 Comments   
Comment by Vinaykr [ 10/Aug/12 ]

Following the index-specific issue here: SERVER-3067
thanks!

Comment by Ian Whalen (Inactive) [ 10/Aug/12 ]

vinaykr, could you please open a new ticket for this?

Comment by Vinaykr [ 09/Aug/12 ]

We experienced this today with 2.0.6. We wanted to kill an index creation op that was running in the foreground and basically starved everything else. We tried to kill it, but it never died. Can you please test this scenario for killing index ops?

Comment by Kenny Gorman [ 16/Dec/11 ]

In recent versions I have not been able to reproduce this error. I have written a kill agent and it appears to be killing stuff just fine.

You can close this ticket.
