[SERVER-3067] Can't kill indexing operations Created: 09/May/11 Updated: 28/Oct/15 Resolved: 09/Nov/12 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Index Maintenance |
| Affects Version/s: | 1.8.0 |
| Fix Version/s: | 2.3.1 |
| Type: | Bug | Priority: | Critical - P2 |
| Reporter: | Aaron Westendorf | Assignee: | Aaron Staple |
| Resolution: | Done | Votes: | 12 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Environment: | Ubuntu 10.04.2 |
| Operating System: | ALL |
| Description |
|
Overview: This ticket adds support for using killop to interrupt a client initiated foreground index build in progress. When the index build is killed, all resources related to the index are cleaned up and the index is removed from system.indexes. An error response is provided on the connection that initiated the index build. A foreground index build will similarly be killed if mongod is shut down while the build is in progress.

Index builds that are not directly initiated by an external client cannot be interrupted in this manner. For example, map reduce and reindex (as well as other commands) build indexes as part of their internal implementations and cannot be interrupted currently.

Aaron

----------------------

I accidentally started a foreground index build on 300 million records. I followed the documentation and tried to kill it, but was unsuccessful. This resulted in significant downtime.

http://www.mongodb.org/display/DOCS/Viewing+and+Terminating+Current+Operation

{ },

I issued the command:

The console printed out that it was trying to kill the process, but it never did. Once the master finally completed, after over an hour, we had to wait for the replica set slaves to also build the index. The overall outage window lasted several hours but could have been avoided if I could kill the job. |
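
As a rough illustration of the kill workflow described above, the sketch below picks index-build candidates out of the `inprog` array returned by the shell's `db.currentOp()` so that their opids can be passed to `db.killOp()`. The helper name `findIndexBuildOpIds` and the matching rules (namespace ending in `system.indexes`, or a progress message mentioning "index") are assumptions made for illustration; the exact shape of the reported operations may differ between server versions.

```javascript
// Hypothetical helper: select operations that look like index builds from
// the `inprog` array of a db.currentOp() result. The matching rules are
// assumptions about how such operations are reported, not a documented contract.
function findIndexBuildOpIds(inprog) {
  return inprog
    .filter(function (op) {
      var ns = op.ns || "";
      var msg = op.msg || "";
      // Assumption: a client-initiated build appears as a write to
      // <db>.system.indexes, or carries a progress message mentioning "index".
      var targetsSystemIndexes = /\.system\.indexes$/.test(ns);
      var hasIndexMessage = msg.indexOf("index") !== -1;
      return targetsSystemIndexes || hasIndexMessage;
    })
    .map(function (op) {
      return op.opid;
    });
}

// In the mongo shell, usage would look roughly like:
//   findIndexBuildOpIds(db.currentOp().inprog).forEach(function (id) {
//     db.killOp(id);
//   });
```

Reviewing the selected opids before killing them is advisable, since the filter is only a heuristic.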
| Comments |
| Comment by Jalmari Raippalinna [ 19/Nov/13 ] |
|
Just happened to browse by this issue, and had to pitch in. We killed a background indexing task about a month ago, and this corrupted the database with an invalid BSONSize error. I cannot find any logs about that anymore, but just a note that it can happen. The collection already had existing sparse & unique indexes, and this was an additional index on the collection. |
| Comment by auto [ 09/Nov/12 ] |
|
Author: {u'date': u'2012-11-09T04:51:57Z', u'email': u'aaron@10gen.com', u'name': u'Aaron'}Message: |
| Comment by auto [ 09/Nov/12 ] |
|
Author: {u'date': u'2012-11-09T03:15:03Z', u'email': u'aaron@10gen.com', u'name': u'Aaron'}Message: |
| Comment by auto [ 09/Nov/12 ] |
|
Author: {u'date': u'2012-11-09T02:51:33Z', u'email': u'aaron@10gen.com', u'name': u'Aaron'}Message: |
| Comment by auto [ 09/Nov/12 ] |
|
Author: {u'date': u'2012-11-09T01:19:32Z', u'email': u'aaron@10gen.com', u'name': u'Aaron'}Message: |
| Comment by auto [ 09/Nov/12 ] |
|
Author: {u'date': u'2012-10-25T21:07:23Z', u'email': u'aaron@10gen.com', u'name': u'Aaron'}Message: |
| Comment by auto [ 24/Oct/12 ] |
|
Author: {u'date': u'2012-10-23T18:19:29-07:00', u'email': u'aaron@10gen.com', u'name': u'Aaron'}Message: |
| Comment by Vinaykr [ 09/Aug/12 ] |
|
We had a similar issue where foreground index creation froze the entire cluster and starved all other read/write ops. It should not be so easy to get into this state. The default option for index creation should be changed to "background always" because for any reasonably sized production deployment, foreground index creation never makes any sense. If it's a small cluster/db then even background index creation will complete quickly. So, I think the default doesn't make sense. |
| Comment by Colin Howe [ 18/Jul/12 ] |
|
Hi Ian, I'd like to stress how important this is for us. We can have all the replication under the sun, all the multi-DC writes you can shake a stick at... but it is all for nought if I can accidentally lock out an entire replication set by an accidental omission of background: true or a migration accidentally not performed in office hours... The fact it then also replicates the index build command to replicas just makes this even worse. This, or the suggestion of preventing foreground index builds, would go a long way to making us believe that MongoDB is a robust solution. Thanks, |
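
One way to guard against the accidental omission of `background: true` mentioned above is a small wrapper that defaults to a background build. This is only a sketch: `ensureIndexInBackground` is a hypothetical helper, and it assumes the collection object exposes the shell's `ensureIndex(keys, options)` method.

```javascript
// Hypothetical wrapper: request a background build unless the caller
// explicitly passes background: false.
function ensureIndexInBackground(collection, keys, options) {
  var opts = { background: true };
  var source = options || {};
  // Copy caller-supplied options over the default.
  for (var name in source) {
    if (Object.prototype.hasOwnProperty.call(source, name)) {
      opts[name] = source[name];
    }
  }
  return collection.ensureIndex(keys, opts);
}

// In the mongo shell (collection and field names are placeholders):
//   ensureIndexInBackground(db.events, { userId: 1 });
```

Routing migrations through a wrapper like this makes a foreground build an explicit choice rather than the result of a missing option.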
| Comment by Ian Whalen (Inactive) [ 17/Jul/12 ] |
|
Hi all, we're still in the planning stages for 2.4 and will evaluate getting this in, although I can't guarantee it will make it. |
| Comment by Bar Ziony [ 17/Jul/12 ] |
|
Are there any updates from 10gen people about this? Thanks! |
| Comment by bahadir cambel [ 13/May/12 ] |
|
db version v2.0.5

I've also realized that you cannot even shut down the server to kill the indexing operation.

db.shutdownServer()
Sun May 13 00:57:58 uncaught exception: assert failed : unexpected error: "shutdownServer failed: db assertion failure" |
| Comment by Rafael Calsaverini [ 03/May/12 ] |
|
I've got the same problem in 1.8.2. Luckily it wasn't in a production environment, but it blocked my work for a couple of hours. I have two questions: 1) Is this corrected in newer versions? |
| Comment by Colin Howe [ 15/Mar/12 ] |
|
Hey, getting this in would be amazing. It's an absolute killer if someone accidentally (in code or manually) creates a foreground index and brings down an entire site without any option apart from a failover. |
| Comment by Zac Witte [ 01/Mar/12 ] |
|
Any updates to this? I still have the problem in 2.0.2.

Thu Mar 1 10:33:18 [initandlisten] connection accepted from 127.0.0.1:38442 #3 |
| Comment by Aaron Westendorf [ 16/May/11 ] |
|
I tested background indexing and found the following behavior:
From this, I think the lock that an index build takes on the collection prevents other operations from running, possibly including the kill operation. |