[SERVER-1341] Rebuilding a large index completely freezes the server and can't be killed Created: 01/Jul/10  Updated: 30/Mar/12  Resolved: 01/Jul/10

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: 1.4.2
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Adam Fields Assignee: Eliot Horowitz (Inactive)
Resolution: Duplicate Votes: 2
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

linux 64-bit


Issue Links:
Depends
depends on SERVER-679 Authentication should be non-blocking Closed
depends on SERVER-787 allow reIndex() to have option for ba... Closed
Operating System: ALL
Participants:

 Description   

If an index rebuild is kicked off (this happened accidentally to us), it consumes all resources on the server and blocks all other processes, including new connections to the server. This means that if this happens and you don't already have an open shell connection to the server, there's no way to identify the offending op and kill it.

It should always be possible to log into the server for admin maintenance.
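For reference, here is a minimal sketch of the "identify the offending op" step described above, assuming an admin connection is still available. In the mongo shell this is normally done with db.currentOp() followed by db.killOp(opid). The Python below only filters a document shaped like currentOp output. The field names (inprog, opid, ns, msg), the heuristic for spotting an index build, and the sample data are illustrative assumptions, and the exact output varies by server version.

```python
def find_index_builds(current_op):
    """Return (opid, ns, msg) for entries that look like index builds.

    `current_op` is assumed to be a dict shaped like db.currentOp() output,
    i.e. {"inprog": [ {...}, ... ]}. Field names are assumptions.
    """
    builds = []
    for op in current_op.get("inprog", []):
        ns = str(op.get("ns", ""))
        msg = str(op.get("msg", ""))
        # Heuristic (assumption): an index build touches <db>.system.indexes
        # or reports progress in a message that mentions "index".
        if ns.endswith(".system.indexes") or "index" in msg.lower():
            builds.append((op.get("opid"), ns, msg))
    return builds


# Invented sample data, for illustration only.
sample = {
    "inprog": [
        {"opid": 101, "op": "query", "ns": "app.users", "secs_running": 0},
        {"opid": 102, "op": "insert", "ns": "app.system.indexes",
         "secs_running": 840, "msg": "index: (1/3) external sort"},
    ]
}

for opid, ns, msg in find_index_builds(sample):
    print(f"candidate to kill: opid={opid} ns={ns} msg={msg!r}")
```

The opid reported for a matching entry is the value that would be passed to db.killOp(). None of this helps if no connection can be opened in the first place, which is the problem this ticket reports.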



 Comments   
Comment by Eliot Horowitz (Inactive) [ 01/Jul/10 ]

The authentication issue is known, and we'll be working on a fix.

Mongoid needs to be a bit more explicit with those kinds of things, or at least create indexes in the background, perhaps.
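As a rough illustration of the background-index suggestion: Mongoid is a Ruby library, so this Python sketch is a stand-in, not Mongoid's actual behavior. It assumes a collection object with a PyMongo-style create_index(keys, background=True) method. The helper name create_indexes_in_background and the index specs are hypothetical.

```python
def create_indexes_in_background(collection, index_specs):
    """Create each index in `index_specs` with background=True.

    `collection` is assumed to expose a PyMongo-style
    create_index(keys, **options) method; `index_specs` is a list of
    key lists such as [("email", 1)]. Both are illustrative assumptions.
    """
    created = []
    for keys in index_specs:
        # background=True asks the server to build the index without
        # blocking other operations on the database for the whole build.
        name = collection.create_index(keys, background=True)
        created.append(name)
    return created
```

With PyMongo, a call might look like create_indexes_in_background(client["app"]["users"], [[("email", 1)]]), where client, the database name, and the field are placeholders. A background build is typically slower than a foreground one, but it would have left the server reachable in the situation described in this ticket.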

Comment by Adam Fields [ 01/Jul/10 ]

I think this is a failing of the intersection between the way MongoDB deals with index creation and the way that the ORMs (Mongoid, in this case) do. On startup, Mongoid creates all of the indexes listed in the models, and normally these either exist already or get created on the spot. In this case, we'd removed an index, but the revised Mongoid model without it hadn't yet been deployed to the app server. One of those app processes inadvertently got restarted (probably by monit), which kicked off an index rebuild.

We are using authentication, and we were not able to log in after the index build started. This may be a factor of the size of the collection.

Comment by Eliot Horowitz (Inactive) [ 01/Jul/10 ]

How could an index rebuild happen accidentally?

You should be able to log in no matter what.

Are you using authentication?
