[SERVER-1341] Rebuilding a large index completely freezes the server and can't be killed Created: 01/Jul/10 Updated: 30/Mar/12 Resolved: 01/Jul/10 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | 1.4.2 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Adam Fields | Assignee: | Eliot Horowitz (Inactive) |
| Resolution: | Duplicate | Votes: | 2 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Environment: | Linux 64-bit |
| Operating System: | ALL | ||||||||||||
| Description |
|
If an index rebuild is kicked off (this happened to us accidentally), it consumes all resources on the server and blocks all other operations, including new connections to the server. This means that if it happens and you don't already have an open shell connection to the server, there is no way to identify the offending op and kill it. It should always be possible to log into the server for admin maintenance. |
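To illustrate the "identify the offending op and kill it" step, here is a minimal sketch in Python. It assumes a connection is available at all and that the result of `db.currentOp()` has been obtained as a dictionary. The field names (`inprog`, `opid`, `active`, `ns`, `secs_running`) follow the general shape of that output but may differ between server versions. The function name, the threshold, and the sample data are hypothetical.

```python
def find_long_running_ops(current_op_result, min_secs=60):
    """Return (opid, namespace, seconds) for active ops running >= min_secs.

    current_op_result is assumed to look like the output of db.currentOp():
    a dict with an "inprog" list of per-operation documents.
    """
    hits = []
    for op in current_op_result.get("inprog", []):
        secs = op.get("secs_running", 0)
        if op.get("active") and secs >= min_secs:
            hits.append((op.get("opid"), op.get("ns"), secs))
    # Longest-running operations first, since those are the likely culprits.
    hits.sort(key=lambda hit: hit[2], reverse=True)
    return hits


# Hypothetical sample data, not taken from a real server.
sample = {
    "inprog": [
        {"opid": 7, "active": True, "ns": "app.system.indexes", "secs_running": 900},
        {"opid": 9, "active": True, "ns": "app.users", "secs_running": 2},
    ]
}

for opid, ns, secs in find_long_running_ops(sample):
    print(opid, ns, secs)
```

An opid found this way could then be passed to `db.killOp(opid)` from the mongo shell.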
| Comments |
| Comment by Eliot Horowitz (Inactive) [ 01/Jul/10 ] |
|
The authentication problem is known, and we'll be working on a fix. Mongoid needs to be a bit more explicit about those kinds of things, or at least create indexes in the background. |
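As a rough sketch of the "create indexes in the background" suggestion, the helper below requests background builds for a list of index specifications. It assumes `collection` is a PyMongo-style collection object whose `create_index` method accepts a `background` option. The helper name and the example field names are hypothetical, and whether the server honors the option depends on its version.

```python
def ensure_background_indexes(collection, index_specs):
    """Request a background build for each index spec and return the index names.

    collection is assumed to expose create_index(keys, **options), as PyMongo
    collections do. Each spec is a list of (field, direction) pairs, where a
    direction of 1 means ascending.
    """
    created = []
    for keys in index_specs:
        # background=True asks the server not to block other operations
        # on the database while the index is being built.
        created.append(collection.create_index(keys, background=True))
    return created
```

With a real PyMongo collection this might be called as `ensure_background_indexes(db.users, [[("email", 1)]])`, where `db.users` is a placeholder for the application's own collection.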
| Comment by Adam Fields [ 01/Jul/10 ] |
|
I think this is a failing at the intersection between the way MongoDB deals with index creation and the way the ORMs (Mongoid, in this case) do. On startup, Mongoid creates all of the indexes listed in the models, and normally these either exist already or get created on the spot. In this case, we had removed an index, the revised Mongoid model without it hadn't yet been deployed to the app server, and one of those processes inadvertently got restarted (probably by monit), which kicked off an index rebuild. We are using authentication, and we were not able to log in after the index build started. This may be a factor of the size of the collection. |
| Comment by Eliot Horowitz (Inactive) [ 01/Jul/10 ] |
|
How could an index rebuild happen accidentally? You should be able to log in no matter what. Are you using authentication?