[SERVER-9330] Error 10092 during initial sync Created: 11/Apr/13 Updated: 10/Dec/14 Resolved: 05/Mar/14 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Replication |
| Affects Version/s: | 2.4.1 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Noah Davis | Assignee: | Unassigned |
| Resolution: | Incomplete | Votes: | 0 |
| Labels: | crash, replica, replicaset |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Environment: | GNU/Linux |
| Attachments: |
|
| Operating System: | ALL |
| Participants: |
| Description |
|
During our initial sync, our mongodb instance acting as a replica crashed. I've attached the result of running "rs.conf()" and "rs.status()" on the master as well as the full mongod.log from the instance that crashed. |
| Comments |
| Comment by Tyler Brock [ 24/Apr/13 ] |
|
Bryan, that's perfectly fine to do, as long as you are sure you haven't introduced any duplicate _id values in some way since killing the compact (dropDups: true will drop them). I would also add background: true so that the index build doesn't block other database activity. If you can, it would be advisable to run compact to completion at some point, given this other note in the documentation: "Much of the existing free space in the collection may become un-reusable. In this scenario, you should rerun the compaction to completion to restore the use of this free space." |
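The rebuild discussed in this thread could be wrapped in a small mongo shell helper. This is only a minimal sketch, assuming a 2.4-era shell where `ensureIndex` and the `dropDups` option are available. The helper name `rebuildIdIndex` is hypothetical, and `source_files` is the collection named elsewhere in this ticket.

```javascript
// Hypothetical helper (not part of the mongo shell): rebuilds the _id index
// with the options discussed in this thread.
// WARNING: dropDups: true deletes documents whose _id duplicates another's.
function rebuildIdIndex(coll) {
  var keys = { _id: 1 };
  var options = {
    unique: true,     // _id values must be unique
    dropDups: true,   // drop documents with a duplicate _id during the build
    background: true  // avoid blocking other database activity
  };
  // Pass through whatever the collection's ensureIndex returns.
  return coll.ensureIndex(keys, options);
}

// In the mongo shell (assumed usage): rebuildIdIndex(db.source_files)
```

Because `dropDups` is destructive, it seems safer to confirm that no unexpected duplicates exist before running this against the primary.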
| Comment by Bryan Helmkamp [ 18/Apr/13 ] |
|
I think I remember now what happened to this collection, as the _id index is not supposed to be removable (according to http://docs.mongodb.org/manual/core/indexes/). IIRC I ran a compact (http://docs.mongodb.org/manual/reference/command/compact/) operation against this collection at one point but aborted using db.killOp(). The compact docs note: "If you terminate the operation with the db.killOp() [...] You may have to manually rebuild the indexes." Should we just run ensureIndex({_id: 1}, {unique: true, dropDups: true})? Or is that not a good idea because the _id index is special? |
| Comment by Bryan Helmkamp [ 16/Apr/13 ] |
|
You'll notice the source_files collection is missing an index on _id. This is not intentional – I'm not sure how it got into that state. I vaguely remember now that at one point I may have accidentally deleted it. (Fortunately, we don't really look up source_files by _id.) The _ids are ObjectIds. Do we know if the dupes issue is with _id or the other index (which is a unique index)? Is there a way we can fix the source_files collection on the primary in order to make it sync-able? |
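One way to look into both questions from the shell is sketched below. It assumes index metadata shaped like the array returned by `db.source_files.getIndexes()`. The helper names `hasIdIndex` and `duplicateValues` are hypothetical. `duplicateValues` scans an in-memory array, so it only suits a sample of documents, not a full large collection.

```javascript
// Hypothetical helpers, not part of the mongo shell.

// `indexes` is expected to look like getIndexes() output:
// [{ name: "_id_", key: { _id: 1 } }, ...]
function hasIdIndex(indexes) {
  return indexes.some(function (ix) {
    var fields = Object.keys(ix.key);
    // True only for a single-field index on _id.
    return fields.length === 1 && fields[0] === "_id";
  });
}

// Returns the (stringified) values of `field` that occur more than once
// in the in-memory array `docs`.
function duplicateValues(docs, field) {
  var counts = {};
  docs.forEach(function (doc) {
    var k = String(doc[field]);
    counts[k] = (counts[k] || 0) + 1;
  });
  return Object.keys(counts).filter(function (k) {
    return counts[k] > 1;
  });
}

// In the mongo shell (assumed usage, on a limited sample):
//   hasIdIndex(db.source_files.getIndexes())
//   duplicateValues(db.source_files.find().limit(1000).toArray(), "_id")
```

Running `duplicateValues` once with `"_id"` and once with the field behind the other unique index would indicate which of the two carries duplicates in the sampled documents.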
| Comment by Bryan Helmkamp [ 16/Apr/13 ] |
|
Thanks, Stephen. |
| Comment by Stennie Steneker (Inactive) [ 16/Apr/13 ] |
|
Hi Noah,
Based on the mongod.log, it appears that the last action underway was an index build:
... which encountered an exception:
... followed by a fatal assertion while trying to restart the initial sync.
I noticed there were several attempts to sync in the logs (including one where mongod ran out of available disk space). Did you remove the files in the data directory after the failed initial sync attempts (i.e. before attempting the upgrade again)?
Can you provide some more background on this replica set:
Thanks, |