[SERVER-16642] Error repairing database Created: 23/Dec/14 Updated: 08/Jan/15 Resolved: 08/Jan/15 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Storage |
| Affects Version/s: | 2.4.9, 2.6.4 |
| Fix Version/s: | None |
| Type: | Question | Priority: | Major - P3 |
| Reporter: | carl dong | Assignee: | Ramon Fernandez Marina |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Participants: |
| Description |
|
Hi Team,

Could you advise how I can track down this issue? I am doing an incremental backup of the database, but it failed when fetching the oplog, so I ran the repair command, and that fails as well.

2014-12-23T13:53:44.007+0800 [FileAllocator] allocating new datafile /home/ISH/Data/_tmp_repairDatabase_0/local/local.41, filling with zeroes...

Thanks in advance,
Carl Dong |
| Comments |
| Comment by Ramon Fernandez Marina [ 08/Jan/15 ] |
|
Thanks for the update carl.dong@windfindtech.com, happy to hear that your replica set is working well again. I forgot to ask whether you would consider uploading your database files; in the absence of system logs pointing to storage issues, analyzing the database files may help determine how the BSON corruption happened. Note also that if this was a hardware issue and you didn't replace your storage, the issue may appear again. If that happens, feel free to re-open this ticket. Regards, |
| Comment by carl dong [ 07/Jan/15 ] |
|
I didn't find any disk error messages in my system log, so I removed all data files and synced from another node. Now my DB works fine. Thanks for your help. |
| Comment by Ramon Fernandez Marina [ 06/Jan/15 ] |
|
carl.dong@windfindtech.com, something that may help would be to search the system logs for error messages pointing to disk issues. That way we can be sure we've found the true cause of the problem. Can you please take a look at your system logs and post any error messages you may find? |
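For readers who want to automate the log search suggested above, here is a minimal sketch in Python. The pattern list is an assumption, not something from this ticket: it covers a few message fragments commonly seen in Linux kernel logs when storage misbehaves, and it should be adjusted for the actual kernel, filesystem, and controller. The function names are hypothetical.

```python
import re

# Illustrative patterns only (an assumption, not from the ticket); extend
# them for your own kernel, filesystem, and storage controller.
DISK_ERROR_PATTERNS = [
    r"I/O error",
    r"Medium Error",
    r"EXT4-fs error",
]
_DISK_ERROR_RE = re.compile(
    "|".join(f"(?:{p})" for p in DISK_ERROR_PATTERNS), re.IGNORECASE
)


def find_disk_errors(lines):
    """Return (line_number, text) pairs for lines matching a disk-error pattern."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if _DISK_ERROR_RE.search(line):
            hits.append((number, line.rstrip("\n")))
    return hits


def scan_log_file(path):
    """Scan one log file; undecodable bytes are replaced rather than raising."""
    with open(path, errors="replace") as handle:
        return find_disk_errors(handle)
```

On many Linux systems the file to pass to `scan_log_file` would be `/var/log/syslog` or `/var/log/messages`, but the location depends on the distribution. An empty result only means none of the listed patterns matched; it does not prove the storage is healthy.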
| Comment by Ramon Fernandez Marina [ 05/Jan/15 ] |
|
carl.dong@windfindtech.com, the error message is indicative of data corruption on disk, often caused by flaky storage. At this stage you may need to re-sync from a healthy node after making sure your storage is healthy. |
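A re-sync of a replica set member is usually done by stopping `mongod` on the affected node, clearing its data directory, and restarting it so that it performs an initial sync from the other members. The sketch below covers only the middle step, and it moves the files aside instead of deleting them so they remain available for later analysis. It assumes `mongod` is already stopped and that the backup directory lives outside the data directory; the function name and both paths are hypothetical.

```python
import shutil
from pathlib import Path


def set_aside_data_files(dbpath, backup_dir):
    """Move every entry under dbpath into backup_dir; return the moved names.

    Assumes mongod is stopped and backup_dir is outside dbpath.
    """
    source = Path(dbpath)
    target = Path(backup_dir)
    # Refuse to reuse an existing backup directory so that files from
    # different attempts are never mixed together.
    target.mkdir(parents=True, exist_ok=False)
    moved = []
    for entry in sorted(source.iterdir()):
        shutil.move(str(entry), str(target / entry.name))
        moved.append(entry.name)
    return moved
```

After the data directory is empty, restarting `mongod` with its usual replica set options lets the node copy its data from a healthy member. Check the storage first, as advised above, or the corruption may come back.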
| Comment by carl dong [ 04/Jan/15 ] |
|
Can anyone advise on this issue?