[SERVER-11534] corruption of a secondary member Created: 01/Nov/13 Updated: 11/Jul/16 Resolved: 23/Feb/14 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Internal Code |
| Affects Version/s: | 2.4.6 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Blocker - P1 |
| Reporter: | Julien Bachmann | Assignee: | Bruce Lucas (Inactive) |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Environment: |
Ubuntu 13.04 running on an AWS instance, with data on an EBS volume formatted as ext4. |
||
| Attachments: |
|
| Operating System: | Linux |
| Participants: |
| Description |
|
I have a replica set with the primary running MongoDB 2.2, a secondary running 2.4.6, and an arbiter. The secondary just crashed and cannot restart due to corruption. Here is the error log:
|
| Comments |
| Comment by Bruce Lucas (Inactive) [ 23/Feb/14 ] | |
|
Hi Julien, As it has been a while since we've heard from you, I'll close this ticket. Please feel free to re-open it if/when you want to pursue a root cause analysis as outlined above. Thanks, | |
| Comment by Bruce Lucas (Inactive) [ 07/Jan/14 ] | |
|
Hi Julien, Happy New Year! Just checking in with you again to find out if you are still interested in pursuing a root cause for the corruption you saw. Thanks, | |
| Comment by Bruce Lucas (Inactive) [ 24/Dec/13 ] | |
|
Hi Julien, Sorry for the delay in responding. Are you still interested in working with us to investigate the problem? To further the investigation we could look at any of the following that you can provide us:
We can provide a secure, private location for uploading this information; just let us know if you want to proceed. The messages are consistent with the location that the _id index points to containing zeros instead of the expected data. One possible cause that we can investigate is whether an I/O error prevented the data from being written. The logs and db files might contain evidence of such an error or, if not, may suggest a different cause. Happy Holidays! | |
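As a rough illustration of what "containing zeros instead of the expected data" means at the file level, the sketch below scans a file and reports runs of all-zero blocks. It is not part of the ticket and not a MongoDB tool: the function name `find_zero_runs` and the 4096-byte block size are assumptions made for illustration. Unused, preallocated space in a data file is normally zero-filled as well, so a reported run is only a hint; it would still have to be matched against the record location named in the error messages.

```python
def find_zero_runs(path, block_size=4096):
    """Return (offset, length) pairs for runs of all-zero blocks in a file.

    Hypothetical helper: the block size is an assumed value, not one taken
    from the ticket or from MongoDB.
    """
    runs = []
    run_start = None          # offset where the current zero run began
    offset = 0                # offset of the block being examined
    zero_block = bytes(block_size)
    with open(path, "rb") as f:
        while True:
            chunk = f.read(block_size)
            if not chunk:
                break
            # The final block may be short, so compare against a same-length slice.
            if chunk == zero_block[:len(chunk)]:
                if run_start is None:
                    run_start = offset
            elif run_start is not None:
                runs.append((run_start, offset - run_start))
                run_start = None
            offset += len(chunk)
    # Close a run that extends to the end of the file.
    if run_start is not None:
        runs.append((run_start, offset - run_start))
    return runs
```

Run against a copy of a data file rather than the live one, so the scan cannot interfere with a running mongod.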
| Comment by Julien Bachmann [ 05/Nov/13 ] | |
|
I can try a repairDatabase, but that is not the point. I already have a new secondary that I added from a previous backup, so that part is fine. My concern is that I don't want this issue to happen again, as it will require work to repair the problem every time. I think I should be able to delete a lot of data from a database without any problem. Also, I do not create indexes that point anywhere; that is handled by mongod. So to me this problem requires a fix on your side. | |
| Comment by Ranjay Krishna [ 05/Nov/13 ] | |
|
This problem might occur if a large amount of data was recently deleted or if your index is pointing to a location with no object in it. Please run:
on the secondary and let us know if you still encounter the problem. | |
| Comment by Julien Bachmann [ 04/Nov/13 ] | |
|
I just uploaded the logs of the primary and the secondary. The primary log shows that it issued the following message:
and on the secondary the crash log says:
They look related, but I am not sure. The primary's build info is:
And the secondary's build info is:
The secondary was a fresh install, done by following these instructions: | |
| Comment by Julien Bachmann [ 04/Nov/13 ] | |
|
The logs of the secondary when it failed. | |
| Comment by Julien Bachmann [ 04/Nov/13 ] | |
|
The logs of the primary when the secondary failed. | |
| Comment by Eliot Horowitz (Inactive) [ 02/Nov/13 ] | |
|
Can you send the logs from the primary and secondary at this time? |