[SERVER-20986] ***aborting after fassert() failure Created: 26/May/15 Updated: 16/Nov/21 Resolved: 16/Oct/15
| Status: | Closed |
| Project: | Core Server |
| Component/s: | WiredTiger |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Marcos Fernándex | Assignee: | Sam Kleinman (Inactive) |
| Resolution: | Incomplete | Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Environment: |
uname -a
mongo --version
Same crash as with 3.0.0; I updated in order to test whether a later version could recover it. |
| Issue Links: | |
| Operating System: | ALL |
| Participants: | |
| Description |
|
| Comments |
| Comment by Ramon Fernandez Marina [ 16/Oct/15 ] | |
|
sombra2eternity, we haven't heard back from you since Sam's last comment above, so I'm going to close this ticket. In case this is still an issue for you, I've created an upload portal so you can send us the data Sam requested above. Files can't be larger than 5GB, so you'll need to split anything bigger than that (one way to do the splitting is sketched below). Alternatively, you could upload just the WiredTiger.wt and WiredTiger.turtle files and we can attempt to repair those first. Thanks, |
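For illustration, a minimal sketch of the splitting step mentioned above; the archive name, dbpath, and chunk size are assumptions, not details from the ticket:

```
# Archive the data directory (path is hypothetical)
tar czf dbpath.tar.gz /var/lib/mongodb

# Split into 4GB chunks so each piece stays under the 5GB portal limit
split -b 4G dbpath.tar.gz dbpath.tar.gz.part-

# The receiving side can reassemble the archive with:
cat dbpath.tar.gz.part-* > dbpath.tar.gz
```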
| Comment by Sam Kleinman (Inactive) [ 27/May/15 ] | |
|
Again, I want to reiterate how sorry we are that you've run into this.

While sharded deployments are not required, we do recommend a replica set as the basis of all production deployments: having an additional copy of your data provides extra assurance against machine failure, data corruption, and storage system errors. While ideally no one will run into bugs like this in the compression system, there are some classes of issues that only replication can protect you from. (A minimal sketch of such a setup follows this comment.)

Moving to 3.0.3 was definitely the correct move. The issue that you're seeing now with 3.0.3 is still a byproduct of the same underlying corruption.

We're going to convert this issue to a ticket in the SUPPORT project, which is not publicly accessible, so we can continue to work on this issue directly. If you upload the invalid collection file[s] to the following endpoint we can attempt a manual salvage operation. Use the following method to upload your files:
Thanks for your patience, | |
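A hedged sketch of the replica-set setup recommended in the comment above; the replica set name, ports, and data paths are assumptions for illustration only:

```
# Start three mongod instances that share a replica set name (ports/paths hypothetical)
mongod --replSet rs0 --port 27017 --dbpath /data/rs0-0 --fork --logpath /data/rs0-0/mongod.log
mongod --replSet rs0 --port 27018 --dbpath /data/rs0-1 --fork --logpath /data/rs0-1/mongod.log
mongod --replSet rs0 --port 27019 --dbpath /data/rs0-2 --fork --logpath /data/rs0-2/mongod.log

# Initiate the set from one member's shell
mongo --port 27017 --eval 'rs.initiate({
  _id: "rs0",
  members: [
    { _id: 0, host: "localhost:27017" },
    { _id: 1, host: "localhost:27018" },
    { _id: 2, host: "localhost:27019" }
  ]
})'
```

Even two data-bearing members plus an arbiter would give a second copy of the data to resync from after corruption like this.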
| Comment by Marcos Fernándex [ 27/May/15 ] | |
|
I'm not sure this should be closed. In my report I set the MongoDB version to 3.0.3 because the failure happened on 3.0.0, but when I found the crash the first thing I did was update mongo to the latest version. I then found that once this (maybe patched) error broke the database, even version 3.0.3 with the --repair flag crashed the same way 3.0.0 did. So this led me to the conclusion that if the database breaks again in the same way, the repair tool will again be unable to even start.

My fix to recover the database was to assume I had lost the file that was crashing, so I renamed it and called repair again. This time mongo crashed in a different way (due to file not found, and far from a clean exit, it crashed), but before this last (file not found) crash it appears it had processed almost all of the files. Then, frustrated, I renamed the problematic file again and launched repair one last time; this time the first crash disappeared and it was able to process all files, fortunately. (A sketch of this workaround appears after this comment.)

From my point of view there are 2 issues unresolved:
For me a crash is unacceptable; if the program exits cleanly with a "sorry, I was unable to recover collection x, I will recover the rest" it's a different story, but that was not the case.

Regarding your recommendation to get v3.0.3: that was the first thing I did, and it was not the solution. Now the database is running on this version, which is not a stable release, and that scares me for a production server.

I have considered getting a replica many times, but it costs money I do not currently have. I make backups from time to time, but I expect mongo to be at least stable enough not to produce corruption by itself; I have the "ensure write" flags activated all over the place.

I'm very happy with how mongo works, and it is sad if it cannot be used in low-end environments for simple tasks without 'x' shards and 'y' replicas. This is my second (and different) crash report, and I'm starting to think the bad name mongo has is a bit justified.

I have a copy (15GB) of the broken and unrepaired database if you want to take a look. Thanks for your time. |
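A minimal sketch of the rename-and-repair workaround described above; the dbpath and the collection file name are assumptions, not values from the ticket:

```
# With mongod stopped, set the corrupted collection file aside (names hypothetical)
mv /var/lib/mongodb/collection-7--123456789.wt \
   /var/lib/mongodb/collection-7--123456789.wt.broken

# Re-run repair against the data directory; mongod exits when repair completes
mongod --dbpath /var/lib/mongodb --repair

# If repair now succeeds, restart normally
mongod --config /etc/mongod.conf
```

The collection stored in the renamed file is lost, which is exactly the "recover the rest" behaviour the reporter argues repair should offer without crashing.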
| Comment by Sam Kleinman (Inactive) [ 27/May/15 ] | |
|
It looks like you've run into an instance of the zlib compression bug mentioned in my earlier comment.

Regards, |
| Comment by Marcos Fernándex [ 26/May/15 ] | |
|
config (markdown broke it in the last comment):

Hint: I think it's a shame mongodump stopped supporting --dbpath in 3.0, because if the server crashes (like in this case) you have absolutely no way to dump the non-corrupted data. |
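Since 3.0, mongodump only works against a running mongod, so the usual substitute for --dbpath is to start a throwaway mongod on a copy of the data files and dump over the network. A hedged sketch (paths and port are assumptions), which of course only helps if mongod can start on those files at all, which was precisely the problem here:

```
# Work on a copy so the broken originals are preserved (paths hypothetical)
cp -a /var/lib/mongodb /tmp/dbcopy

# Start a temporary mongod on the copy, on a side port
mongod --dbpath /tmp/dbcopy --port 27117 --fork --logpath /tmp/dbcopy/mongod.log

# Dump whatever is still readable
mongodump --port 27117 --out /tmp/dump
```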
| Comment by Marcos Fernándex [ 26/May/15 ] | |
|
I was using 3.0.0 stable. config:

root@ns364134:~# cat /etc/mongod.conf
It's not the case; I'm running a single-node config. Mongo should be reliable enough, imho. |
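For context, a 3.0-era mongod.conf enabling WiredTiger with zlib block compression (the code path implicated in this ticket) might look like the following. This is a hypothetical reconstruction, not the reporter's actual file:

```
# Write a hypothetical 3.0-era config (not the reporter's actual /etc/mongod.conf)
cat > /etc/mongod.conf <<'EOF'
storage:
  dbPath: /var/lib/mongodb
  journal:
    enabled: true
  engine: wiredTiger
  wiredTiger:
    collectionConfig:
      blockCompressor: zlib   # zlib block compression, as discussed in this ticket
systemLog:
  destination: file
  path: /var/log/mongodb/mongod.log
  logAppend: true
EOF
```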
| Comment by Sam Kleinman (Inactive) [ 26/May/15 ] | |
|
Hello,

This looks like it might be a case of a zlib compression bug that has since been resolved.
If you were running as a member of a replica set, you could resync this member from another member of the replica set to get a valid copy of the data set (a sketch of that procedure follows). Cheers, |
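A hedged sketch of the resync mentioned above, assuming the other members of the set still hold a valid copy; the service name, paths, and user are assumptions:

```
# On the affected member only
service mongod stop                         # stop the broken member (init system varies)
mv /var/lib/mongodb /var/lib/mongodb.bak    # set the corrupt files aside, don't delete yet
mkdir -p /var/lib/mongodb
chown mongodb:mongodb /var/lib/mongodb      # ownership assumption: a 'mongodb' service user
service mongod start                        # the empty member performs an initial sync
```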