[SERVER-32199] Fatal Assertion 34433 Created: 07/Dec/17 Updated: 09/Mar/18 Resolved: 07/Dec/17 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | WiredTiger |
| Affects Version/s: | 3.4.7 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Arnold Ligtvoet | Assignee: | Mark Agarunov |
| Resolution: | Done | Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Attachments: |
|
| Operating System: | ALL |
| Participants: |
| Description |
|
When I checked the logfiles on my primary server, I noticed errors saying that the third server in my three-server cluster could not be reached. On that server I found that mongod was not running, so I tried to restart the service and got fatal assertion 34433. This server runs as an arbiter only, so I am a bit surprised to see this WiredTiger error on this machine. The output of /var/log/mongodb/mongod.log: /var/log/mongodb/mongod.log
Running 'mongod --dbpath /var/lib/mongodb --repair' resulted in an aborted repair, but after restarting the mongo service the fatal assertion changed to 28578. /var/log/mongodb/mongod.log after --repair
After removing the '/tmp/mongodb-27017.sock' file and chowning all files in '/var/lib/mongodb' to 'mongodb:mongodb', I am back to the 34433 error when restarting the service. The only edit I made to the logfile output was to remove my external IP address. |
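The ownership fix described above can be checked before restarting the service. The following is a minimal sketch, not part of the original report: the helper names `files_not_owned_by` and `uid_for` are hypothetical, and the sketch assumes that the service account is `mongodb` and the dbpath is `/var/lib/mongodb`, as in the commands quoted in the description.

```python
import os
import pwd
from pathlib import Path


def uid_for(username):
    """Look up the numeric uid for a local account name (hypothetical helper)."""
    return pwd.getpwnam(username).pw_uid


def files_not_owned_by(dbpath, uid):
    """Return every path under dbpath (and dbpath itself) not owned by uid."""
    root = Path(dbpath)
    mismatched = []
    for path in [root, *root.rglob("*")]:
        # lstat reports the owner of a symlink itself rather than its target.
        if path.lstat().st_uid != uid:
            mismatched.append(path)
    return sorted(mismatched)
```

Calling `files_not_owned_by("/var/lib/mongodb", uid_for("mongodb"))` on the affected host would list any file that the chown missed. An empty list only rules out an ownership problem; it says nothing about whether the WiredTiger metadata itself is intact.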
| Comments |
| Comment by Mark Agarunov [ 08/Dec/17 ] | ||||||||||
|
Hello aligtvoet, I'm glad to hear you fixed the issue. While arbiters generally do not contain data, they will still attempt to load data in the dbpath if the node contained data in the past (was a secondary or primary at some point) and/or the admin database if it was upgraded from a version earlier than 3.2. The following log line suggests that this node is attempting to load data from the dbpath:
Thanks, | ||||||||||
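One way to see what an arbiter would try to load from its dbpath is to list the WiredTiger table files it contains. The sketch below is an illustration only and was not part of this thread: the function name `data_files_by_size` is hypothetical, and it assumes MongoDB's default file naming for the WiredTiger engine (`collection-*.wt` and `index-*.wt`).

```python
from pathlib import Path

# Assumption: default WiredTiger file naming used by MongoDB.
DATA_FILE_PATTERNS = ("collection-*.wt", "index-*.wt")


def data_files_by_size(dbpath):
    """Return (size_in_bytes, relative_path) pairs, largest file first."""
    root = Path(dbpath)
    files = []
    for pattern in DATA_FILE_PATTERNS:
        for path in root.rglob(pattern):
            files.append((path.stat().st_size, path.relative_to(root).as_posix()))
    # Sort by descending size, then by name so the order is deterministic.
    files.sort(key=lambda item: (-item[0], item[1]))
    return files
```

An arbiter normally still keeps a few small files for its `local` database, so the presence of such files alone proves nothing. Unexpectedly large files would be consistent with the node having held data as a primary or secondary in the past.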
| Comment by Arnold Ligtvoet [ 08/Dec/17 ] | ||||||||||
|
Hi Mark, thanks for all your help. In the end I did a full resync (which was quick given the role of the server). I'm trying to figure out what happened to the server as I really see no evidence of corruption on disks (does an arbiter even suffer from this as there is no dataset?). I did see a reboot where my /etc/hosts config was lost, so maybe this caused part of the problem in the general sync failing. Arnold. | ||||||||||
| Comment by Mark Agarunov [ 07/Dec/17 ] | ||||||||||
|
Hello aligtvoet, Unfortunately, this error indicates that there was corruption on the disk. In this situation, my best recommendation would be to resync the affected node or restore from a backup if possible. Thanks, | ||||||||||
| Comment by Arnold Ligtvoet [ 07/Dec/17 ] | ||||||||||
|
Hi Mark, Please advise which logfiles (just /var/log/mongodb/mongod.log?) you need and whether I can upload them using the same link as before. In answer to your questions: I can see this happen in the logfiles (I did not edit the ":27017"; the host part was already empty in the logfile):
Thanks, | ||||||||||
| Comment by Mark Agarunov [ 07/Dec/17 ] | ||||||||||
|
Hello aligtvoet, Thank you for providing these files. I've attached a repair attempt of the files you've provided. Would you please extract these files and replace them in your $dbpath and let us know if it resolves the issue? If you are still seeing errors after replacing these files, please provide the complete logs from mongod so that we can further investigate. Additionally, if this issue persists, please provide the following information:
Thanks, | ||||||||||
| Comment by Arnold Ligtvoet [ 07/Dec/17 ] | ||||||||||
|
Thanks Mark for your quick replies. I have uploaded the files. | ||||||||||
| Comment by Mark Agarunov [ 07/Dec/17 ] | ||||||||||
|
Hello aligtvoet, I've created a secure upload portal so that you can send us these files privately. Please note however, that these files don't contain any user data, just metadata for the WiredTiger engine. Thanks, | ||||||||||
| Comment by Arnold Ligtvoet [ 07/Dec/17 ] | ||||||||||
|
Hi Mark, thanks for removing the host name. Can I share these files without making them public as I assume these contain a lot of sensitive info? | ||||||||||
| Comment by Mark Agarunov [ 07/Dec/17 ] | ||||||||||
|
Hello aligtvoet, Thank you for the report. I've removed the hostname from the output as requested. If you can provide the WiredTiger.wt and WiredTiger.turtle files we can attempt a repair of the database, but please keep in mind that this is not a guaranteed fix. Thanks, | ||||||||||
| Comment by Arnold Ligtvoet [ 07/Dec/17 ] | ||||||||||
|
Can you please update the issue asap and remove the server name, as I cannot edit or close the issue. Also the formatting for the panel is quite strange and I cannot update that. |