[SERVER-57179] Mongodb Database corrupted on Windows 10 Created: 25/May/21 Updated: 28/Jul/21 Resolved: 28/Jul/21 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Nandkishor Chavan | Assignee: | Eric Sedor |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Attachments: | |
| Issue Links: | |
| Operating System: | ALL | ||||||||
| Participants: | |||||||||
| Description |
Problem Description
We have found that after an unexpected MongoDB service shutdown, the database becomes corrupted. This happens randomly.
MongoDB Version: 4.0.4 2008R2Plus
OS: Windows 10
After logging data for a couple of days, we run into data corruption issues. This is resulting in data loss for us.
Error in log file:
Additional Notes
We tried to repair the data using the repair command. After the repair, we are able to access MongoDB. We have uploaded diagnostic data to Google Drive. Location: https://drive.google.com/file/d/1aMnqwqXSAUhiGbvALLboIXJFr3-jXt_-/view?usp=sharing Additional MongoDB service logs are added in the attachment section. We would like to know why this issue might occur and whether there is anything we are missing in the configuration. Regards, Nandkishor |
| Comments |
| Comment by Eric Sedor [ 28/Jul/21 ] | |||||||
|
Hi ndchavan4289@gmail.com, I'm going to close this ticket out for now but please reach out again if the issue reoccurs and chkdsk reveals nothing. Thank you! | |||||||
| Comment by Nandkishor Chavan [ 06/Jul/21 ] | |||||||
|
Hi Eric, so far we have not faced a similar issue again on any site. I will get back to you if the issue occurs again. Also, I am not sure how effective Procmon will be in this case: we usually don't have access to the sites all the time, so we won't be able to run Procmon at the time the issue occurs, and once the event has occurred, Procmon won't be able to give us details on the process that may have caused it. Also, if the issue is caused by physical disk corruption, then how does the database work after repairing? We will still check with chkdsk if the issue occurs again. | |||||||
| Comment by Eric Sedor [ 01/Jul/21 ] | |||||||
|
Hi ndchavan4289@gmail.com, have you had any success ruling out file access issues via Procmon, or identifying disk issues via chkdsk? | |||||||
| Comment by Eric Sedor [ 10/Jun/21 ] | |||||||
|
It's not precisely clear which deployments are experiencing which symptoms, but it's possible that an attempt to --repair would result in the loss of user metadata if corruption affected a system.users collection. For the latest logs and dbpath you uploaded, the errors continue to suggest an issue with physical disk corruption or with other processes interfering with files in the MongoDB dbPath while it is running:
In at least one case in the past, one of our users had success using Procmon to identify a process that was causing similar (but not the same) errors. Do you happen to have run chkdsk after the 2021-06-09T10:07:27.319+0530 incident on IntantaDB? | |||||||
| Comment by Nandkishor Chavan [ 09/Jun/21 ] | |||||||
|
Hi Eric, we have not encountered the same issue again on the same site. However, we have encountered a similar issue on another site, so I am attaching the requested files from this new site. We have uploaded archives of the db folder and log files to the given location for troubleshooting. Additionally, we faced one more issue in this recent case: we run the MongoDB service in authentication mode, but after we repaired the data, the user we use for authentication was no longer present in our db, so we had to create it again manually. Please let me know if you need any more information from our side. -Nandkishor | |||||||
| Comment by Eric Sedor [ 04/Jun/21 ] | |||||||
|
Unfortunately we aren't able to readily identify a specific process that could be responsible. Would you be willing to provide the data files for this cluster for examination? This will help us obtain more information about what state MongoDB expects the data files to be in at the time of error. The process would be:
1) After an incident occurs, ensure the server is stopped and archive a copy of the dbpath (including the log files)
Files shared to this location are private and will be accessible only by MongoDB employees involved with the investigation.
Eric | |||||||
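The archiving step described above could be scripted roughly as follows. This is only a minimal sketch, not a MongoDB-provided tool: the function name `archive_dbpath`, the archive naming scheme, and the example paths are all hypothetical, and the script assumes the mongod service has already been stopped (it does not check this itself).

```python
import shutil
import time
from pathlib import Path
from typing import Optional


def archive_dbpath(dbpath: str, dest_dir: str, log_file: Optional[str] = None) -> Path:
    """Zip a copy of a (stopped) server's dbpath; optionally copy a separate log file.

    Hypothetical helper: it assumes mongod is already stopped and does not verify it.
    """
    src = Path(dbpath)
    if not src.is_dir():
        raise FileNotFoundError(f"dbpath does not exist: {src}")

    dest = Path(dest_dir)
    # Writing the archive inside the dbpath would make the archive include itself.
    if src.resolve() == dest.resolve() or src.resolve() in dest.resolve().parents:
        raise ValueError("dest_dir must be outside the dbpath")
    dest.mkdir(parents=True, exist_ok=True)

    stamp = time.strftime("%Y%m%d-%H%M%S")
    base = dest / f"dbpath-{stamp}"
    # shutil.make_archive appends ".zip" and returns the archive path as a string.
    archive = Path(shutil.make_archive(str(base), "zip", root_dir=str(src)))

    # If the log lives outside the dbpath, keep a timestamped copy next to the archive.
    if log_file is not None:
        shutil.copy2(log_file, dest / f"{Path(log_file).name}.{stamp}")
    return archive


# Example call (paths are placeholders, not taken from this ticket):
# archive_dbpath(r"C:\data\db", r"D:\incident-archives", r"C:\data\log\mongod.log")
```

Taking the copy before running a repair keeps the files in the state they were in at the time of the error, which is the state the investigation needs.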
| Comment by Nandkishor Chavan [ 31/May/21 ] | |||||||
|
Hi Eric, thanks for your response. As per your comment, a virus scanner might be accessing the files in dbpath. However, we don't believe that is the cause: the antivirus may be scanning the files in dbpath as usual, but that should not affect or corrupt our db, as it is not writing to or editing the files. Can you share the name of the process that is accessing the files in dbpath (with a screenshot), so that we can check for that process? | |||||||
| Comment by Eric Sedor [ 27/May/21 ] | |||||||
|
Hi ndchavan4289@gmail.com, I am responding here and closing The logs you've provided suggest that another process on the machine, such as a virus scanner, is accessing files in the dbpath. Can you confirm with certainty that another process is not accessing these files?
Sincerely, EDIT: Sorry for introducing confusion. |