[SERVER-9355] Mongodb crashed after - FlushViewOfFile for F:/data/xq.4 failed with error 1117 after 1 attempts taking 7076 ms Created: 15/Apr/13 Updated: 10/Dec/14 Resolved: 23/May/13 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Internal Code |
| Affects Version/s: | 2.2.2 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | John Woakes | Assignee: | Unassigned |
| Resolution: | Won't Fix | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Environment: |
Azure Worker Role - 3 replica set databases on 3 Azure instances |
||
| Attachments: |
|
||||||||
| Issue Links: |
|
||||||||
| Operating System: | Windows | ||||||||
| Steps To Reproduce: | This appears to be a random event. |
||||||||
| Participants: | |||||||||
| Description |
|
This happened in our production system in the middle of the night on a Sunday morning. I cannot see anything that might have triggered this. The database did recover once we restarted mongod. See attached log file for more details. This is the failure extracted from that file.
|
| Comments |
| Comment by John Woakes [ 23/May/13 ] |
|
I am not sure what information to post that is not already here. The support number is 113041610370474 and the agent's name is Mike Wong. I will update this ticket if I get more information. |
| Comment by Daniel Pasette (Inactive) [ 23/May/13 ] |
|
Thanks for the update John. If you're able to post information we can use to track this issue here in this ticket, it would be much appreciated. Closing this as an active MongoDB issue for now as I don't think there's anything that can be done on the MongoDB side. |
| Comment by John Woakes [ 22/May/13 ] |
|
Finally got this from Microsoft Azure Support... |
| Comment by John Woakes [ 24/Apr/13 ] |
|
We are still having Mongo crash or get into an unstable condition nearly daily. I have opened a ticket with Azure and have been working with them to get to the root of this. Hopefully we will get an answer soon. |
| Comment by Stennie Steneker (Inactive) [ 24/Apr/13 ] |
|
Hi John, Do you have any update on this issue? Thanks, |
| Comment by John Woakes [ 18/Apr/13 ] |
|
This is the latest from MS: "So I installed DebugDiag, which is set to write a dump file if mongod.exe crashes. DebugDiag is on all three instances of MongoWorker. Hopefully, we’ll get a dump soon to analyze. I will check on it again later today and tomorrow morning." |
| Comment by John Woakes [ 16/Apr/13 ] |
|
Thanks Dan, I have opened a ticket with Azure support. I will let you know what happens. |
| Comment by Daniel Pasette (Inactive) [ 16/Apr/13 ] |
|
I see in the EventLogs you've posted the following event for the ProviderName "WaDrivePrt" at timestamp 2013-04-14T07:15:41.7262866Z: '/mongoddblob0.vhd' failed to renew lease the specified XDisk. This is about 90 seconds after the crash occurred while flushing to disk. This appears to be an issue with the disk subsystem in Azure. Have you tried alerting Azure support with the same details you've included in this ticket? |
| Comment by John Woakes [ 15/Apr/13 ] |
|
This is the Windows Event Logs from the period. |
| Comment by Tad Marshall [ 15/Apr/13 ] |
|
Error 1117 is ERROR_IO_DEVICE: The request could not be performed because of an I/O device error. Can you see if there is anything in the Windows or Azure event logs corresponding to this error? |
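
For anyone scanning mongod logs for the same failure, the following is a minimal sketch of how the message in this ticket's title could be parsed and its error code mapped to the Win32 name given above. The function name `parse_flush_failure`, the `WIN32_ERRORS` table, and the regular expression are illustrative assumptions, not part of MongoDB or of any Windows API; the pattern is modeled only on the single log line quoted in this ticket and may not match other server versions.

```python
import re

# Hypothetical lookup table: only the code discussed in this ticket is listed.
WIN32_ERRORS = {
    1117: (
        "ERROR_IO_DEVICE",
        "The request could not be performed because of an I/O device error.",
    ),
}

# Assumed pattern, modeled on the one message quoted in this ticket's title.
FLUSH_FAILURE = re.compile(
    r"FlushViewOfFile for (?P<path>\S+) failed with error (?P<code>\d+) "
    r"after (?P<attempts>\d+) attempts taking (?P<ms>\d+) ms"
)


def parse_flush_failure(line):
    """Return details of a FlushViewOfFile failure, or None if no match."""
    match = FLUSH_FAILURE.search(line)
    if match is None:
        return None
    code = int(match.group("code"))
    # Codes missing from the table are reported as UNKNOWN rather than guessed.
    name, text = WIN32_ERRORS.get(code, ("UNKNOWN", ""))
    return {
        "path": match.group("path"),
        "code": code,
        "name": name,
        "text": text,
        "attempts": int(match.group("attempts")),
        "elapsed_ms": int(match.group("ms")),
    }


if __name__ == "__main__":
    sample = (
        "FlushViewOfFile for F:/data/xq.4 failed with error 1117 "
        "after 1 attempts taking 7076 ms"
    )
    print(parse_flush_failure(sample))
```

A hit with name ERROR_IO_DEVICE points at the storage layer rather than at mongod itself, so the matching time window in the Windows or Azure event logs is the next place to look.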