[SERVER-20923] MemoryMappedFile::remapPrivateView fails with error errno:487 Attempt to access invalid address
Created: 14/Oct/15 | Updated: 28/Apr/16 | Resolved: 11/Jan/16
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Admin, MMAPv1 |
| Affects Version/s: | 3.0.7 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Dave S | Assignee: | Unassigned |
| Resolution: | Incomplete | Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Environment: | Windows Server 2012 R2 on AWS |
| Operating System: | Windows |
| Description |
While upgrading our replica set from MongoDB 2.6.8 to 3.0.6, I received the error below. I tried 3.0.7 today and still received it.

The error first occurred about 10 minutes after upgrading a secondary. I restarted it, and the error occurred again about 1 hour later. I kept restarting it for a few more hours and the crashes kept happening. I ended up running db.repairDatabase() on the high-volume databases on that server and thought that had fixed the issue, since it then ran for about 24 hours without a crash. However, today, after updating the binaries from 3.0.6 to 3.0.7, I received the error again after it had been running for less than one minute.

I also tried 3.0.6 on the other secondary and deleted its data so that it would resync from scratch, thinking this was a data format issue (since the repair appeared to have helped). However, either during the final index build or while catching up with replication, it hit the error again.

We are running Windows Server 2012 R2 on AWS. Our secondaries have 32GB of RAM. We have been using this production DB for a few years now. The system is currently split between 2.6.8 and 3.0.7, since completing the upgrade puts us at risk of crashing.

Error from a MongoDB 3.0.7 secondary, about 20 seconds after launching it:
Error from a MongoDB 3.0.6 secondary performing an initial sync (about 9 hours in):
| Comments |
| Comment by Ramon Fernandez Marina [ 11/Jan/16 ] |
Thanks for getting back to me, radardave. I'm going to close this ticket for the time being, since we unfortunately don't have enough information to troubleshoot. If you get a crash dump, please ping us on this ticket and we'll reopen it for further investigation. Regards,
| Comment by Dave S [ 11/Jan/16 ] |
I did end up restarting the server once recently and received the crash again, but I just restarted it and it has been OK since. I haven't been able to get a crash dump yet. Thanks,
| Comment by Ramon Fernandez Marina [ 09/Jan/16 ] |
radardave, have you had any more crashes? If you do in the future, the upload portal will be available for a bit longer to send a crash dump for us to investigate – please let us know if you run into this issue again. Regards,
| Comment by Ramon Fernandez Marina [ 23/Nov/15 ] |
radardave, the information you sent points to a possible race condition during remapPrivateView, but without a crash dump there unfortunately isn't enough information for us to investigate. If you're willing to reproduce the problem, would you please generate a crash dump and upload it here? I found an easy recipe online, but please let us know if you have any questions on how to go about this. Thanks,
| Comment by Dave S [ 06/Nov/15 ] |
So far, I'm still running one of our secondaries on 3.0.7. What I've found is that once it makes it past a day or so, the chances of this error occurring are significantly lower. However, I'm almost certain that if I were to restart the MongoDB service or reboot the server, I would be back in the state described above, where it crashes and I have to keep restarting it until I get lucky and it ends up in a good state. Please let me know if there's anything else you need from me.
| Comment by Dave S [ 19/Oct/15 ] |
| Comment by Ramon Fernandez Marina [ 18/Oct/15 ] |
radardave, can you please send the output of db.serverBuildInfo() for the affected node(s)?