[SERVER-20923] MemoryMappedFile::remapPrivateView fails with error errno:487 Attempt to access invalid address Created: 14/Oct/15  Updated: 28/Apr/16  Resolved: 11/Jan/16

Status: Closed
Project: Core Server
Component/s: Admin, MMAPv1
Affects Version/s: 3.0.7
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Dave S Assignee: Unassigned
Resolution: Incomplete Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Windows Server 2012 R2 on AWS


Attachments: Text File mongodb.3.0.7.snippet.log     File mongodb.snippet.log.2015-10-14T12-28-16    
Issue Links:
Related
related to SERVER-23977 Failure of MapViewOfFileEx in MemoryM... Closed
Operating System: Windows
Participants:

 Description   

While upgrading our replica set from MongoDB 2.6.8 to 3.0.6, I received the below error. I also tried 3.0.7 today and still received the error. I originally received the error about 10 minutes after upgrading a secondary. I then restarted it and the error occurred about 1 hour later. I then played the restart game for a few more hours and it seemed to keep happening. I ended up performing a db.repairDatabase() on the high volume databases on that server and I thought it fixed the issue - it ran for about 24 hours without a crash. However, today, after updating the binaries from 3.0.6 to 3.0.7 I did receive the error again after running for less than one minute.

I had also tried 3.0.6 on the other secondary, and deleted its data to let it just replicate over, thinking it was a data format issue (since the repair appeared to have helped). However either during the final index build or while catching up with replication, it had the error again.

We are running on Win 2012 R2 on AWS. Our secondaries have 32GB of RAM. We have been using this production DB for a few years now. I currently have the system split between 2.6.8 and 3.0.7 as completing the upgrade puts us at risk of crashing.

Error from a MongoDB 3.0.7 secondary, about 20 seconds after launching it:

2015-10-14T15:36:04.990+0000 I CONTROL  [durability] MapViewOfFileEx for d:/mongo/troidb2.79 failed with error errno:487 Attempt to access invalid address. (file size is 2146435072) in MemoryMappedFile::remapPrivateView
2015-10-14T15:36:04.990+0000 I ACCESS   [conn2239] Successfully authenticated as principal mongodev on troidb2
2015-10-14T15:36:04.990+0000 I -        [durability] Fatal Assertion 16148
 
 
2015-10-14T15:36:05.573+0000 I NETWORK  [initandlisten] connection accepted from 54.204.126.8:58766 #2395 (561 connections now open)
2015-10-14T15:36:05.579+0000 I CONTROL  [durability] mongod.exe    ...\src\mongo\util\stacktrace_win.cpp(175)                       mongo::printStackTrace+0x43
2015-10-14T15:36:05.579+0000 I CONTROL  [durability] mongod.exe    ...\src\mongo\util\log.cpp(134)                                  mongo::logContext+0x97
2015-10-14T15:36:05.579+0000 I NETWORK  [conn2395] end connection 54.204.126.8:58766 (560 connections now open)
2015-10-14T15:36:05.579+0000 I CONTROL  [durability] mongod.exe    ...\src\mongo\util\assert_util.cpp(166)                          mongo::fassertFailed+0x80
2015-10-14T15:36:05.579+0000 I CONTROL  [durability] mongod.exe    ...\src\mongo\util\mmap_win.cpp(425)                             mongo::MemoryMappedFile::remapPrivateView+0x327
2015-10-14T15:36:05.579+0000 I CONTROL  [durability] mongod.exe    ...\src\mongo\db\storage\mmap_v1\durable_mapped_file.cpp(76)     mongo::DurableMappedFile::remapThePrivateView+0x2a
2015-10-14T15:36:05.579+0000 I NETWORK  [conn2394] end connection 54.204.247.20:64774 (559 connections now open)
2015-10-14T15:36:05.579+0000 I CONTROL  [durability] mongod.exe    ...\src\mongo\db\storage\mmap_v1\dur.cpp(388)                    mongo::dur::`anonymous namespace'::remapPrivateViewImpl+0x1e8
2015-10-14T15:36:05.579+0000 I CONTROL  [durability] mongod.exe    ...\src\mongo\db\storage\mmap_v1\dur.cpp(639)                    mongo::dur::remapPrivateView+0x69
2015-10-14T15:36:05.579+0000 I CONTROL  [durability] mongod.exe    ...\src\mongo\db\storage\mmap_v1\dur.cpp(826)                    mongo::dur::durThread+0x797
2015-10-14T15:36:05.579+0000 I CONTROL  [durability] mongod.exe    ...\src\third_party\boost\libs\thread\src\win32\thread.cpp(185)  boost::`anonymous namespace'::thread_start_function+0x21
2015-10-14T15:36:05.579+0000 I CONTROL  [durability] MSVCR120.dll                                                                   beginthreadex+0x107
2015-10-14T15:36:05.579+0000 I CONTROL  [durability] MSVCR120.dll                                                                   endthreadex+0x192
2015-10-14T15:36:05.579+0000 I CONTROL  [durability] KERNEL32.DLL                                                                   BaseThreadInitThunk+0x1a
2015-10-14T15:36:05.579+0000 I CONTROL  [durability] 
2015-10-14T15:36:05.580+0000 I -        [durability] 
 
***aborting after fassert() failure

Error from a MongoDB 3.0.6 secondary performing an inital sync (about 9 hours in):

2015-10-14T09:08:47.910+0000 I CONTROL  [durability] MapViewOfFileEx for d:/mongo/DBNamehidden.27 failed with error errno:487 Attempt to access invalid address. (file size is 2146435072) in MemoryMappedFile::remapPrivateView
2015-10-14T09:08:47.910+0000 I -        [durability] Fatal Assertion 16148
 
 
2015-10-14T09:08:51.876+0000 I CONTROL  [durability] mongod.exe    ...\src\mongo\util\stacktrace_win.cpp(175)                       mongo::printStackTrace+0x43
2015-10-14T09:08:51.876+0000 I CONTROL  [durability] mongod.exe    ...\src\mongo\util\log.cpp(134)                                  mongo::logContext+0x97
2015-10-14T09:08:51.876+0000 I CONTROL  [durability] mongod.exe    ...\src\mongo\util\assert_util.cpp(166)                          mongo::fassertFailed+0x80
2015-10-14T09:08:51.876+0000 I CONTROL  [durability] mongod.exe    ...\src\mongo\util\mmap_win.cpp(425)                             mongo::MemoryMappedFile::remapPrivateView+0x327
2015-10-14T09:08:51.876+0000 I CONTROL  [durability] mongod.exe    ...\src\mongo\db\storage\mmap_v1\durable_mapped_file.cpp(76)     mongo::DurableMappedFile::remapThePrivateView+0x2a
2015-10-14T09:08:51.876+0000 I CONTROL  [durability] mongod.exe    ...\src\mongo\db\storage\mmap_v1\dur.cpp(388)                    mongo::dur::`anonymous namespace'::remapPrivateViewImpl+0x1e8
2015-10-14T09:08:51.876+0000 I CONTROL  [durability] mongod.exe    ...\src\mongo\db\storage\mmap_v1\dur.cpp(639)                    mongo::dur::remapPrivateView+0x69
2015-10-14T09:08:51.876+0000 I CONTROL  [durability] mongod.exe    ...\src\mongo\db\storage\mmap_v1\dur.cpp(826)                    mongo::dur::durThread+0x797
2015-10-14T09:08:51.876+0000 I CONTROL  [durability] mongod.exe    ...\src\third_party\boost\libs\thread\src\win32\thread.cpp(185)  boost::`anonymous namespace'::thread_start_function+0x21
2015-10-14T09:08:51.876+0000 I CONTROL  [durability] MSVCR120.dll                                                                   beginthreadex+0x107
2015-10-14T09:08:51.876+0000 I CONTROL  [durability] MSVCR120.dll                                                                   endthreadex+0x192
2015-10-14T09:08:51.876+0000 I CONTROL  [durability] KERNEL32.DLL                                                                   BaseThreadInitThunk+0x1a
2015-10-14T09:08:51.876+0000 I CONTROL  [durability] 
2015-10-14T09:08:51.876+0000 I -        [durability] 
 
***aborting after fassert() failure



 Comments   
Comment by Ramon Fernandez Marina [ 11/Jan/16 ]

Thanks for getting back me radardave. I'm going to close this ticket for the time being, since we unfortunately don't have enough information to troubleshoot. If you get a crash dump please ping us on this ticket and we'll reopen for further investigation.

Regards,
Ramón.

Comment by Dave S [ 11/Jan/16 ]

I did end up restarting the server once recently and received the crash again, but just restarted it and it has been ok since. I haven't been able to get a crash dump yet.

Thanks,
Dave

Comment by Ramon Fernandez Marina [ 09/Jan/16 ]

radardave, have you had any more crashes? If you do in the future, the upload portal will be available for a bit longer to send a crash dump for us to investigate – please let us know if you run into this issue again.

Regards,
Ramón.

Comment by Ramon Fernandez Marina [ 23/Nov/15 ]

radardave, the information you sent points to a possible race condition during remapPrivateView, but without a crash dump unfortunately there's not enough information for us to investigate.

If you're willing to reproduce the problem, would you please generate a crash dump and upload it here? I found an easy recipe online, but please let us know if you have any questions on how to go about this.

Please let us know.

Thanks,
Ramón.

Comment by Dave S [ 06/Nov/15 ]

So far, I'm still running one of our Secondaries on 3.0.7. What I found is once you make it past a day or so, the chances of this error occurring are significantly lower. However, I'm almost certain that if I were to restart the MongoDB service or reboot the server I would be back in the state described above, where it crashes and I have to keep starting it until I get lucky and it ends up in a good state.

Please let me know if there's anything else you need from me.

Comment by Dave S [ 19/Oct/15 ]

{
        "version" : "3.0.7",
        "gitVersion" : "6ce7cbe8c6b899552dadd907604559806aa2e9bd",
        "targetMinOS" : "Windows 7/Windows Server 2008 R2",
        "OpenSSLVersion" : "OpenSSL 1.0.1p-fips 9 Jul 2015",
        "sysInfo" : "windows sys.getwindowsversion(major=6, minor=1, build=7601, platform=2, service_pack='Service Pack 1') BOOST_LIB_VERSION=1_49",
        "versionArray" : [
                3,
                0,
                7,
                0
        ],
        "loaderFlags" : "",
        "compilerFlags" : "/TP /nologo /EHsc /W3 /wd4355 /wd4800 /wd4267 /wd4244 /wd4290 /wd4068 /wd4351 /we4099 /we4930 /Z7 /errorReport:none /MD /O2 /Oy-
w /Gy /Zc:inline",
        "allocator" : "tcmalloc",
        "javascriptEngine" : "V8",
        "bits" : 64,
        "debug" : false,
        "maxBsonObjectSize" : 16777216,
        "ok" : 1
}

Comment by Ramon Fernandez Marina [ 18/Oct/15 ]

radardave, can you please send the output of db.serverBuildInfo() for the affected node(s)?

Generated at Thu Feb 08 03:55:42 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.