[SERVER-5512] All the connections were filled up after an assertion occurred. Had to reboot Mongo to be able to reconnect to the server. Created: 05/Apr/12  Updated: 15/Aug/12  Resolved: 20/Apr/12

Status: Closed
Project: Core Server
Component/s: Internal Code, Stability
Affects Version/s: 1.8.5
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Chris Weber Assignee: Tad Marshall
Resolution: Duplicate Votes: 1
Labels: connection
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

VMware - Windows 2008 R2 64-bit - 4 GB RAM - 2 GHz processor - Mongo 64-bit

C# Driver - 1.1.0.28681

Mongo has journaling enabled


Issue Links:
Duplicate
duplicates SERVER-2942 MapViewOfFileEx failed during large i... Closed
Operating System: Windows
Participants:

 Description   

I have been running Mongo for the last 6 months without an issue, but on each of the last two Tuesdays it stopped accepting new connections. When I looked at the log, I noticed that a large number of connections were opened right after the assertion. From that point on, our application could no longer connect to Mongo.

Sample Log File
Tue Apr 03 10:19:18 [conn2766] MapViewOfFileEx failed c:/mongodb/data/TED_LOGS/TED_LOGS.1 errno:487 Attempt to access invalid address.
Tue Apr 03 10:19:18 [conn2766] Assertion failure p db\mongommf.cpp 198
Tue Apr 03 10:19:19 [conn2766] update ted.nfSession query: { sid: "eogx0tMwLkhwbxhJgyIo1LGW1UeIQUoAv7Jen1D2RI8NPovuEUbqfm8ke5qfzgO+1fx9eB..." } exception 0 assertion db\mongommf.cpp:198 0ms
Tue Apr 03 10:19:20 [initandlisten] connection accepted from 10.20.60.10:59606 #2773
Tue Apr 03 10:19:20 [initandlisten] connection accepted from 10.20.60.10:59609 #2774
.
.
.

Tue Apr 03 10:19:48 [conn2766] warning: ClientCursor::yield can't unlock b/c of recursive lock ns: ted.nfSession
Tue Apr 03 10:19:48 [conn2766] warning: ClientCursor::yield can't unlock b/c of recursive lock ns: ted.nfSession
Tue Apr 03 10:19:48 [conn2766] warning: ClientCursor::yield can't unlock b/c of recursive lock ns: ted.nfSession
Tue Apr 03 10:19:48 [conn2766] warning: ClientCursor::yield can't unlock b/c of recursive lock ns: ted.nfSession
Tue Apr 03 10:19:48 [conn2766] warning: ClientCursor::yield can't unlock b/c of recursive lock ns: ted.nfSession
Tue Apr 03 10:19:48 [conn2766] warning: ClientCursor::yield can't unlock b/c of recursive lock ns: ted.nfSession
Tue Apr 03 10:19:48 [conn2766] warning: ClientCursor::yield can't unlock b/c of recursive lock ns: ted.nfSession
Tue Apr 03 10:19:48 [conn2766] warning: ClientCursor::yield can't unlock b/c of recursive lock ns: ted.nfSession
Tue Apr 03 10:19:48 [conn2766] warning: ClientCursor::yield can't unlock b/c of recursive lock ns: ted.nfSession
Tue Apr 03 10:19:48 [conn2766] warning: ClientCursor::yield can't unlock b/c of recursive lock ns: ted.nfSession
.
.
.
Tue Apr 03 10:35:14 [initandlisten] connection accepted from 10.20.60.10:3510 #4888
Tue Apr 03 10:35:16 [initandlisten] connection accepted from 10.20.60.10:3519 #4889



 Comments   
Comment by Tad Marshall [ 20/Apr/12 ]

Duplicate of SERVER-2942.

Comment by Chris Weber [ 05/Apr/12 ]

Tad,

Thanks for the description; that gives me a great idea of what occurred.

Chris

Comment by Tad Marshall [ 05/Apr/12 ]

2.0.4 won't hit the MapViewOfFileEx failure, but it will consume page file space that the 1.8.5 version didn't. (All of this applies only when journaling is on.) We think we will have both issues solved in 2.2, but the code isn't finished yet (my fault, I'm taking too long).

The MapViewOfFileEx failure in 1.8.5 is most likely when new connections are being made rapidly at the same time that updates or inserts are happening ... all pretty normal activities for a mongod.exe server. The page file consumption in 2.0.4 is greater with heavy inserts and updates ... basically, all the stuff that gets written to the journal. Heavy query load will have no effect on page file usage.

Comment by Chris Weber [ 05/Apr/12 ]

That makes sense.

I have upgraded to 2.0.4. However, does that help with the problem, or is it just as likely to happen as in the 1.8.5 version? Also, does server load cause the error to occur more often?

Chris W.

Comment by Tad Marshall [ 05/Apr/12 ]

Sorry you hit this problem.

The MapViewOfFileEx failure is essentially a fatal error, but your version doesn't exit as it probably should.

What happens when you see that error is that a memory-mapped copy of your database file (c:/mongodb/data/TED_LOGS/TED_LOGS.1 in this case) is first unmapped and then an attempt is made to remap it at the same memory address where it was before. Usually, the remap succeeds, but once in a while another thread gets scheduled in between the unmap and remap operations and allocates memory (for a thread stack, for example) and this memory is in the area where MapViewOfFileEx wants to remap the database file. This causes MapViewOfFileEx to fail, and mongod.exe can't properly recover.
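The unmap/remap race described above can be illustrated with a small toy model. This is only a sketch in Python, not mongod code and not the Windows API: `ToyAddressSpace`, `remap`, and the `intruder` callback are hypothetical names invented for illustration, and the addresses are arbitrary.

```python
class ToyAddressSpace:
    """Toy model of a process address space: start address -> (size, owner)."""

    def __init__(self):
        self.regions = {}

    def is_free(self, start, size):
        """Return True if [start, start + size) overlaps no existing region."""
        end = start + size
        for s, (sz, _owner) in self.regions.items():
            if start < s + sz and s < end:
                return False
        return True

    def map_at(self, start, size, owner):
        """Map at a fixed address; fail if any part of the range is taken."""
        if not self.is_free(start, size):
            return False
        self.regions[start] = (size, owner)
        return True

    def unmap(self, start):
        """Release the region that begins at `start`."""
        del self.regions[start]


def remap(space, start, size, intruder=None):
    """Unmap a region, then try to map it again at the same address.

    `intruder` stands in for another thread that gets scheduled between
    the two steps and allocates memory (a thread stack, for example).
    """
    space.unmap(start)
    if intruder is not None:
        intruder(space)  # runs inside the window between unmap and remap
    return space.map_at(start, size, "datafile")


space = ToyAddressSpace()
space.map_at(0x1000, 0x4000, "datafile")

# Usual case: nothing runs in the window, so the remap succeeds.
ok = remap(space, 0x1000, 0x4000)

# Rare case: an allocation lands inside the freed range, so the remap fails.
failed = not remap(
    space, 0x1000, 0x4000,
    intruder=lambda s: s.map_at(0x2000, 0x1000, "thread stack"),
)
```

In the toy model, the second call leaves the data file unmapped with the "thread stack" occupying part of its old range, which mirrors why the server cannot simply carry on after the failure.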

Restarting mongod.exe is the only thing you can do at that point.

We expect to have a permanent fix for this in version 2.2. In the meantime, you could upgrade to 2.0.4 or just restart mongod.exe when this happens.
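For anyone taking the "restart when this happens" route, a small log check can flag the failure so the restart does not depend on someone noticing refused connections. The following is a minimal sketch, assuming the log line format matches the sample in this ticket; `find_mapview_failures` is a hypothetical helper, and how the restart itself is triggered is left to the operator.

```python
import re

# Pattern based on the sample log line in this ticket (an assumption about
# the general format, not a documented guarantee).
FAILURE = re.compile(
    r"\[(?P<conn>conn\d+)\] MapViewOfFileEx failed "
    r"(?P<path>\S+) errno:(?P<errno>\d+)"
)


def find_mapview_failures(lines):
    """Return (connection, file path, errno) for each MapViewOfFileEx failure."""
    hits = []
    for line in lines:
        match = FAILURE.search(line)
        if match:
            hits.append(
                (match.group("conn"), match.group("path"), int(match.group("errno")))
            )
    return hits


sample = [
    "Tue Apr 03 10:19:18 [conn2766] MapViewOfFileEx failed "
    "c:/mongodb/data/TED_LOGS/TED_LOGS.1 errno:487 "
    "Attempt to access invalid address.",
    "Tue Apr 03 10:19:20 [initandlisten] connection accepted from "
    "10.20.60.10:59606 #2773",
]
failures = find_mapview_failures(sample)
```

A non-empty result means mongod.exe has hit the failure and needs a restart.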
