[SERVER-2942] MapViewOfFileEx failed during large import Created: 13/Apr/11  Updated: 12/Jul/16  Resolved: 07/Jun/12

Status: Closed
Project: Core Server
Component/s: Internal Code, Storage
Affects Version/s: 1.8.0, 1.8.1
Fix Version/s: 2.0.7, 2.1.2

Type: Bug Priority: Major - P3
Reporter: Conor McCarthy Assignee: Tad Marshall
Resolution: Done Votes: 2
Labels: buildbot
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

64bit mongodb sever
Windows 2003 x64 SP2
Dell PowerEdge 2900 III.
24GB RAM
15k SAS drive
2.4Ghz quad core

DB client and server running on same box. DB files on local harddrive.


Issue Links:
Depends
depends on SERVER-4988 printStackTrace on Windows Closed
Duplicate
is duplicated by SERVER-5512 All the connections were filled up af... Closed
is duplicated by SERVER-5901 Assertion failure p db\mongommf.cpp 1... Closed
Related
related to SERVER-4522 inserts take long time after several ... Closed
related to SERVER-10516 mongod crashes when re-mapping above ... Closed
related to SERVER-5663 Private pages accumulate in Windows R... Closed
related to SERVER-14840 FAssert on windows after long running... Closed
is related to SERVER-3403 Server crash - MapViewOfFileEx failed... Closed
is related to SERVER-5680 repl13.js failing on Windows 64-bit Closed
Operating System: Windows
Participants:

 Description   

I was running a large batch of inserts and mongod crashed with the
below error. It's running on Windows 2003 x64.
Tue Apr 12 20:12:55 [conn9] MapViewOfFileEx failed H:/mongo/data/
TestDB.ns errno:487 Attempt to access invalid address.
Tue Apr 12 20:12:55 [conn9] TestDB.Hands Assertion failure p db
\mongommf.cpp 198
Tue Apr 12 20:12:56 [conn9] dbexception in groupCommit causing
immediate shutdown: 0 assertion db\mongommf.cpp:198



 Comments   
Comment by Tad Marshall [ 21/Jun/12 ]

Backported to 2.0.7.

Comment by auto [ 21/Jun/12 ]

Author:

{u'date': u'2012-06-20T14:47:51-07:00', u'email': u'tad@10gen.com', u'name': u'Tad Marshall'}

Message: SERVER-5663, SERVER-2942 MapViewOfFileEx backport

Backport the fixes for Windows memory-mapped files made in
2.1.x into the 2.0.x branch. Stop using VirtualProtect in
remapPrivateView, use UnmapViewOfFile and MapViewOfFileEx
to refresh the private view of the memory-mapped file.
Place memory-mapped files at a high address in 64-bit to
get out of the way of allocations made by Windows.
Branch: v2.0
https://github.com/mongodb/mongo/commit/881d6229e1ab58695959862a76f3175e142af84e

Comment by Andy Schwerin [ 07/Jun/12 ]

To backport this to 2.0, we'd need to backport b46ec68eb5c78ae645f31016c956720fe98dcc12 and also d8462a26b9089c5e58d1e340dcada83719ea4e47.

Comment by Tad Marshall [ 07/Jun/12 ]

Calling it "fixed" (again). Placing files at or above 256 GB seems to get them out of the way of other allocations.

Comment by auto [ 07/Jun/12 ]

Author:

{u'login': u'tadmarshall', u'name': u'Tad Marshall', u'email': u'tad@10gen.com'}

Message: SERVER-2942 For x64, place memory-mapped files at higher addresses

Use MapViewOfFileEx instead of MapViewOfFile when initially mapping
files into memory, and specify the address we want (x64 only). For
32-bit, pass a zero for the desired address so it behaves like
MapViewOfFile and lets Windows pick the address. Place files starting
at 256 GB in the virtual address space, pick successive locations
based on file size rounded up to memory allocation granularity and
don't try to reuse addresses.
Branch: master
https://github.com/mongodb/mongo/commit/b46ec68eb5c78ae645f31016c956720fe98dcc12

Comment by Tad Marshall [ 02/Jun/12 ]

Mr. BigObject strikes again.

http://buildbot.mongodb.org/builders/Windows%2064-bit/builds/4747/steps/test_6/logs/stdio

Sat Jun 02 00:20:42 [initandlisten] MongoDB starting : pid=5820 port=27999 dbpath=/data/db/sconsTests 64-bit host=ip-0A08A111
Sat Jun 02 00:20:42 [initandlisten] 
Sat Jun 02 00:20:42 [initandlisten] ** NOTE: This is a development version (2.1.2-pre-) of MongoDB.
Sat Jun 02 00:20:42 [initandlisten] **       Not recommended for production.
Sat Jun 02 00:20:42 [initandlisten] 
Sat Jun 02 00:20:42 [initandlisten] db version v2.1.2-pre-, pdfile version 4.5
Sat Jun 02 00:20:42 [initandlisten] git version: 9ce4fc4eebc2e6f8a388fc3205bee0d463696354
Sat Jun 02 00:20:42 [initandlisten] build info: windows sys.getwindowsversion(major=6, minor=1, build=7601, platform=2, service_pack='Service Pack 1') BOOST_LIB_VERSION=1_49
Sat Jun 02 00:20:42 [initandlisten] options: { dbpath: "/data/db/sconsTests/", port: 27999 }
Sat Jun 02 00:20:42 [initandlisten] journal dir=/data/db/sconsTests/journal
Sat Jun 02 00:20:42 [initandlisten] recover : no journal files present, no recovery needed
Sat Jun 02 00:20:42 [initandlisten] waiting for connections on port 27999
//...snip...
 *******************************************
         Test : mr_bigobject.js ...
      Command : D:\10gen\buildslaves\mongo\Windows_64bit\mongo\mongo.exe --port 27999 D:\10gen\buildslaves\mongo\Windows_64bit\mongo\jstests\mr_bigobject.js --eval TestData = new Object();TestData.testPath = "D:\\10gen\\buildslaves\\mongo\\Windows_64bit\\mongo\\jstests\\mr_bigobject.js";TestData.testFile = "mr_bigobject.js";TestData.testName = "mr_bigobject";TestData.noJournal = false;TestData.noJournalPrealloc = false;TestData.auth = false;TestData.keyFile = null;TestData.keyFileData = null;
         Date : Sat Jun  2 00:33:37 2012
Sat Jun 02 00:33:37 [conn31] end connection 127.0.0.1:57798 (0 connections now open)
MongoDB shell version: 2.1.2-pre-
connecting to: 127.0.0.1:27999/test
Sat Jun 02 00:33:37 [initandlisten] connection accepted from 127.0.0.1:57799 #32 (1 connection now open)
null
Sat Jun 02 00:33:37 [conn32] CMD: drop test.mr_bigobject
Sat Jun 02 00:33:51 [conn32] build index test.mr_bigobject { _id: 1 }
Sat Jun 02 00:33:51 [conn32] build index done.  scanned 0 total records. 0 secs
Sat Jun 02 00:33:51 [conn32] CMD: drop test.tmp.mr.mr_bigobject_31_inc
Sat Jun 02 00:33:51 [conn32] build index test.tmp.mr.mr_bigobject_31_inc { 0: 1 }
Sat Jun 02 00:33:51 [conn32] build index done.  scanned 0 total records. 0 secs
Sat Jun 02 00:33:51 [conn32] CMD: drop test.tmp.mr.mr_bigobject_31
Sat Jun 02 00:33:51 [conn32] build index test.tmp.mr.mr_bigobject_31 { _id: 1 }
Sat Jun 02 00:33:51 [conn32] build index done.  scanned 0 total records. 0 secs
Sat Jun 02 00:33:51 [conn32] JS Error: Error: an emit can't be more than half max bson size (anon):1
Sat Jun 02 00:33:51 [conn32] mr failed, removing collection :: caused by :: 9014 map invoke failed: JS Error: Error: an emit can't be more than half max bson size (anon):1
Sat Jun 02 00:33:51 [journal] info journal _rotate called inside dbMutex - ok but should be somewhat rare
Sat Jun 02 00:33:51 [conn32] CMD: drop test.tmp.mr.mr_bigobject_31
Sat Jun 02 00:33:51 [conn32] CMD: drop test.tmp.mr.mr_bigobject_31_inc
Sat Jun 02 00:33:51 [conn32] command test.$cmd command: { mapreduce: "mr_bigobject", map: function () {
    emit(1, this.s + this.s);
}, reduce: function (k, v) {
    return 1;
}, out: "mr_bigobject_out" } ntoreturn:1 keyUpdates:0 reslen:161 312ms
Sat Jun 02 00:33:51 [conn32] CMD: drop test.tmp.mr.mr_bigobject_32_inc
Sat Jun 02 00:33:51 [conn32] build index test.tmp.mr.mr_bigobject_32_inc { 0: 1 }
Sat Jun 02 00:33:51 [conn32] build index done.  scanned 0 total records. 0 secs
Sat Jun 02 00:33:51 [conn32] CMD: drop test.tmp.mr.mr_bigobject_32
Sat Jun 02 00:33:51 [conn32] build index test.tmp.mr.mr_bigobject_32 { _id: 1 }
Sat Jun 02 00:33:51 [conn32] build index done.  scanned 0 total records. 0 secs
Sat Jun 02 00:33:52 [journal] MapViewOfFileEx for /data/db/sconsTests/test.1 failed with error errno:487 Attempt to access invalid address. (file size is 134217728) in MemoryMappedFile::remapPrivateView
Sat Jun 02 00:33:52 [journal]   Fatal Assertion 16148
Sat Jun 02 00:33:52 [journal] 
 
***aborting after fassert() failure
 
 
Sat Jun 02 00:33:55 Socket recv() errno:10054 An existing connection was forcibly closed by the remote host. 127.0.0.1:27999
Sat Jun 02 00:33:55 SocketException: remote: 127.0.0.1:27999 error: 9001 socket exception [1] server [127.0.0.1:27999] 
Sat Jun 02 00:33:55 DBClientCursor::init call() failed
Sat Jun 02 00:33:55 query failed : test.$cmd { mapreduce: "mr_bigobject", map: function () {
    emit(1, this.s);
}, reduce: function (k, v) {
    return 1;
}, out: "mr_bigobject_out" } to: 127.0.0.1:27999
Sat Jun 02 00:33:55 Error: error doing query: failed src/mongo/shell/collection.js:155
failed to load: D:\10gen\buildslaves\mongo\Windows_64bit\mongo\jstests\mr_bigobject.js
                17956.000090ms
test D:\10gen\buildslaves\mongo\Windows_64bit\mongo\jstests\mr_bigobject.js exited with status -3
354 tests succeeded
225 tests didn't get run
The following tests failed (with exit code):
D:\10gen\buildslaves\mongo\Windows_64bit\mongo\jstests\mr_bigobject.js	-3
scons: building terminated because of errors.

Comment by Tad Marshall [ 01/Jun/12 ]

This time on a nightly, on jstests\tool\dumprestore10.js:

http://buildbot.mongodb.org/builders/Nightly%20Windows%2064-bit/builds/884/steps/test_2/logs/stdio

         Test : dumprestore10.js ...
      Command : D:\10gen\buildslaves\mongo\Windows_64bit_Nightly\mongo\mongo.exe --port 27999 --nodb D:\10gen\buildslaves\mongo\Windows_64bit_Nightly\mongo\jstests\tool\dumprestore10.js --eval TestData = new Object();TestData.testPath = "D:\\10gen\\buildslaves\\mongo\\Windows_64bit_Nightly\\mongo\\jstests\\tool\\dumprestore10.js";TestData.testFile = "dumprestore10.js";TestData.testName = "dumprestore10";TestData.noJournal = false;TestData.noJournalPrealloc = false;TestData.auth = false;TestData.keyFile = null;TestData.keyFileData = null;
         Date : Fri Jun  1 11:48:42 2012
MongoDB shell version: 2.1.2-pre-
null
 
dumprestore10.js step 1 
ReplSetTest Starting Set
ReplSetTest n is : 0
ReplSetTest n: 0 ports: [ 31000, 31001 ]	31000 number
{
	"useHostName" : undefined,
	"oplogSize" : 40,
	"keyFile" : undefined,
	"port" : 31000,
	"noprealloc" : "",
	"smallfiles" : "",
	"rest" : "",
	"replSet" : "dumprestore10",
	"dbpath" : "$set-$node",
	"restart" : undefined,
	"pathOpts" : {
		"node" : 0,
		"set" : "dumprestore10"
	}
}
ReplSetTest Starting....
Resetting db path '/data/db/dumprestore10-0'
Fri Jun 01 11:48:42 shell: started program mongod.exe --oplogSize 40 --port 31000 --noprealloc --smallfiles --rest --replSet dumprestore10 --dbpath /data/db/dumprestore10-0
 m31000| note: noprealloc may hurt performance in many applications
 m31000| Fri Jun 01 11:48:42 [initandlisten] MongoDB starting : pid=3544 port=31000 dbpath=/data/db/dumprestore10-0 64-bit host=ip-0A08A111
 m31000| Fri Jun 01 11:48:42 [initandlisten] 
 m31000| Fri Jun 01 11:48:42 [initandlisten] ** NOTE: This is a development version (2.1.2-pre-) of MongoDB.
 m31000| Fri Jun 01 11:48:42 [initandlisten] **       Not recommended for production.
 m31000| Fri Jun 01 11:48:42 [initandlisten] 
 m31000| Fri Jun 01 11:48:42 [initandlisten] db version v2.1.2-pre-, pdfile version 4.5
 m31000| Fri Jun 01 11:48:42 [initandlisten] git version: 32c4381ffa76c651c825bae3249f566493d12c8f
 m31000| Fri Jun 01 11:48:42 [initandlisten] build info: windows sys.getwindowsversion(major=6, minor=1, build=7601, platform=2, service_pack='Service Pack 1') BOOST_LIB_VERSION=1_49
 m31000| Fri Jun 01 11:48:42 [initandlisten] options: { dbpath: "/data/db/dumprestore10-0", noprealloc: true, oplogSize: 40, port: 31000, replSet: "dumprestore10", rest: true, smallfiles: true }
 m31000| Fri Jun 01 11:48:42 [initandlisten] journal dir=/data/db/dumprestore10-0/journal
 m31000| Fri Jun 01 11:48:42 [initandlisten] recover : no journal files present, no recovery needed
 m31000| Fri Jun 01 11:48:42 [initandlisten] waiting for connections on port 31000
//...snip...
 m31000| Fri Jun 01 12:17:50 [journal] MapViewOfFileEx for /data/db/dumprestore10-0/foo.0 failed with error errno:487 Attempt to access invalid address. (file size is 16777216) in MemoryMappedFile::remapPrivateView
 m31000| Fri Jun 01 12:17:50 [journal]   Fatal Assertion 16148
 m31000| Fri Jun 01 12:17:50 [journal] 
 m31000| 
 m31000| ***aborting after fassert() failure
//... snip...
test D:\10gen\buildslaves\mongo\Windows_64bit_Nightly\mongo\jstests\tool\dumprestore10.js exited with status -3
7 tests succeeded
19 tests didn't get run
The following tests failed (with exit code):
D:\10gen\buildslaves\mongo\Windows_64bit_Nightly\mongo\jstests\tool\dumprestore10.js	-3
scons: building terminated because of errors.

Comment by Ian Whalen (Inactive) [ 28/May/12 ]

Possible latest manifestation of MapViewOfFileEx problems on Windows (not specifically related to import):

 m30200| Sun May 27 16:41:30 [conn1] JS Error: Error: an emit can't be more than half max bson size (anon):1
 m30200| Sun May 27 16:41:30 [conn1] mr failed, removing collection :: caused by :: 9014 map invoke failed: JS Error: Error: an emit can't be more than half max bson size (anon):1
 m30200| Sun May 27 16:41:30 [conn1] CMD: drop test.tmp.mr.mr_bigobject_32
 m30200| Sun May 27 16:41:30 [conn1] CMD: drop test.tmp.mr.mr_bigobject_32_inc
 m30200| Sun May 27 16:41:30 [conn1] command test.$cmd command: { mapreduce: "mr_bigobject", map: function () {
 m30200|     emit(1, this.s + this.s);
 m30200| }, reduce: function (k, v) {
 m30200|     return 1;
 m30200| }, out: "mr_bigobject_out" } ntoreturn:1 keyUpdates:0 reslen:161 253ms
 m30200| Sun May 27 16:41:30 [conn1] CMD: drop test.tmp.mr.mr_bigobject_33_inc
 m30200| Sun May 27 16:41:30 [conn1] build index test.tmp.mr.mr_bigobject_33_inc { 0: 1 }
 m30200| Sun May 27 16:41:30 [conn1] build index done.  scanned 0 total records. 0 secs
 m30200| Sun May 27 16:41:30 [conn1] CMD: drop test.tmp.mr.mr_bigobject_33
 m30200| Sun May 27 16:41:30 [conn1] build index test.tmp.mr.mr_bigobject_33 { _id: 1 }
 m30200| Sun May 27 16:41:30 [conn1] build index done.  scanned 0 total records. 0 secs
 m30200| Sun May 27 16:41:31 [journal] MapViewOfFileEx for /data/db/dur_passthrough/test.1 failed with error errno:487 Attempt to access invalid address. (file size is 268435456) in MemoryMappedFile::remapPrivateView
 m30200| Sun May 27 16:41:31 [journal]   Fatal Assertion 16148
 m30200| Sun May 27 16:41:31 [journal] 
 m30200| 
 m30200| ***aborting after fassert() failure
 m30200| 
 m30200| 
Sun May 27 16:41:32 Socket recv() errno:10054 An existing connection was forcibly closed by the remote host. 127.0.0.1:30200
Sun May 27 16:41:32 SocketException: remote: 127.0.0.1:30200 error: 9001 socket exception [1] server [127.0.0.1:30200] 
Sun May 27 16:41:32 DBClientCursor::init call() failed
Sun May 27 16:41:32 query failed : test.$cmd { mapreduce: "mr_bigobject", map: function () {
    emit(1, this.s);
}, reduce: function (k, v) {
    return 1;
}, out: "mr_bigobject_out" } to: 127.0.0.1:30200
Error: error doing query: failed
 

http://buildbot.mongodb.org/builders/Windows%2064-bit%20Weekly%20Slow%20Tests/builds/11/steps/test/logs/stdio

Comment by Tad Marshall [ 25/May/12 ]

Latest crash:

http://buildbot.mongodb.org/builders/Windows%2064-bit/builds/4729/steps/test_6/logs/stdio

Thu May 24 19:53:26 [initandlisten] MongoDB starting : pid=5308 port=27999 dbpath=/data/db/sconsTests 64-bit host=ip-0A08A111
Thu May 24 19:53:26 [initandlisten] 
Thu May 24 19:53:26 [initandlisten] ** NOTE: This is a development version (2.1.2-pre-) of MongoDB.
Thu May 24 19:53:26 [initandlisten] **       Not recommended for production.
Thu May 24 19:53:26 [initandlisten] 
Thu May 24 19:53:26 [initandlisten] db version v2.1.2-pre-, pdfile version 4.5
Thu May 24 19:53:26 [initandlisten] git version: 582dda88547aa91626046230bbacd9e183359461
Thu May 24 19:53:26 [initandlisten] build info: windows sys.getwindowsversion(major=6, minor=1, build=7601, platform=2, service_pack='Service Pack 1') BOOST_LIB_VERSION=1_49
Thu May 24 19:53:26 [initandlisten] options: { dbpath: "/data/db/sconsTests/", port: 27999 }
Thu May 24 19:53:26 [initandlisten] journal dir=/data/db/sconsTests/journal
Thu May 24 19:53:26 [initandlisten] recover : no journal files present, no recovery needed
Thu May 24 19:53:26 [websvr] admin web console waiting for connections on port 28999
Thu May 24 19:53:26 [initandlisten] waiting for connections on port 27999
// ... snip ...
         Test : mr_bigobject.js ...
      Command : D:\10gen\buildslaves\mongo\Windows_64bit\mongo\mongo.exe --port 27999 D:\10gen\buildslaves\mongo\Windows_64bit\mongo\jstests\mr_bigobject.js --eval TestData = new Object();TestData.testPath = "D:\\10gen\\buildslaves\\mongo\\Windows_64bit\\mongo\\jstests\\mr_bigobject.js";TestData.testFile = "mr_bigobject.js";TestData.testName = "mr_bigobject";TestData.noJournal = false;TestData.noJournalPrealloc = false;TestData.auth = false;TestData.keyFile = null;TestData.keyFileData = null;
         Date : Thu May 24 20:05:07 2012
Thu May 24 20:05:07 [conn21] end connection 127.0.0.1:59856 (0 connections now open)
MongoDB shell version: 2.1.2-pre-
connecting to: 127.0.0.1:27999/test
Thu May 24 20:05:07 [initandlisten] connection accepted from 127.0.0.1:59857 #22 (1 connection now open)
null
Thu May 24 20:05:07 [conn22] CMD: drop test.mr_bigobject
Thu May 24 20:05:21 [conn22] build index test.mr_bigobject { _id: 1 }
Thu May 24 20:05:21 [conn22] build index done.  scanned 0 total records. 0 secs
Thu May 24 20:05:22 [conn22] CMD: drop test.tmp.mr.mr_bigobject_31_inc
Thu May 24 20:05:22 [conn22] build index test.tmp.mr.mr_bigobject_31_inc { 0: 1 }
Thu May 24 20:05:22 [conn22] build index done.  scanned 0 total records. 0 secs
Thu May 24 20:05:22 [conn22] CMD: drop test.tmp.mr.mr_bigobject_31
Thu May 24 20:05:22 [conn22] build index test.tmp.mr.mr_bigobject_31 { _id: 1 }
Thu May 24 20:05:22 [conn22] build index done.  scanned 0 total records. 0 secs
Thu May 24 20:05:22 [conn22] JS Error: Error: an emit can't be more than half max bson size (anon):1
Thu May 24 20:05:22 [conn22] mr failed, removing collection :: caused by :: 9014 map invoke failed: JS Error: Error: an emit can't be more than half max bson size (anon):1
Thu May 24 20:05:22 [journal] info journal _rotate called inside dbMutex - ok but should be somewhat rare
Thu May 24 20:05:22 [conn22] CMD: drop test.tmp.mr.mr_bigobject_31
Thu May 24 20:05:22 [conn22] CMD: drop test.tmp.mr.mr_bigobject_31_inc
Thu May 24 20:05:22 [conn22] command test.$cmd command: { mapreduce: "mr_bigobject", map: function () {
    emit(1, this.s + this.s);
}, reduce: function (k, v) {
    return 1;
}, out: "mr_bigobject_out" } ntoreturn:1 keyUpdates:0 reslen:161 296ms
Thu May 24 20:05:22 [conn22] CMD: drop test.tmp.mr.mr_bigobject_32_inc
Thu May 24 20:05:22 [conn22] build index test.tmp.mr.mr_bigobject_32_inc { 0: 1 }
Thu May 24 20:05:22 [conn22] build index done.  scanned 0 total records. 0 secs
Thu May 24 20:05:22 [conn22] CMD: drop test.tmp.mr.mr_bigobject_32
Thu May 24 20:05:22 [conn22] build index test.tmp.mr.mr_bigobject_32 { _id: 1 }
Thu May 24 20:05:22 [conn22] build index done.  scanned 0 total records. 0 secs
Thu May 24 20:05:23 [journal] MapViewOfFileEx for /data/db/sconsTests/test.1 failed with error errno:487 Attempt to access invalid address. (file size is 134217728) in MemoryMappedFile::remapPrivateView
Thu May 24 20:05:23 [journal]   Fatal Assertion 16148
Thu May 24 20:05:23 [journal] 
 
***aborting after fassert() failure
 
 
Thu May 24 20:05:25 Socket recv() errno:10054 An existing connection was forcibly closed by the remote host. 127.0.0.1:27999
Thu May 24 20:05:25 SocketException: remote: 127.0.0.1:27999 error: 9001 socket exception [1] server [127.0.0.1:27999] 
Thu May 24 20:05:25 DBClientCursor::init call() failed
Thu May 24 20:05:25 query failed : test.$cmd { mapreduce: "mr_bigobject", map: function () {
    emit(1, this.s);
}, reduce: function (k, v) {
    return 1;
}, out: "mr_bigobject_out" } to: 127.0.0.1:27999
Thu May 24 20:05:25 Error: error doing query: failed src/mongo/shell/collection.js:155
failed to load: D:\10gen\buildslaves\mongo\Windows_64bit\mongo\jstests\mr_bigobject.js
                18124.000072ms
test D:\10gen\buildslaves\mongo\Windows_64bit\mongo\jstests\mr_bigobject.js exited with status -3

Comment by Tad Marshall [ 25/May/12 ]

Still happening.

Comment by auto [ 19/May/12 ]

Author:

{u'login': u'tadmarshall', u'name': u'Tad Marshall', u'email': u'tad@10gen.com'}

Message: SERVER-5867 Use CRITICAL_SECTION for non-SRW Windows locks

Also related to SERVER-2942. The non-2008plus Windows builds was using
the Microsoft Concurrency Runtime for the SimpleMutex class, but this
interacts badly with the Windows memory hooking code added for SERVER-2942.
A call to VirtualAlloc during thread creation would take out the SimpleMutex
for RemapLock, which (through the Concurrency Runtime) make another call
into VirtualAlloc, deadlocking the thread. I changed SimpleMutex to use
the Win32 CRITICAL_SECTION APIs, which do not allocate memory when taking
the critical section.
Branch: master
https://github.com/mongodb/mongo/commit/686b2b375dfe233032e2eb80e8c9a0175db53ca2

Comment by auto [ 19/May/12 ]

Author:

{u'login': u'tadmarshall', u'name': u'Tad Marshall', u'email': u'tad@10gen.com'}

Message: SERVER-5867 Use CRITICAL_SECTION for non-SRW Windows locks

Also related to SERVER-2942. The non-2008plus Windows builds was using
the Microsoft Concurrency Runtime for the SimpleMutex class, but this
interacts badly with the Windows memory hooking code added for SERVER-2942.
A call to VirtualAlloc during thread creation would take out the SimpleMutex
for RemapLock, which (through the Concurrency Runtime) make another call
into VirtualAlloc, deadlocking the thread. I changed SimpleMutex to use
the Win32 CRITICAL_SECTION APIs, which do not allocate memory when taking
the critical section.
Branch: master
https://github.com/mongodb/mongo/commit/686b2b375dfe233032e2eb80e8c9a0175db53ca2

Comment by Tad Marshall [ 18/May/12 ]

Resolving this as "fixed" (again) pending more testing. We now lock out calls to RtlCreateHeap() and NtAllocateVirtualMemory() while we remap the private view; hopefully, this will prevent virtual memory allocations from happening while the virtual address range used by the private view is vulnerable. Will reopen if the MapViewOfFileEx() failure happens again.

Comment by auto [ 17/May/12 ]

Author:

{u'login': u'tadmarshall', u'name': u'Tad Marshall', u'email': u'tad@10gen.com'}

Message: SERVER-2942 Hook functions in the Windows Import Address Table

In the Windows version of mongod and test, hook the RtlCreateHeap and
NtAllocateVirtualMemory functions in ntdll.dll through the Import
Address Tables of kernel32.dll and/or kernelbase.dll. Use RemapLock
to block memory allocations during the period when remapPrivateView()
has the private view unmapped so that other threads can't grab any of
the address space required for the private view.
Branch: master
https://github.com/mongodb/mongo/commit/39206090720e914a5810ae02ae6f7c1212b59ca5

Comment by Tad Marshall [ 12/May/12 ]

http://buildbot.mongodb.org/builders/Windows%2064-bit/builds/4698/steps/test_10/logs/stdio

 *******************************************
         Test : dbcase.js ...
      Command : D:\10gen\buildslaves\mongo\Windows_64bit\mongo\mongo.exe --port 27999 --nodb D:\10gen\buildslaves\mongo\Windows_64bit\mongo\jstests\repl\dbcase.js --eval TestData = new Object();TestData.testPath = "D:\\10gen\\buildslaves\\mongo\\Windows_64bit\\mongo\\jstests\\repl\\dbcase.js";TestData.testFile = "dbcase.js";TestData.testName = "dbcase";TestData.noJournal = false;TestData.noJournalPrealloc = false;TestData.auth = false;TestData.keyFile = null;TestData.keyFileData = null;
         Date : Sat May 12 05:27:09 2012
Sat May 12 05:27:09 [conn4] end connection 127.0.0.1:57875 (0 connections now open)
MongoDB shell version: 2.1.2-pre-
null
startMongod WARNING DELETES DATA DIRECTORY THIS IS FOR TESTING ONLY
Sat May 12 05:27:09 shell: started program mongod.exe --nohttpinterface --noprealloc --bind_ip 127.0.0.1 --smallfiles --port 31000 --dbpath /data/db/jstests_repl_dbcase-master --master --oplogSize 40
 m31000| db level locking enabled: 1
 m31000| note: noprealloc may hurt performance in many applications
 m31000| Sat May 12 05:27:09 [initandlisten] MongoDB starting : pid=5224 port=31000 dbpath=/data/db/jstests_repl_dbcase-master master=1 64-bit host=ip-0A08A111
 m31000| Sat May 12 05:27:09 [initandlisten] 
 m31000| Sat May 12 05:27:09 [initandlisten] ** NOTE: This is a development version (2.1.2-pre-) of MongoDB.
 m31000| Sat May 12 05:27:09 [initandlisten] **       Not recommended for production.
 m31000| Sat May 12 05:27:09 [initandlisten] 
 m31000| Sat May 12 05:27:09 [initandlisten] db version v2.1.2-pre-, pdfile version 4.5
 m31000| Sat May 12 05:27:09 [initandlisten] git version: 80ff8f73f9c0da7fb0636a48c3c2562eed01aba4
 m31000| Sat May 12 05:27:09 [initandlisten] build info: windows sys.getwindowsversion(major=6, minor=1, build=7601, platform=2, service_pack='Service Pack 1') BOOST_LIB_VERSION=1_49
 m31000| Sat May 12 05:27:09 [initandlisten] options: { bind_ip: "127.0.0.1", dbpath: "/data/db/jstests_repl_dbcase-master", master: true, nohttpinterface: true, noprealloc: true, oplogSize: 40, port: 31000, smallfiles: true }
 m31000| Sat May 12 05:27:09 [initandlisten] journal dir=/data/db/jstests_repl_dbcase-master/journal
 m31000| Sat May 12 05:27:09 [initandlisten] recover : no journal files present, no recovery needed
 m31000| Sat May 12 05:27:10 [initandlisten] ******
 m31000| Sat May 12 05:27:10 [initandlisten] creating replication oplog of size: 40MB...
 m31000| Sat May 12 05:27:10 [initandlisten] ******
 m31000| Sat May 12 05:27:10 [initandlisten] waiting for connections on port 31000
 m31000| Sat May 12 05:27:10 [initandlisten] connection accepted from 127.0.0.1:57876 #1 (1 connection now open)
// ... snip
 m31000| Sat May 12 05:27:13 [initandlisten] connection accepted from 127.0.0.1:57881 #5 (3 connections now open)
 m31000| Sat May 12 05:27:13 [conn5] warning database /data/db/jstests_repl_dbcase-master dbname could not be opened
 m31000| Sat May 12 05:27:13 [conn5] DBException 13297: db already exists with different case other: [dbNAme] me [dbname]
 m31000| Sat May 12 05:27:13 [conn5] assertion 13297 db already exists with different case other: [dbNAme] me [dbname] ns:dbname.system.namespaces query:{}
 m31000| Sat May 12 05:27:13 [conn5] { $err: "db already exists with different case other: [dbNAme] me [dbname]", code: 13297 }
 m31001| Sat May 12 05:27:13 [replslave] resync: dropping database dbname
 m31000| Sat May 12 05:27:13 [conn5] end connection 127.0.0.1:57881 (2 connections now open)
 m31001| Sat May 12 05:27:13 [replslave] removeJournalFiles
 m31001| Sat May 12 05:27:13 [replslave] resync: database dbname not valid on the master due to a name conflict, dropping.
 m31001| Sat May 12 05:27:13 [replslave] repl:   applied 1 operations
 m31001| Sat May 12 05:27:13 [replslave] repl:  end sync_pullOpLog syncedTo: May 12 05:27:13 4fae2cf1:3
 m31001| Sat May 12 05:27:13 [replslave] repl: syncing from host:127.0.0.1:31000
 m31001| Sat May 12 05:27:13 [replslave] An earlier initial clone of 'dbNAme' did not complete, now resyncing.
 m31001| Sat May 12 05:27:13 [replslave] resync: dropping database dbNAme
 m31001| Sat May 12 05:27:13 [replslave] removeJournalFiles
 m31001| Sat May 12 05:27:13 [replslave] resync: cloning database dbNAme to get an initial copy
 m31001| Sat May 12 05:27:13 [journal] MapViewOfFileEx for /data/db/jstests_repl_dbcase-slave/local.0 failed with error errno:487 Attempt to access invalid address. (file size is 16777216) in MemoryMappedFile::remapPrivateView
 m31001| Sat May 12 05:27:13 [journal]   Fatal Assertion 16148
 m31001| Sat May 12 05:27:13 [journal] 
 m31001| 
 m31001| ***aborting after fassert() failure
// ... snip
failed to load: D:\10gen\buildslaves\mongo\Windows_64bit\mongo\jstests\repl\dbcase.js
 m31000| Sat May 12 05:27:44 [initandlisten] connection accepted from 127.0.0.1:57899 #7 (2 connections now open)
 m31000| Sat May 12 05:27:44 [conn7] terminating, shutdown command received
 m31000| Sat May 12 05:27:44 dbexit: shutdown called
 m31000| Sat May 12 05:27:44 [conn7] shutdown: going to close listening sockets...
 m31000| Sat May 12 05:27:44 [conn7] closing listening socket: 360
 m31000| Sat May 12 05:27:44 [conn7] shutdown: going to flush diaglog...
 m31000| Sat May 12 05:27:44 [conn7] shutdown: going to close sockets...
 m31000| Sat May 12 05:27:44 [conn7] shutdown: waiting for fs preallocator...
 m31000| Sat May 12 05:27:44 [conn7] shutdown: lock for final commit...
 m31000| Sat May 12 05:27:44 [conn7] shutdown: final commit...
Sat May 12 05:27:44 DBClientCursor::init call() failed
 m31000| Sat May 12 05:27:44 [conn1] end connection 127.0.0.1:57876 (1 connection now open)
 m31000| Sat May 12 05:27:44 [conn7] info journal _rotate called inside dbMutex - ok but should be somewhat rare
 m31000| Sat May 12 05:27:44 [conn7] shutdown: closing all files...
 m31000| Sat May 12 05:27:44 [conn7] closeAllFiles() finished
 m31000| Sat May 12 05:27:44 [conn7] journalCleanup...
 m31000| Sat May 12 05:27:44 [conn7] removeJournalFiles
 m31000| Sat May 12 05:27:44 [conn7] shutdown: removing fs lock...
 m31000| Sat May 12 05:27:44 dbexit: really exiting now
                36368.000031ms
test D:\10gen\buildslaves\mongo\Windows_64bit\mongo\jstests\repl\dbcase.js exited with status -3
3 tests succeeded
28 tests didn't get run
The following tests failed (with exit code):
D:\10gen\buildslaves\mongo\Windows_64bit\mongo\jstests\repl\dbcase.js	-3
scons: building terminated because of errors.

Comment by Tad Marshall [ 08/May/12 ]

Buildbot hit it again ...

http://buildbot.mongodb.org/builders/Nightly%20Windows%2064-bit%202008%2B/builds/140/steps/test_1/logs/stdio

Tue May 08 05:09:02 [initandlisten] MongoDB starting : pid=2348 port=27999 dbpath=/data/db/sconsTests 64-bit host=ip-0A08A111
Tue May 08 05:09:02 [initandlisten] 
Tue May 08 05:09:02 [initandlisten] ** NOTE: This is a development version (2.1.2-pre-) of MongoDB.
Tue May 08 05:09:02 [initandlisten] **       Not recommended for production.
Tue May 08 05:09:02 [initandlisten] 
Tue May 08 05:09:02 [initandlisten] db version v2.1.2-pre-, pdfile version 4.5
Tue May 08 05:09:02 [initandlisten] git version: 33730732395c1e6eeb9eb3468593fd69ee3e3f93
Tue May 08 05:09:02 [initandlisten] build info: windows sys.getwindowsversion(major=6, minor=1, build=7601, platform=2, service_pack='Service Pack 1') BOOST_LIB_VERSION=1_49
Tue May 08 05:09:02 [initandlisten] options: { dbpath: "/data/db/sconsTests/", port: 27999 }
Tue May 08 05:09:02 [initandlisten] journal dir=/data/db/sconsTests/journal
Tue May 08 05:09:02 [initandlisten] recover : no journal files present, no recovery needed
db level locking enabled: 1
... // snip
 *******************************************
         Test : mr_bigobject.js ...
      Command : D:\10gen\buildslaves\mongo\Windows_64bit_2008+_Nightly\mongo\mongo.exe --port 27999 D:\10gen\buildslaves\mongo\Windows_64bit_2008+_Nightly\mongo\jstests\mr_bigobject.js --eval TestData = new Object();TestData.testPath = "D:\\10gen\\buildslaves\\mongo\\Windows_64bit_2008+_Nightly\\mongo\\jstests\\mr_bigobject.js";TestData.testFile = "mr_bigobject.js";TestData.testName = "mr_bigobject";TestData.noJournal = false;TestData.noJournalPrealloc = false;TestData.auth = false;TestData.keyFile = null;TestData.keyFileData = null;
         Date : Tue May  8 05:25:16 2012
Tue May 08 05:25:16 [conn17] end connection 127.0.0.1:50968 (0 connections now open)
MongoDB shell version: 2.1.2-pre-
connecting to: 127.0.0.1:27999/test
Tue May 08 05:25:16 [initandlisten] connection accepted from 127.0.0.1:50969 #18 (1 connection now open)
null
Tue May 08 05:25:16 [conn18] CMD: drop test.mr_bigobject
Tue May 08 05:25:30 [conn18] build index test.mr_bigobject { _id: 1 }
Tue May 08 05:25:30 [conn18] build index done.  scanned 0 total records. 0 secs
Tue May 08 05:25:31 [conn18] CMD: drop test.tmp.mr.mr_bigobject_31_inc
Tue May 08 05:25:31 [conn18] build index test.tmp.mr.mr_bigobject_31_inc { 0: 1 }
Tue May 08 05:25:31 [conn18] build index done.  scanned 0 total records. 0 secs
Tue May 08 05:25:31 [conn18] CMD: drop test.tmp.mr.mr_bigobject_31
Tue May 08 05:25:31 [conn18] build index test.tmp.mr.mr_bigobject_31 { _id: 1 }
Tue May 08 05:25:31 [conn18] build index done.  scanned 0 total records. 0 secs
Tue May 08 05:25:31 [conn18] JS Error: Error: an emit can't be more than half max bson size (anon):1
Tue May 08 05:25:31 [conn18] mr failed, removing collection :: caused by :: 9014 map invoke failed: JS Error: Error: an emit can't be more than half max bson size (anon):1
Tue May 08 05:25:31 [conn18] CMD: drop test.tmp.mr.mr_bigobject_31
Tue May 08 05:25:31 [conn18] CMD: drop test.tmp.mr.mr_bigobject_31_inc
Tue May 08 05:25:31 [conn18] command test.$cmd command: { mapreduce: "mr_bigobject", map: function () {
    emit(1, this.s + this.s);
}, reduce: function (k, v) {
    return 1;
}, out: "mr_bigobject_out" } ntoreturn:1 keyUpdates:0 reslen:161 277ms
Tue May 08 05:25:31 [conn18] CMD: drop test.tmp.mr.mr_bigobject_32_inc
Tue May 08 05:25:31 [conn18] build index test.tmp.mr.mr_bigobject_32_inc { 0: 1 }
Tue May 08 05:25:31 [conn18] build index done.  scanned 0 total records. 0 secs
Tue May 08 05:25:31 [conn18] CMD: drop test.tmp.mr.mr_bigobject_32
Tue May 08 05:25:31 [conn18] build index test.tmp.mr.mr_bigobject_32 { _id: 1 }
Tue May 08 05:25:31 [conn18] build index done.  scanned 0 total records. 0 secs
Tue May 08 05:25:32 [journal] MapViewOfFileEx for /data/db/sconsTests/test.1 failed with error errno:487 Attempt to access invalid address. (file size is 134217728) in MemoryMappedFile::remapPrivateView
Tue May 08 05:25:32 [journal]   Fatal Assertion 16148
Tue May 08 05:25:32 [journal] 
 
***aborting after fassert() failure
 
 
Tue May 08 05:25:33 Socket recv() errno:10054 An existing connection was forcibly closed by the remote host. 127.0.0.1:27999
Tue May 08 05:25:33 SocketException: remote: 127.0.0.1:27999 error: 9001 socket exception [1] server [127.0.0.1:27999] 
Tue May 08 05:25:33 DBClientCursor::init call() failed
Tue May 08 05:25:33 query failed : test.$cmd { mapreduce: "mr_bigobject", map: function () {
    emit(1, this.s);
}, reduce: function (k, v) {
    return 1;
}, out: "mr_bigobject_out" } to: 127.0.0.1:27999
Tue May 08 05:25:33 Error: error doing query: failed src/mongo/shell/collection.js:155
failed to load: D:\10gen\buildslaves\mongo\Windows_64bit_2008+_Nightly\mongo\jstests\mr_bigobject.js
                16787.000179ms
test D:\10gen\buildslaves\mongo\Windows_64bit_2008+_Nightly\mongo\jstests\mr_bigobject.js exited with status -3
348 tests succeeded
222 tests didn't get run
The following tests failed (with exit code):
D:\10gen\buildslaves\mongo\Windows_64bit_2008+_Nightly\mongo\jstests\mr_bigobject.js	-3
scons: building terminated because of errors.

Comment by Tad Marshall [ 08/May/12 ]

Resolved too soon, hit it again.

Comment by Tad Marshall [ 07/May/12 ]

It appears that Eric fixed this issue with his commit for SERVER-5680 ... https://github.com/mongodb/mongo/commit/8b099a96ac9b5a1f73954204806ef8e0c77d481a .

We were getting MapViewOfFileEx() failures pretty much every day in Buildbot prior to this commit, and we have not had a single MapViewOfFileEx() failure since the commit. We have a code change we could make, but it seems more prudent to not make additional changes if the problem is already fixed.

I'll mark this resolved for now since it seems to be ... we can reopen it if the problem shows up again.

Comment by Tad Marshall [ 07/May/12 ]

Access violations on Windows were traced to the time window while the data file is unmapped in remapPrivateView, so these issues are related.

Comment by Tad Marshall [ 28/Apr/12 ]

Another ...

http://buildbot.mongodb.org/builders/Windows%2064-bit/builds/4667/steps/test_2/logs/stdio

 *******************************************
         Test : repair5.js ...
      Command : D:\10gen\buildslaves\mongo\Windows_64bit\mongo\mongo.exe --port 27999 D:\10gen\buildslaves\mongo\Windows_64bit\mongo\jstests\disk\repair5.js --eval TestData = new Object();TestData.testPath = "D:\\10gen\\buildslaves\\mongo\\Windows_64bit\\mongo\\jstests\\disk\\repair5.js";TestData.testFile = "repair5.js";TestData.testName = "repair5";TestData.noJournal = false;TestData.noJournalPrealloc = false;TestData.auth = false;TestData.keyFile = null;TestData.keyFileData = null;
         Date : Fri Apr 27 15:51:50 2012
Fri Apr 27 15:51:50 [conn35] end connection 127.0.0.1:53344 (0 connections now open)
MongoDB shell version: 2.1.1-pre-
connecting to: 127.0.0.1:27999/test
Fri Apr 27 15:51:50 [initandlisten] connection accepted from 127.0.0.1:53345 #36 (1 connection now open)
null
Fri Apr 27 15:51:50 shell: started program mongod.exe --port 31000 --dbpath /data/db/jstests_disk_repair5/ --repairpath /data/db/jstests_disk_repair5/repairDir/ --nohttpinterface --bind_ip 127.0.0.1
 m31000| db level locking enabled: 1
 m31000| Fri Apr 27 15:51:50 [initandlisten] MongoDB starting : pid=2024 port=31000 dbpath=/data/db/jstests_disk_repair5 64-bit host=ip-0A08A111
 m31000| Fri Apr 27 15:51:50 [initandlisten] 
 m31000| Fri Apr 27 15:51:50 [initandlisten] ** NOTE: This is a development version (2.1.1-pre-) of MongoDB.
 m31000| Fri Apr 27 15:51:50 [initandlisten] **       Not recommended for production.
 m31000| Fri Apr 27 15:51:50 [initandlisten] 
 m31000| Fri Apr 27 15:51:50 [initandlisten] db version v2.1.1-pre-, pdfile version 4.5
 m31000| Fri Apr 27 15:51:50 [initandlisten] git version: 558966d2b05283edb37622aeddc3712b8234b5ae
 m31000| Fri Apr 27 15:51:50 [initandlisten] build info: windows sys.getwindowsversion(major=6, minor=1, build=7601, platform=2, service_pack='Service Pack 1') BOOST_LIB_VERSION=1_49
 m31000| Fri Apr 27 15:51:50 [initandlisten] options: { bind_ip: "127.0.0.1", dbpath: "/data/db/jstests_disk_repair5/", nohttpinterface: true, port: 31000, repairpath: "/data/db/jstests_disk_repair5/repairDir/" }
 m31000| Fri Apr 27 15:51:50 [initandlisten] journal dir=/data/db/jstests_disk_repair5/journal
 m31000| Fri Apr 27 15:51:50 [initandlisten] recover : no journal files present, no recovery needed
 m31000| Fri Apr 27 15:51:50 [initandlisten] waiting for connections on port 31000
 m31000| Fri Apr 27 15:51:51 [initandlisten] connection accepted from 127.0.0.1:53346 #1 (1 connection now open)
 m31000| Fri Apr 27 15:51:51 [conn1] build index jstests_disk_repair5.jstests_disk_repair5 { _id: 1 }
 m31000| Fri Apr 27 15:51:51 [conn1] build index done.  scanned 0 total records. 0 secs
 m31000| Fri Apr 27 15:51:51 [journal] info journal _rotate called inside dbMutex - ok but should be somewhat rare
Fri Apr 27 15:51:54 shell: started program mongo.exe --eval TestData = {
	"testPath" : "D:\\10gen\\buildslaves\\mongo\\Windows_64bit\\mongo\\jstests\\disk\\repair5.js",
	"testFile" : "repair5.js",
	"testName" : "repair5",
	"noJournal" : false,
	"noJournalPrealloc" : false,
	"auth" : false,
	"keyFile" : null,
	"keyFileData" : null
};jsTest.authenticate(db.getMongo());function killRepair() {
    while (1) {
        p = db.currentOp().inprog;
        for (var i in p) {
            var o = p[i];
            printjson(o);
            if (o.active && o.query.repairDatabase) {
                db.killOp(o.opid);
                return;
            }
        }
    }
}; killRepair(); 127.0.0.1:31000
sh2344| MongoDB shell version: 2.1.1-pre-
sh2344| connecting to: 127.0.0.1:31000/test
 m31000| Fri Apr 27 15:51:54 [initandlisten] connection accepted from 127.0.0.1:53347 #2 (2 connections now open)
 m31000| Fri Apr 27 15:51:54 [conn1] repairDatabase jstests_disk_repair5
 m31000| Fri Apr 27 15:51:54 [conn1]  jstests_disk_repair5 repairDatabase jstests_disk_repair5
sh2344| {
sh2344| 	"opid" : 20001,
sh2344| 	"active" : true,
sh2344| 	"lockType" : "W",
sh2344| 	"waitingForLock" : false,
sh2344| 	"secs_running" : 0,
sh2344| 	"op" : "query",
sh2344| 	"ns" : "jstests_disk_repair5",
sh2344| 	"query" : {
sh2344| 		"repairDatabase" : 1,
sh2344| 		"backupOriginalFiles" : true
sh2344| 	},
sh2344| 	"client" : "127.0.0.1:53346",
sh2344| 	"desc" : "conn",
sh2344| 	"connectionId" : 1,
sh2344| 	"locks" : {
sh2344| 		"." : "W"
sh2344| 	},
sh2344| 	"numYields" : 0
sh2344| }
 m31000| Fri Apr 27 15:51:54 [conn2] going to kill op: op: 20001.0
 m31000| Fri Apr 27 15:51:54 [conn2] end connection 127.0.0.1:53347 (1 connection now open)
 m31000| Fri Apr 27 15:51:58 [conn1] removeJournalFiles
 m31000| Fri Apr 27 15:51:58 [conn1] command jstests_disk_repair5.$cmd command: { repairDatabase: 1.0, backupOriginalFiles: true } ntoreturn:1 keyUpdates:0 reslen:96 3091ms
 m31000| Fri Apr 27 15:51:58 [conn1] *** unhandled exception (access violation) at 0x000000013FC23C2C, terminating
 m31000| Fri Apr 27 15:51:58 [conn1] *** access violation was a read from 0x0000000018C61000
 m31000| Fri Apr 27 15:51:58 [conn1] writing minidump dignostic file 00000001401024A8
 m31000| Fri Apr 27 15:51:58 [journal] MapViewOfFileEx for /data/db/jstests_disk_repair5/jstests_disk_repair5.1 failed with error errno:487 Attempt to access invalid address. (file size is 134217728) in MemoryMappedFile::remapPrivateView
 m31000| Fri Apr 27 15:51:58 [journal]   Fatal Assertion 16148
 m31000| Fri Apr 27 15:51:58 [journal] 
 m31000| 
 m31000| ***aborting after fassert() failure
 m31000| 
 m31000| 
 m31000| Fri Apr 27 15:51:58 dbexit: unhandled exception
 m31000| Fri Apr 27 15:51:58 [conn1] shutdown: going to close listening sockets...
 m31000| Fri Apr 27 15:51:58 [conn1] closing listening socket: 340
 m31000| Fri Apr 27 15:51:58 [conn1] shutdown: going to flush diaglog...
 m31000| Fri Apr 27 15:51:58 [conn1] shutdown: going to close sockets...
 m31000| Fri Apr 27 15:51:58 [conn1] shutdown: waiting for fs preallocator...
 m31000| Fri Apr 27 15:51:58 [conn1] shutdown: lock for final commit...
Fri Apr 27 15:51:58 Error: dbclient error communicating with server: 127.0.0.1:31000 src/mongo/shell/query.js:113
failed to load: D:\10gen\buildslaves\mongo\Windows_64bit\mongo\jstests\disk\repair5.js
Fri Apr 27 15:52:00 [clientcursormon] mem (MB) res:25 virt:88 mapped:0
Fri Apr 27 15:52:00 [conn36] end connection 127.0.0.1:53345 (0 connections now open)
                11304.999828ms
test D:\10gen\buildslaves\mongo\Windows_64bit\mongo\jstests\disk\repair5.js exited with status -3

Comment by Tad Marshall [ 27/Apr/12 ]

Another case to use for debugging ...

http://buildbot.mongodb.org/builders/Nightly%20Windows%2064-bit%202008%2B/builds/128/steps/test_1/logs/stdio

c:\Python27\Scripts\..\python.exe D:\10gen\buildslaves\mongo\Windows_64bit_2008+_Nightly\mongo\buildscripts\smoke.py test
cwd [D:\10gen\buildslaves\mongo\Windows_64bit_2008+_Nightly\mongo]
db level locking enabled: 1
Fri Apr 27 02:01:19 [initandlisten] MongoDB starting : pid=728 port=27999 dbpath=/data/db/sconsTests 64-bit host=ip-0A08A111
Fri Apr 27 02:01:19 [initandlisten] 
Fri Apr 27 02:01:19 [initandlisten] ** NOTE: This is a development version (2.1.1-pre-) of MongoDB.
Fri Apr 27 02:01:19 [initandlisten] **       Not recommended for production.
Fri Apr 27 02:01:19 [initandlisten] 
Fri Apr 27 02:01:19 [initandlisten] db version v2.1.1-pre-, pdfile version 4.5
Fri Apr 27 02:01:19 [initandlisten] git version: 3a0087aef01559874dc03a346b96e3d5592287e5
Fri Apr 27 02:01:19 [initandlisten] build info: windows sys.getwindowsversion(major=6, minor=1, build=7601, platform=2, service_pack='Service Pack 1') BOOST_LIB_VERSION=1_49
Fri Apr 27 02:01:19 [initandlisten] options: { dbpath: "/data/db/sconsTests/", port: 27999 }
Fri Apr 27 02:01:19 [initandlisten] journal dir=/data/db/sconsTests/journal
Fri Apr 27 02:01:19 [initandlisten] recover : no journal files present, no recovery needed
Fri Apr 27 02:01:19 [websvr] admin web console waiting for connections on port 28999
Fri Apr 27 02:01:19 [initandlisten] waiting for connections on port 27999
Fri Apr 27 02:01:20 [initandlisten] connection accepted from 127.0.0.1:56769 #1 (1 connection now open)
running D:\10gen\buildslaves\mongo\Windows_64bit_2008+_Nightly\mongo\mongod.exe --port 27999 --dbpath /data/db/sconsTests/
...
 *******************************************
         Test : mr_bigobject.js ...
      Command : D:\10gen\buildslaves\mongo\Windows_64bit_2008+_Nightly\mongo\mongo.exe --port 27999 D:\10gen\buildslaves\mongo\Windows_64bit_2008+_Nightly\mongo\jstests\mr_bigobject.js --eval TestData = new Object();TestData.testPath = "D:\\10gen\\buildslaves\\mongo\\Windows_64bit_2008+_Nightly\\mongo\\jstests\\mr_bigobject.js";TestData.testFile = "mr_bigobject.js";TestData.testName = "mr_bigobject";TestData.noJournal = false;TestData.noJournalPrealloc = false;TestData.auth = false;TestData.keyFile = null;TestData.keyFileData = null;
         Date : Fri Apr 27 02:15:51 2012
Fri Apr 27 02:15:51 [conn759] end connection 127.0.0.1:57549 (0 connections now open)
MongoDB shell version: 2.1.1-pre-
connecting to: 127.0.0.1:27999/test
Fri Apr 27 02:15:51 [initandlisten] connection accepted from 127.0.0.1:57550 #760 (1 connection now open)
null
Fri Apr 27 02:15:51 [conn760] CMD: drop test.mr_bigobject
Fri Apr 27 02:16:05 [conn760] build index test.mr_bigobject { _id: 1 }
Fri Apr 27 02:16:05 [conn760] build index done.  scanned 0 total records. 0.001 secs
Fri Apr 27 02:16:05 [conn760] insert test.mr_bigobject keyUpdates:0 191ms
Fri Apr 27 02:16:06 [conn760] insert test.mr_bigobject keyUpdates:0 183ms
Fri Apr 27 02:16:06 [conn760] insert test.mr_bigobject keyUpdates:0 189ms
Fri Apr 27 02:16:06 [conn760] insert test.mr_bigobject keyUpdates:0 183ms
Fri Apr 27 02:16:06 [conn760] insert test.mr_bigobject keyUpdates:0 183ms
Fri Apr 27 02:16:06 [conn760] CMD: drop test.tmp.mr.mr_bigobject_32_inc
Fri Apr 27 02:16:06 [conn760] build index test.tmp.mr.mr_bigobject_32_inc { 0: 1 }
Fri Apr 27 02:16:06 [conn760] build index done.  scanned 0 total records. 0 secs
Fri Apr 27 02:16:06 [conn760] CMD: drop test.tmp.mr.mr_bigobject_32
Fri Apr 27 02:16:06 [conn760] build index test.tmp.mr.mr_bigobject_32 { _id: 1 }
Fri Apr 27 02:16:06 [conn760] build index done.  scanned 0 total records. 0 secs
Fri Apr 27 02:16:07 [conn760] JS Error: Error: an emit can't be more than half max bson size (anon):1
Fri Apr 27 02:16:07 [conn760] mr failed, removing collection :: caused by :: 9014 map invoke failed: JS Error: Error: an emit can't be more than half max bson size (anon):1
Fri Apr 27 02:16:07 [conn760] CMD: drop test.tmp.mr.mr_bigobject_32
Fri Apr 27 02:16:07 [conn760] CMD: drop test.tmp.mr.mr_bigobject_32_inc
Fri Apr 27 02:16:07 [conn760] command test.$cmd command: { mapreduce: "mr_bigobject", map: function () {
    emit(1, this.s + this.s);
}, reduce: function (k, v) {
    return 1;
}, out: "mr_bigobject_out" } ntoreturn:1 keyUpdates:0 reslen:161 279ms
Fri Apr 27 02:16:07 [conn760] CMD: drop test.tmp.mr.mr_bigobject_33_inc
Fri Apr 27 02:16:07 [conn760] build index test.tmp.mr.mr_bigobject_33_inc { 0: 1 }
Fri Apr 27 02:16:07 [conn760] build index done.  scanned 0 total records. 0 secs
Fri Apr 27 02:16:07 [conn760] CMD: drop test.tmp.mr.mr_bigobject_33
Fri Apr 27 02:16:07 [conn760] build index test.tmp.mr.mr_bigobject_33 { _id: 1 }
Fri Apr 27 02:16:07 [conn760] build index done.  scanned 0 total records. 0 secs
Fri Apr 27 02:16:08 [conn760] CMD: drop test.mr_bigobject_out
Fri Apr 27 02:16:08 [conn760] CMD: drop test.tmp.mr.mr_bigobject_33
Fri Apr 27 02:16:08 [conn760] CMD: drop test.tmp.mr.mr_bigobject_33
Fri Apr 27 02:16:08 [conn760] CMD: drop test.tmp.mr.mr_bigobject_33_inc
Fri Apr 27 02:16:08 [conn760] command test.$cmd command: { mapreduce: "mr_bigobject", map: function () {
    emit(1, this.s);
}, reduce: function (k, v) {
    return 1;
}, out: "mr_bigobject_out" } ntoreturn:1 keyUpdates:0 reslen:140 1338ms
Fri Apr 27 02:16:08 [conn760] CMD: drop test.tmp.mr.mr_bigobject_34_inc
Fri Apr 27 02:16:08 [conn760] build index test.tmp.mr.mr_bigobject_34_inc { 0: 1 }
Fri Apr 27 02:16:08 [conn760] build index done.  scanned 0 total records. 0 secs
Fri Apr 27 02:16:08 [conn760] CMD: drop test.tmp.mr.mr_bigobject_34
Fri Apr 27 02:16:08 [conn760] build index test.tmp.mr.mr_bigobject_34 { _id: 1 }
Fri Apr 27 02:16:08 [conn760] build index done.  scanned 0 total records. 0 secs
Fri Apr 27 02:16:09 [journal] MapViewOfFileEx for /data/db/sconsTests/test.2 failed with error errno:487 Attempt to access invalid address. (file size is 268435456) in MemoryMappedFile::remapPrivateView
Fri Apr 27 02:16:09 [journal]   Fatal Assertion 16148
Fri Apr 27 02:16:09 [journal] 
 
***aborting after fassert() failure
 
 
Fri Apr 27 02:16:10 Socket recv() errno:10054 An existing connection was forcibly closed by the remote host. 127.0.0.1:27999
Fri Apr 27 02:16:10 SocketException: remote: 127.0.0.1:27999 error: 9001 socket exception [1] server [127.0.0.1:27999] 
Fri Apr 27 02:16:10 DBClientCursor::init call() failed
Fri Apr 27 02:16:10 query failed : test.$cmd { mapreduce: "mr_bigobject", map: function () {
    emit(1, this.s);
}, reduce: function (k, v) {
    total = 0;
    for (var i = 0; i < v.length; i++..., out: "mr_bigobject_out" } to: 127.0.0.1:27999
Fri Apr 27 02:16:10 Error: error doing query: failed src/mongo/shell/collection.js:155
failed to load: D:\10gen\buildslaves\mongo\Windows_64bit_2008+_Nightly\mongo\jstests\mr_bigobject.js
                18950.000048ms
test D:\10gen\buildslaves\mongo\Windows_64bit_2008+_Nightly\mongo\jstests\mr_bigobject.js exited with status -3
348 tests succeeded
221 tests didn't get run
The following tests failed (with exit code):
D:\10gen\buildslaves\mongo\Windows_64bit_2008+_Nightly\mongo\jstests\mr_bigobject.js	-3

Comment by Tad Marshall [ 26/Apr/12 ]

Just building a set of known occurrences to help debugging ...

http://buildbot.mongodb.org/builders/Windows%2064-bit%202008%2B/builds/288/steps/test_9/logs/stdio

 *******************************************
         Test : jumbo1.js ...
      Command : D:\10gen\buildslaves\mongo\Windows_64bit_2008+\mongo\mongo.exe --port 27999 --nodb D:\10gen\buildslaves\mongo\Windows_64bit_2008+\mongo\jstests\sharding\jumbo1.js --eval TestData = new Object();TestData.testPath = "D:\\10gen\\buildslaves\\mongo\\Windows_64bit_2008+\\mongo\\jstests\\sharding\\jumbo1.js";TestData.testFile = "jumbo1.js";TestData.testName = "jumbo1";TestData.noJournal = false;TestData.noJournalPrealloc = false;TestData.auth = false;TestData.keyFile = null;TestData.keyFileData = null;
         Date : Thu Apr 26 06:55:06 2012
MongoDB shell version: 2.1.1-pre-
null
Resetting db path '/data/db/jump10'
Thu Apr 26 06:55:07 shell: started program mongod.exe --port 30000 --dbpath /data/db/jump10
 m30000| db level locking enabled: 1
 m30000| Thu Apr 26 06:55:07 [initandlisten] MongoDB starting : pid=3308 port=30000 dbpath=/data/db/jump10 64-bit host=ip-0A08A111
 m30000| Thu Apr 26 06:55:07 [initandlisten] 
 m30000| Thu Apr 26 06:55:07 [initandlisten] ** NOTE: This is a development version (2.1.1-pre-) of MongoDB.
 m30000| Thu Apr 26 06:55:07 [initandlisten] **       Not recommended for production.
 m30000| Thu Apr 26 06:55:07 [initandlisten] 
 m30000| Thu Apr 26 06:55:07 [initandlisten] db version v2.1.1-pre-, pdfile version 4.5
 m30000| Thu Apr 26 06:55:07 [initandlisten] git version: 3a0087aef01559874dc03a346b96e3d5592287e5
 m30000| Thu Apr 26 06:55:07 [initandlisten] build info: windows sys.getwindowsversion(major=6, minor=1, build=7601, platform=2, service_pack='Service Pack 1') BOOST_LIB_VERSION=1_49
 m30000| Thu Apr 26 06:55:07 [initandlisten] options: { dbpath: "/data/db/jump10", port: 30000 }
 m30000| Thu Apr 26 06:55:07 [initandlisten] journal dir=/data/db/jump10/journal
 m30000| Thu Apr 26 06:55:07 [initandlisten] recover : no journal files present, no recovery needed
 m30000| Thu Apr 26 06:55:07 [initandlisten] waiting for connections on port 30000
... // test takes 29 seconds to hit MapViewOfFileEx failure ...
 m30999| Thu Apr 26 06:55:36 [Balancer] *** start balancing round
 m30999| Thu Apr 26 06:55:36 [Balancer] collection : test.foo
 m30999| Thu Apr 26 06:55:36 [Balancer] donor      : 16 chunks on shard0001
 m30999| Thu Apr 26 06:55:36 [Balancer] receiver   : 4 chunks on shard0000
 m30999| Thu Apr 26 06:55:36 [Balancer] chose [shard0001] to [shard0000] { _id: "test.foo-x_284.0", lastmod: Timestamp 5000|1, ns: "test.foo", min: { x: 284.0 }, max: { x: 396.0 }, shard: "shard0001" }
 m30999| Thu Apr 26 06:55:36 [Balancer] moving chunk ns: test.foo moving ( ns:test.foo at: shard0001:localhost:30001 lastmod: 5|1 min: { x: 284.0 } max: { x: 396.0 }) shard0001:localhost:30001 -> shard0000:localhost:30000
 m30001| Thu Apr 26 06:55:36 [conn4] received moveChunk request: { moveChunk: "test.foo", from: "localhost:30001", to: "localhost:30000", fromShard: "shard0001", toShard: "shard0000", min: { x: 284.0 }, max: { x: 396.0 }, maxChunkSizeBytes: 1048576, shardId: "test.foo-x_284.0", configdb: "localhost:30000" }
 m30001| Thu Apr 26 06:55:36 [conn4] created new distributed lock for test.foo on localhost:30000 ( lock timeout : 900000, ping interval : 30000, process : 0 )
 m30001| Thu Apr 26 06:55:36 [conn4] distributed lock 'test.foo/ip-0A08A111:30001:1335437708:26500' acquired, ts : 4f9929a8e8fa14ab2a2455a3
 m30001| Thu Apr 26 06:55:36 [conn4] about to log metadata event: { _id: "ip-0A08A111-2012-04-26T10:55:36-31", server: "ip-0A08A111", clientAddr: "127.0.0.1:52045", time: new Date(1335437736092), what: "moveChunk.start", ns: "test.foo", details: { min: { x: 284.0 }, max: { x: 396.0 }, from: "shard0001", to: "shard0000" } }
 m30001| Thu Apr 26 06:55:36 [conn4] moveChunk request accepted at version 5|1
 m30001| Thu Apr 26 06:55:36 [conn4] moveChunk number of documents: 112
 m30000| Thu Apr 26 06:55:36 [journal] MapViewOfFileEx for /data/db/jump10/config.0 failed with error errno:487 Attempt to access invalid address. (file size is 67108864) in MemoryMappedFile::remapPrivateView
 m30000| Thu Apr 26 06:55:36 [journal]   Fatal Assertion 16148
 m30000| Thu Apr 26 06:55:36 [journal] 
 m30000| 
 m30000| ***aborting after fassert() failure
...
test D:\10gen\buildslaves\mongo\Windows_64bit_2008+\mongo\jstests\sharding\jumbo1.js exited with status -3
 

Comment by Tad Marshall [ 24/Apr/12 ]

Another occurrence:

http://buildbot.mongodb.org:8081/builders/Windows%2064-bit%20Weekly%20Slow%20Tests/builds/5/steps/test/logs/stdio

*******************************************
         Test : jstests/find9.js ...
 m30200| Tue Apr 24 12:40:02 [conn1] CMD: drop test.jstests_find9
 m30200| Tue Apr 24 12:40:02 [journal] MapViewOfFileEx for /data/db/dur_passthrough/test.2 failed with error errno:487 Attempt to access invalid address. (file size is 268435456) in MemoryMappedFile::remapPrivateView
 m30200| Tue Apr 24 12:40:02 [journal]   Fatal Assertion 16148
 m30200| Tue Apr 24 12:40:02 [journal] 
 m30200| 
 m30200| ***aborting after fassert() failure
 m30200| 
 m30200| 
Tue Apr 24 12:40:03 Socket say send() errno:10054 An existing connection was forcibly closed by the remote host. 127.0.0.1:30200
Error: error doing insert:socket exception
Tue Apr 24 12:40:03 Error: error loading js file: jstests/find9.js D:\10gen\buildslaves\mongo\Windows_64bit_Weekly_Slow_Tests\mongo\jstests\slowWeekly\dur_passthrough.js:32
failed to load: D:\10gen\buildslaves\mongo\Windows_64bit_Weekly_Slow_Tests\mongo\jstests\slowWeekly\dur_passthrough.js

Comment by Tad Marshall [ 12/Apr/12 ]

http://buildbot.mongodb.org/builders/Windows%2064-bit/builds/4639/steps/test_9/logs/stdio

 m31200| Thu Apr 12 11:31:38 [conn36]  authenticate: { authenticate: 1, nonce: "256cdcc58093887b", user: "__system", key: "3ef8084a57dc6cb9977d9b2b487e649d" }
 m31100| Thu Apr 12 11:31:38 [initandlisten] connection accepted from 10.8.161.17:63020 #39 (21 connections now open)
 m31102| Thu Apr 12 11:31:38 [rsSync] CMD: drop test.tmp.mr.foo_0_inc
 m31102| Thu Apr 12 11:31:38 [rsSync] build index test.tmp.mr.foo_1 { _id: 1 }
 m31102| Thu Apr 12 11:31:38 [rsSync] build index done.  scanned 0 total records. 0 secs
 m31100| Thu Apr 12 11:31:38 [conn39]  authenticate: { authenticate: 1, nonce: "a122ecbc54861e52", user: "__system", key: "37daae12a23f9f00df069b9bb83896fe" }
 m31100| Thu Apr 12 11:31:38 [journal] MapViewOfFileEx for /data/db/d1-0/test.1 failed with error errno:487 Attempt to access invalid address.
 m31100| Thu Apr 12 11:31:38 [journal]   Fatal Assertion 16148
 m31100| Thu Apr 12 11:31:38 [journal] 
 m31100| 
 m31100| ***aborting after fassert() failure
 m31100| 
 m31100| 

Comment by Tad Marshall [ 12/Apr/12 ]

After the latest code changes, we still hit this.

Comment by auto [ 10/Apr/12 ]

Author:

{u'login': u'tadmarshall', u'name': u'Tad Marshall', u'email': u'tad@10gen.com'}

Message: SERVER-2942 Windows remapPrivateView() should unmap & remap

Change remapPrivateView() for Windows to use UnMapViewOfFile()
and MapViewOfFileEx() along with a mutex to reset the memory
map of the private view.
Branch: master
https://github.com/mongodb/mongo/commit/6097787b1f3c7f9b4f8dca68f66d474eaa60b1d1

Comment by Pete Brumm [ 24/Aug/11 ]

Our 9 hour test completed without any mongodb restarts.

There were some primary switches and some rollbacks which we are looking at the logs now for more info.

but the original issue above did not occur with 1.9-nightly

Comment by Pete Brumm [ 23/Aug/11 ]

we ran for 5:40 hours with no problems. our environment had some issues which we think are unrelated.

We did not see any restarts.

we are kicking off another now to see if we can get 8 hours in

Comment by Pete Brumm [ 22/Aug/11 ]

we ran a test for a few hours today and didn't see any restarts. we did see some swapping of who was master. I will look into the logs in the morning to confirm that there weren't any restarts.

we are also just kicking off a 15 hour run so hopefully everything will be up in the morning.

Comment by Dwight Merriman [ 21/Aug/11 ]

thanks – if it happens on 1.9 nightly, we can add some extra logging to hopefully clarify what is happening.

Comment by Pete Brumm [ 21/Aug/11 ]

We were trying to get a long running test going but ran into unrelated network issues. We will run it again on Monday. I will send an update on monday

Comment by Mathias Stearn [ 19/Aug/11 ]

@Pete Were you able to complete testing?

Comment by Pete Brumm [ 16/Aug/11 ]

We will kick off a test today.

-pete

Comment by Mathias Stearn [ 15/Aug/11 ]

Could you test with the latest 1.9 nightly? There have been some fixes in that branch for windows-specific vm behavior.

Comment by Pete Brumm [ 01/Aug/11 ]

We were able to reproduce over the weekend with 1.8.2

Sat Jul 30 19:26:42 [replica set sync] MapViewOfFileEx failed C:/mongodb/Data/db/local.ns errno:487 Attempt to access invalid address.
Sat Jul 30 19:26:42 [replica set sync] Assertion failure p db\mongommf.cpp 198
Sat Jul 30 19:26:42 [replica set sync] replSet syncThread: 0 assertion db\mongommf.cpp:198
Sat Jul 30 20:17:57 [replica set sync] MapViewOfFileEx failed C:/mongodb/Data/db/blue_spruce.ns errno:487 Attempt to access invalid address.
Sat Jul 30 20:17:57 [replica set sync] Assertion failure p db\mongommf.cpp 198
Sat Jul 30 20:17:57 [replica set sync] replSet syncThread: 0 assertion db\mongommf.cpp:198
Sat Jul 30 22:24:44 [replica set sync] MapViewOfFileEx failed C:/mongodb/Data/db/local.ns errno:487 Attempt to access invalid address.
Sat Jul 30 22:24:44 [replica set sync] Assertion failure p db\mongommf.cpp 198
Sat Jul 30 22:24:44 [replica set sync] replSet syncThread: 0 assertion db\mongommf.cpp:198
Sat Jul 30 23:05:39 [replica set sync] MapViewOfFileEx failed C:/mongodb/Data/db/blue_spruce.ns errno:487 Attempt to access invalid address.
Sat Jul 30 23:05:39 [replica set sync] Assertion failure p db\mongommf.cpp 198
Sat Jul 30 23:05:39 [replica set sync] replSet syncThread: 0 assertion db\mongommf.cpp:198
Sun Jul 31 00:36:31 [replica set sync] MapViewOfFileEx failed C:/mongodb/Data/db/local.ns errno:487 Attempt to access invalid address.
Sun Jul 31 00:36:31 [replica set sync] Assertion failure p db\mongommf.cpp 198
Sun Jul 31 00:36:31 [replica set sync] replSet syncThread: 0 assertion db\mongommf.cpp:198
a

we had 40 concurrent clients inserting/updating a collection.
it took 6 hours for it to occur first. and then happened every 30 min or so after that.

Do you have a build with your fix above? we could run our tests against it to see if it resolves the issue.

we were running with verbose=true in the config file

Comment by Pete Brumm [ 29/Jul/11 ]

this occurred for me on 1.8.1

Thu Jul 28 19:44:56 [conn3059] MapViewOfFileEx failed C:/mongodb/Data/db/local.0 errno:487 Attempt to access invalid address.
Thu Jul 28 19:44:56 [conn3059] Assertion failure p db\mongommf.cpp 198
Thu Jul 28 19:44:57 [conn3059] insert qa.system.indexes exception 0 assertion db\mongommf.cpp:198 0ms

on windows 2008 sp1
on a san drive quad core
with journaling

we are upgrading to 1.8.2 to see if this fix resolves the issue.

we were doing heavy concurrent inserts and updates to a collection

Comment by auto [ 15/Apr/11 ]

Author:

{u'login': u'dwight', u'name': u'Dwight', u'email': u'dwight@10gen.com'}

Message: SERVER-2942 attempt to fix problem with remapPrivateView on windows
Branch: master
https://github.com/mongodb/mongo/commit/a92aef4e153f675d77ef0cc01816803cb9145df2

Comment by Dwight Merriman [ 15/Apr/11 ]

workarounds:

  • loading the bulk data with just one connection instead of multiple, the problem might stop manifesting
  • loading the bulk data with --journal off should work too

real solution to come

Comment by Conor McCarthy [ 14/Apr/11 ]

> researching. do you need workarounds in the meantime?

Thank you for the offer. I will be OK without a work around. Just testing the system at this stage.

Comment by Mathias Stearn [ 13/Apr/11 ]

We know where the issue is. Still looking for the best solution.

Comment by Dwight Merriman [ 13/Apr/11 ]

researching. do you need workarounds in the meantime?

Comment by Conor McCarthy [ 13/Apr/11 ]

Some more info:

The C# code to do the inserts looks like:
MongoCollection<TestDoc> dbTestDoc = db.GetCollection<TestDoc>("TestDoc");
BsonDocument[] batch = new BsonDocument[items.Count()];
dbHands.InsertBatch(items);

where 'items' is a List of 10,000 documents, about 10MB in size.

Multiple (3) processes were inserting into the DB simultaneously.

Comment by Conor McCarthy [ 13/Apr/11 ]

The error was reproduced with the DB running with the -vvvv option.

Log below:

Wed Apr 13 08:59:28 [dur] old journal file will be removed: H:/mongo/data/journa
l/j._44
Wed Apr 13 08:59:28 [conn9] insert TestDB.TestDoc 796ms
Wed Apr 13 08:59:29 [conn8] insert TestDB.TestDoc 718ms
Wed Apr 13 08:59:30 [conn6] insert TestDB.TestDoc 843ms
Wed Apr 13 08:59:31 [conn9] insert TestDB.TestDoc 734ms
Wed Apr 13 08:59:32 [conn8] insert TestDB.TestDoc 781ms
Wed Apr 13 08:59:33 [conn6] insert TestDB.TestDoc 890ms
Wed Apr 13 08:59:34 [conn9] insert TestDB.TestDoc 875ms
Wed Apr 13 08:59:34 [conn8] insert TestDB.TestDoc 765ms
Wed Apr 13 08:59:35 [conn6] insert TestDB.TestDoc 859ms
Wed Apr 13 08:59:36 [conn9] insert TestDB.TestDoc 781ms
Wed Apr 13 08:59:37 [conn8] insert TestDB.TestDoc 718ms
Wed Apr 13 08:59:38 [conn6] insert TestDB.TestDoc 875ms
Wed Apr 13 08:59:39 [conn9] insert TestDB.TestDoc 859ms
Wed Apr 13 08:59:40 [conn8] insert TestDB.TestDoc 734ms
Wed Apr 13 08:59:51 [conn9] insert TestDB.TestDoc 10766ms
Wed Apr 13 08:59:52 [DataFileSync] flushing mmap took 12188ms for 110 files
Wed Apr 13 08:59:53 [conn6] insert TestDB.TestDoc 1859ms
Wed Apr 13 08:59:54 [conn8] insert TestDB.TestDoc 703ms
Wed Apr 13 08:59:55 [conn6] insert TestDB.TestDoc 765ms
Wed Apr 13 08:59:55 [conn8] insert TestDB.TestDoc 671ms
Wed Apr 13 08:59:55 [dur] lsn set 33109301
Wed Apr 13 08:59:56 [conn9] insert TestDB.TestDoc 734ms
Wed Apr 13 08:59:57 [conn6] insert TestDB.TestDoc 906ms
Wed Apr 13 08:59:58 [conn9] insert TestDB.TestDoc 734ms
Wed Apr 13 08:59:59 [conn8] insert TestDB.TestDoc 750ms
Wed Apr 13 09:00:00 [conn6] insert TestDB.TestDoc 953ms
Wed Apr 13 09:00:01 [conn8] insert TestDB.TestDoc 953ms
Wed Apr 13 09:00:01 [conn9] insert TestDB.TestDoc 734ms
Wed Apr 13 09:00:02 [conn6] insert TestDB.TestDoc 953ms
Wed Apr 13 09:00:03 [conn8] insert TestDB.TestDoc 703ms
Wed Apr 13 09:00:04 [conn9] insert TestDB.TestDoc 765ms
Wed Apr 13 09:00:05 [conn6] insert TestDB.TestDoc 1000ms
Wed Apr 13 09:00:06 [conn8] insert TestDB.TestDoc 671ms
Wed Apr 13 09:00:06 [conn9] insert TestDB.TestDoc 734ms
Wed Apr 13 09:00:08 [conn6] insert TestDB.TestDoc 968ms
Wed Apr 13 09:00:09 [conn8] insert TestDB.TestDoc 984ms
Wed Apr 13 09:00:09 [conn9] insert TestDB.TestDoc 828ms
Wed Apr 13 09:00:10 [conn6] allocating new extent for TestDB.TestDoc padding:1 len
WHdr: 1399
Wed Apr 13 09:00:11 [conn6] insert TestDB.TestDoc 1078ms
Wed Apr 13 09:00:11 [conn8] insert TestDB.TestDoc 906ms
Wed Apr 13 09:00:12 [conn9] insert TestDB.TestDoc 828ms
Wed Apr 13 09:00:13 [conn6] insert TestDB.TestDoc 984ms
Wed Apr 13 09:00:14 [conn8] insert TestDB.TestDoc 687ms
Wed Apr 13 09:00:15 [conn9] insert TestDB.TestDoc 796ms
Wed Apr 13 09:00:15 [conn6] MapViewOfFileEx failed H:/mongo/data/TestDB.ns errno
:487 Attempt to access invalid address.
Wed Apr 13 09:00:15 [conn6] TestDB.TestDoc Assertion failure p db\mongommf.cpp 19
8
Wed Apr 13 09:00:15 [conn6] dbexception in groupCommit causing immediate shutdow
n: 0 assertion db\mongommf.cpp:198
Wed Apr 13 09:00:45 [DataFileSync] flushing mmap took 5047ms for 111 files
Wed Apr 13 09:01:40 [DataFileSync] flushing mmap took 406ms for 111 files
Wed Apr 13 09:02:40 [DataFileSync] flushing mmap took 406ms for 111 files
Wed Apr 13 09:03:40 [DataFileSync] flushing mmap took 406ms for 111 files
Wed Apr 13 09:04:40 [DataFileSync] flushing mmap took 406ms for 111 files
Wed Apr 13 09:05:40 [DataFileSync] flushing mmap took 406ms for 111 files

Comment by Conor McCarthy [ 13/Apr/11 ]

I should add that I was using the latest mongo-csharp-driver to interface with the db.

Generated at Thu Feb 08 03:01:37 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.