[SERVER-4300] "Assertion: 10000:out of memory BufBuilder" then crashed on Windows 7 64bits Created: 17/Nov/11  Updated: 15/Aug/12  Resolved: 04/Mar/12

Status: Closed
Project: Core Server
Component/s: Storage
Affects Version/s: 2.0.1
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: David Huang Assignee: Tad Marshall
Resolution: Duplicate Votes: 1
Labels: SERVER_V2, crash, insert, update
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: Text File error_log_20111112.txt     Text File error_log_20111112.txt     Text File error_log_20111124.txt    
Issue Links:
Duplicate
duplicates SERVER-5194 Windows version of mongod should mana... Closed
Operating System: Windows
Participants:

 Description   

my server is running Windows 7 64bits,24GB RAM, MongoDB2.0.1 with journaling enable.

i use c# driver to insert data into MongoDB, each insert data is about 2MB. After a while, MongoDB consumed all of 24GB memory, then out of memory dialog of Windows pops up, I turned the dialog off(twice), then MongoDb was alive for a while then crashed.

The virtual memory(pagefile size) of Windows is 73595MB, it is managed automatically by Windows

console message:

Thu Nov 17 09:23:20 [conn1817] update surveyData.photos query:

{ _id: 1099111116065824545 }

update: { _id: 1099111116065824545, sqliteId: "129495", gps_time: new Date(1321397904545), loc:

{ lon: 121.640237, lat: 25.061985 }

, heading: 0.0, gps_accuracy: 0.0, velocity: 0.0, cam_0: "285757496", cam_0_blob: BinData, cam_1: "285876579", cam_1_blob: BinData, cam_2: "285876590", cam_2_blob: BinData, cam_3: "285885861", cam_3_blob: BinData, cam_4: "285876226", cam_4_blob: BinData, cam_5: "285885859", cam_5_blob: BinData, cam_6: "285876230", cam_6_blob: BinData,cam_7: "285757478", cam_7_blob: BinData, cam_8: "285757493", cam_8_blob: BinData, created_at: new Date(1321444899758), mod_at: new Date(1321493204950) } idhack:1 234ms
Thu Nov 17 09:23:21 [conn1817] update surveyData.photos query:

{ _id: 1099111116065824894 }

update: { _id: 1099111116065824894, sqliteId: "129496", gps_time: new Date(1321397904894), loc:

{ lon: 121.640237, lat: 25.061985 }

, heading: 0.0, gps_accuracy: 0.0, velocity: 0.0, cam_0: "285757496", cam_0_blob: BinData, cam_1: "285876579", cam_1_blob: BinData, cam_2: "285876590", cam_2_blob: BinData, cam_3: "285885861", cam_3_blob: BinData, cam_4: "285876226", cam_4_blob: BinData, cam_5: "285885859", cam_5_blob: BinData, cam_6: "285876230", cam_6_blob: BinData,cam_7: "285757478", cam_7_blob: BinData, cam_8: "285757493", cam_8_blob: BinData, created_at: new Date(1321444900171), mod_at: new Date(1321493205579) } idhack:1 171ms
Thu Nov 17 09:23:21 [conn1817] end connection 10.77.1.34:63669
Thu Nov 17 09:23:22 [conn1815] Assertion: 10000:out of memory BufBuilder
Thu Nov 17 09:23:22 [conn1815] assertion 10000 out of memory BufBuilder ns:surveyData.photos query:

{ _id: 1099111116065819550 }

Thu Nov 17 09:23:22 [conn1815] end connection 10.77.1.34:63665
Thu Nov 17 09:23:26 [conn1819] Assertion: 10000:out of memory BufBuilder
Thu Nov 17 09:23:26 [conn1819] assertion 10000 out of memory BufBuilder ns:surveyData.photos query:

{ _id: 1099111116065819887 }

Thu Nov 17 09:23:26 [conn1819] end connection 10.77.1.34:63676
Thu Nov 17 09:23:29 [initandlisten] connection accepted from 10.77.1.34:64147 #1864
Thu Nov 17 09:23:29 [conn1864] Assertion: 10000:out of memory BufBuilder
Thu Nov 17 09:23:29 [conn1864] assertion 10000 out of memory BufBuilder ns:surveyData.photos query:

{ _id: 1099111116065820249 }

Thu Nov 17 09:23:29 [conn1864] end connection 10.77.1.34:64147
Thu Nov 17 09:23:31 [initandlisten] connection accepted from 10.77.1.34:64148 #1865
Thu Nov 17 09:24:16 [clientcursormon] mem (MB) res:2036 virt:217008 mapped:108397
.
.
.
(turn off Windows "out of memory" warning dialog,keep iusert/update data)
.
.
.
Thu Nov 17 09:59:25 [conn1873] update surveyData.photos query:

{ _id: 1099111116064457111 }

update: { _id: 1099111116064457111, sqliteId: "127201", gps_time: new Date(1321397097111), loc:

{ lon: 121.640238, lat: 25.061983 }

, heading: 0.0, gps_accuracy: 0.0, velocity: 0.0, cam_0: "285757496", cam_0_blob: BinData, cam_1: "285876579", cam_1_blob: BinData, cam_2: "285876590", cam_2_blob: BinData, cam_3: "285885861", cam_3_blob: BinData, cam_4: "285876226", cam_4_blob: BinData, cam_5: "285885859", cam_5_blob: BinData, cam_6: "285876230", cam_6_blob: BinData,cam_7: "285757478", cam_7_blob: BinData, cam_8: "285757493", cam_8_blob: BinData, created_at: new Date(1321445789421), mod_at: new Date(1321495365770) } idhack:1 483ms
Thu Nov 17 09:59:25 [conn1873] Assertion: 10000:out of memory BufBuilder
Thu Nov 17 09:59:25 [conn1873] assertion 10000 out of memory BufBuilder ns:surveyData.photos query:

{ _id: 1099111116064457454 }

Thu Nov 17 09:59:25 [conn1873] end connection 10.77.1.34:65286
Thu Nov 17 09:59:52 [conn1886] VirtualProtect failed (mcw) e:/db/panorama.27 20381fd8000000 4000000 errno:1455 ??瑼云撠⊥?摰???
Thu Nov 17 09:59:52 [conn1886] panorama.Pano Assertion failure false db\mongommf.cpp 72
Thu Nov 17 09:59:52 [conn1886] insert panorama.Pano exception: assertion db\mongommf.cpp:72 31ms
.
.
.
(turn off Windows "out of memory" warning dialog,keep iusert/update data)
.
.
.
Thu Nov 17 10:03:21 [conn1899] Assertion: 10000:out of memory BufBuilder
Thu Nov 17 10:03:21 [conn1900] Assertion: 10000:out of memory BufBuilder
Thu Nov 17 10:03:21 [conn1899] assertion 10000 out of memory BufBuilder ns:surveyData.photos query:

{ _id: 1099111116064325374 }

Thu Nov 17 10:03:21 [conn1900] assertion 10000 out of memory BufBuilder ns:surveyData.photos query:

{ _id: 1099111116064330919 }

Thu Nov 17 10:03:21 [conn1899] end connection 10.77.1.34:49382
Thu Nov 17 10:03:21 [conn1900] end connection 10.77.1.34:49383
Thu Nov 17 10:03:23 dbexit: malloc fails
Thu Nov 17 10:03:23 [conn1897] shutdown: going to close listening sockets...
Thu Nov 17 10:03:23 [conn1897] closing listening socket: 448
Thu Nov 17 10:03:23 [initandlisten] now exiting
Thu Nov 17 10:03:23 dbexit: ; exiting immediately
Thu Nov 17 10:03:23 [conn1897] closing listening socket: 452
Thu Nov 17 10:03:23 [conn1897] shutdown: going to flush diaglog...
Thu Nov 17 10:03:23 [conn1897] shutdown: going to close sockets...
Thu Nov 17 10:03:23 [conn1897] shutdown: waiting for fs preallocator...
Thu Nov 17 10:03:23 [conn1897] shutdown: lock for final commit...

There is no such issue if I turn off journaling function, but I need it to reduce time consuming for repairDatabase



 Comments   
Comment by Tad Marshall [ 24/Mar/12 ]

The Windows version of MongoDB is using up space in the page file, and this can cause out-of-memory failures anywhere in the system, including in mongod.exe itself. Turning off journaling will prevent this problem, but this is not a good idea for most use cases. All of the "out of memory BufBuilder" errors are simply out-of-memory issues, and all further problems are a result of the low-memory condition.

Comment by Tad Marshall [ 04/Mar/12 ]

Consolidating closely related issues into one.

Comment by Tad Marshall [ 01/Dec/11 ]

In starting to talk to Microsoft about these issues (memory usage, degraded performance) they had some suggestions for research:

Can you please confirm what version of server your customers are using? Is this on Windows Server 2003?

Below is the info on how to get tracing info – this should help investigate the issue

1) Install Xperf on the server with the issue. (it comes with the Windows Performance Toolkit from the Windows SDK)

2) On a command prompt, launch the following command
a. Xperf.exe –on base+latency+fileio –stackwalk profile+filewrite+fileread+fileflush

3) Reproduce the issue you’re facing

4) Run the following command:
a. Xperf.exe –d mytrace.etl

5) View the trace: xperf.exe mytrace.etl

I haven't tried this yet, but I will and will report what I learn. You could try it on your machines and see if it tells you something useful, thanks!

Comment by David Huang [ 24/Nov/11 ]

Hello Scott

My colleague and I test again with fixed size page file. We inserted a lot of binary data into MongoDB, we reproduced error 1(the OS returned an error because of memory usage and closed mongod) and error 2(MapViewOfFile failed issue) again. I attached another log (error_log_20111124.txt) which is about MapViewOfFile failed issue.

My Ruby code is to read file paths from a file, read the eight target files and save them as a BSON document, then insert the BSON document into MongoDB. Each file is about 100~200K, so it means each BSON document is about 1MB or more.

My colleague did the same thing but he uses C# driver.

Ruby code is something like this:

require 'date'
require 'mongo'

connection = Mongo::Connection.new("10.77.1.82",27017)
db = Mongo::Connection.new("10.77.1.82",27017).db("survey_photos")
photos = db.collection('photos')

file_0 = File.open("PhotosList.txt")

file_0.uber_gets("\n") do |s1|
_photo_path_0 = s1.split(",")[0] #file path for photo 1
_photo_path_1 = s1.split(",")[1] #file path for photo 2
_photo_path_2 = s1.split(",")[2]
_photo_path_3 = s1.split(",")[3]
_photo_path_4 = s1.split(",")[4]
_photo_path_5 = s1.split(",")[5]
_photo_path_6 = s1.split(",")[6]
_photo_path_7 = s1.split(",")[7]

mem_buf_0 = File.open(_photo_path_0,"rb")

{|io| io.read}
mem_buf_1 = File.open(_photo_path_1,"rb") {|io| io.read}

mem_buf_2 = File.open(_photo_path_2,"rb")

{|io| io.read}
mem_buf_3 = File.open(_photo_path_3,"rb") {|io| io.read}

mem_buf_4 = File.open(_photo_path_4,"rb")

{|io| io.read}
mem_buf_5 = File.open(_photo_path_5,"rb") {|io| io.read}

mem_buf_6 = File.open(_photo_path_6,"rb")

{|io| io.read}
mem_buf_7 = File.open(_photo_path_7,"rb") {|io| io.read}

new_photo =

{ "cam_0" => BSON::Binary.new(mem_buf_0), "cam_1" => BSON::Binary.new(mem_buf_1), "cam_2" => BSON::Binary.new(mem_buf_2), "cam_3" => BSON::Binary.new(mem_buf_3), "cam_4" => BSON::Binary.new(mem_buf_4), "cam_5" => BSON::Binary.new(mem_buf_5), "cam_6" => BSON::Binary.new(mem_buf_6), "cam_7" => BSON::Binary.new(mem_buf_7), "upload_time" => Time.now }

photo_id = photos.insert(new_photo)
end

Comment by Scott Hernandez (Inactive) [ 23/Nov/11 ]

I can reproduce the windows memory protection dialog case but not the last one you had. Can you provide your code to test with?

When I set my page file to a fixed size and ran some tests it ran for a while until the OS returned an error because of memory usage and closed mongod.

I am testing on Windows 7 (Ultimate-ENG)

Comment by David Huang [ 22/Nov/11 ]

I did another test and attached the log file.
Sorry about some weird characters in the log file, because I am using Traditional Chinese version of Windows 7.

This time the log shows more information, it shows "MapViewOfFile failed" before malloc fails then crashed...no Windows warning dialog.

Comment by David Huang [ 22/Nov/11 ]

YES

Comment by Scott Hernandez (Inactive) [ 21/Nov/11 ]

Thanks for testing, just to be clear, in the last run you had journaling on, correct?

Comment by David Huang [ 21/Nov/11 ]

bad news..in 7th hour, monogd crashed, NO warning dialog this time.

.
.
.
Mon Nov 21 16:38:25 [journal] old journal file will be removed: e:/db/journal/j._171
Mon Nov 21 16:38:41 [PeriodicTask::Runner] task: DBConnectionPool-cleaner took:15ms
Mon Nov 21 16:38:41 [PeriodicTask::Runner] task: WriteBackManager::cleaner took: 31ms
Mon Nov 21 16:39:41 dbexit: malloc fails
Mon Nov 21 16:39:41 dbexit: malloc fails; exiting immediately
Mon Nov 21 16:39:41 [conn600] shutdown: going to close listening sockets...
Mon Nov 21 16:39:41 [initandlisten] now exiting
Mon Nov 21 16:39:41 [conn600] closing listening socket: 444
Mon Nov 21 16:39:41 [conn600] closing listening socket: 448
Mon Nov 21 16:39:41 [conn600] shutdown: going to flush diaglog...
Mon Nov 21 16:39:41 [conn600] shutdown: going to close sockets...
Mon Nov 21 16:39:41 [conn803] got request after shutdown()
Logstream::get called in uninitialized state
Mon Nov 21 16:39:42 [conn600] shutdown: waiting for fs preallocator...
Logstream::get called in uninitialized state
Mon Nov 21 16:39:42 [conn600] shutdown: lock for final commit...
Logstream::get called in uninitialized state
Mon Nov 21 16:39:43 [conn809] got request after shutdown()
Logstream::get called in uninitialized state
Logstream::get called in uninitialized state
Mon Nov 21 16:39:43 [conn810] end connection 10.77.1.34:54714
Mon Nov 21 16:39:43 [conn822] end connection 10.77.1.34:54830
Logstream::get called in uninitialized state
Mon Nov 21 16:39:43 [conn808] end connection 10.77.1.58:3693

c:\mongodb-win32-x86_64-2.0.1\bin>

Comment by David Huang [ 21/Nov/11 ]

Hello all

I have used fixed page file size(160GB) and inserted 85071 objects(Data Size 130.41g) last six hours.
System is stable, so far so good. It looks like I have to use fixed size page file, not ranged size nor windows automatically managed.

I will keep updating my testing result.

Comment by David Huang [ 17/Nov/11 ]

I have set the initial page file size to 80GB and max size to 160GB, the problem still exists.

The c# client received "Assertion: 10000" error message again, but MongoDB server doesn't show the "Assertion: 10000" error message this time, just the out of memory warning dialog.

The virtual memory configuration dialog of Windows indicates that it is using 160GB of memory while out of memory warning dialog shows up.

I will use fixed size page file (initial page file size 160GB and max size 160GB) for next testing

Comment by Scott Hernandez (Inactive) [ 17/Nov/11 ]

Please test using a fixed size page file, not a dynamic one (windows managed).

Comment by Eliot Horowitz (Inactive) [ 17/Nov/11 ]

Can you try increasing the windows page file?

Generated at Thu Feb 08 03:05:34 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.