[SERVER-4893] Mongo 2.0.2 crashes when a large file is written to GridFS : "Assertion failure a <= 512*1024*1024 util/alignedbuilder.cpp" Created: 07/Feb/12  Updated: 29/Apr/13  Resolved: 15/Apr/13

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: 2.0.2
Fix Version/s: None

Type: Bug Priority: Critical - P2
Reporter: Ilya Katsov Assignee: Mathias Stearn
Resolution: Incomplete Votes: 3
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

CentOS 5.7, single-node deployment, 32GB RAM, 16 cores, 100GB Vertex SSD drive


Attachments: File log.7z    
Issue Links:
Related
Operating System: Linux
Participants:

 Description   

The following code is used to save a 1GB file to GridFS:

     public <T> void saveMeta(String key, T data) {
        GridFS fileSpace = getFileSpace(META_SPACE_PREFIX); // create gridFS: ... fs = new GridFS(mongoDB, fsName); ...
 
        String journalFilename = key + "_JOURNAL";
        GridFSInputFile journalFile = fileSpace.createFile(journalFilename);    // some kind of pre-commit, write new version of data to a 'journal' file 
 
        try {
            journalFile.getOutputStream().write(SerializationUtil.serialize(data));
            journalFile.getOutputStream().close();
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
 
        deleteFromStorage(key, META_SPACE_PREFIX);    // delete old version using GridFS#remove(key);
 
        GridFSInputFile file = fileSpace.createFile(key);  
 
        try {
            file.getOutputStream().write(SerializationUtil.serialize(data));  // write new data
            file.getOutputStream().close();
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
 
        deleteFromStorage(journalFilename, META_SPACE_PREFIX);  // delete the journal file
    }

This constantly crashes the server with the following error:

Tue May 10 22:50:54 [clientcursormon] mem (MB) res:1189 virt:53753 mapped:26564
Tue May 10 22:53:18 [journal] old journal file will be removed: /var/lib/mongo/journal/j._0
Tue May 10 22:54:30 [journal]   warning assertion failure a <= 256*1024*1024 util/alignedbuilder.cpp 90
0x57a926 0x58434a 0x75b2d3 0x768b0a 0x76920f 0x769445 0x760f0f 0x76160d 0x76192d 0x7620bb 0xaa80b0 0x3705e0673d 0x37056d44bd 
 /usr/bin/mongod(_ZN5mongo12sayDbContextEPKc+0x96) [0x57a926]
 /usr/bin/mongod(_ZN5mongo9wassertedEPKcS1_j+0x11a) [0x58434a]
 /usr/bin/mongod(_ZN5mongo14AlignedBuilder14growReallocateEj+0x63) [0x75b2d3]
 /usr/bin/mongod(_ZN5mongo3dur21prepBasicWrite_inlockERNS_14AlignedBuilderEPKNS0_11WriteIntentERNS_12RelativePathE+0x2fa) [0x768b0a]
 /usr/bin/mongod(_ZN5mongo3dur15prepBasicWritesERNS_14AlignedBuilderE+0x6f) [0x76920f]
 /usr/bin/mongod(_ZN5mongo3dur13PREPLOGBUFFERERNS0_11JSectHeaderE+0x75) [0x769445]
 /usr/bin/mongod(_ZN5mongo3dur28_groupCommitWithLimitedLocksEv+0x15f) [0x760f0f]
 /usr/bin/mongod(_ZN5mongo3dur27groupCommitWithLimitedLocksEv+0x1d) [0x76160d]
 /usr/bin/mongod [0x76192d]
 /usr/bin/mongod(_ZN5mongo3dur9durThreadEv+0x10b) [0x7620bb]
 /usr/bin/mongod(thread_proxy+0x80) [0xaa80b0]
 /lib64/libpthread.so.0 [0x3705e0673d]
 /lib64/libc.so.6(clone+0x6d) [0x37056d44bd]
Tue May 10 22:54:38 [conn4] insert red-aril.dimensionsMETA.chunks 6385ms
Tue May 10 22:54:40 [FileAllocator] allocating new datafile /var/lib/mongo/test-base.18, filling with zeroes...
Tue May 10 22:54:40 [conn4] insert red-aril.dimensionsMETA.chunks 1153ms
Tue May 10 22:54:42 [conn4] old journal file will be removed: /var/lib/mongo/journal/j._1
Tue May 10 22:54:56 [FileAllocator] done allocating datafile /var/lib/mongo/test-basel.18, size: 2047M
 
Tue May 10 22:54:56 [conn4] DR101 latency warning on journal file open 5473ms
Tue May 10 22:54:56 [conn4] insert red-aril.dimensionsMETA.chunks 14983ms
Tue May 10 22:54:59 [conn4] insert red-aril.dimensionsMETA.chunks 2455ms
Tue May 10 22:55:01 [conn4] remove red-aril.dimensionsMETA.chunks query: { files_id: ObjectId('4dca0b39e4b0a48e8532560b') } 721ms
Tue May 10 22:55:13 [conn4]   warning assertion failure a <= 256*1024*1024 util/alignedbuilder.cpp 90
0x57a926 0x58434a 0x75b2d3 0x768b0a 0x76920f 0x769445 0x75f179 0x7640da 0x888808 0x88ac95 0xaa00c6 0x635bb7 0x3705e0673d 0x37056d44bd 
 /usr/bin/mongod(_ZN5mongo12sayDbContextEPKc+0x96) [0x57a926]
 /usr/bin/mongod(_ZN5mongo9wassertedEPKcS1_j+0x11a) [0x58434a]
 /usr/bin/mongod(_ZN5mongo14AlignedBuilder14growReallocateEj+0x63) [0x75b2d3]
 /usr/bin/mongod(_ZN5mongo3dur21prepBasicWrite_inlockERNS_14AlignedBuilderEPKNS0_11WriteIntentERNS_12RelativePathE+0x2fa) [0x768b0a]
 /usr/bin/mongod(_ZN5mongo3dur15prepBasicWritesERNS_14AlignedBuilderE+0x6f) [0x76920f]
 /usr/bin/mongod(_ZN5mongo3dur13PREPLOGBUFFERERNS0_11JSectHeaderE+0x75) [0x769445]
 /usr/bin/mongod [0x75f179]
 /usr/bin/mongod(_ZN5mongo9writelockD1Ev+0xba) [0x7640da]
 /usr/bin/mongod(_ZN5mongo14receivedInsertERNS_7MessageERNS_5CurOpE+0x468) [0x888808]
 /usr/bin/mongod(_ZN5mongo16assembleResponseERNS_7MessageERNS_10DbResponseERKNS_11HostAndPortE+0x1155) [0x88ac95]
 /usr/bin/mongod(_ZN5mongo16MyMessageHandler7processERNS_7MessageEPNS_21AbstractMessagingPortEPNS_9LastErrorE+0x76) [0xaa00c6]
 /usr/bin/mongod(_ZN5mongo3pms9threadRunEPNS_13MessagingPortE+0x287) [0x635bb7]
 /lib64/libpthread.so.0 [0x3705e0673d]
 /lib64/libc.so.6(clone+0x6d) [0x37056d44bd]
Tue May 10 22:55:13 [conn4] rate limiting wassert
Tue May 10 22:55:14 [conn4]   Assertion failure a <= 512*1024*1024 util/alignedbuilder.cpp 91
0x57a926 0x5857db 0x75b2ed 0x768b0a 0x76920f 0x769445 0x75f179 0x7640da 0x888808 0x88ac95 0xaa00c6 0x635bb7 0x3705e0673d 0x37056d44bd 
 /usr/bin/mongod(_ZN5mongo12sayDbContextEPKc+0x96) [0x57a926]
 /usr/bin/mongod(_ZN5mongo8assertedEPKcS1_j+0xfb) [0x5857db]
 /usr/bin/mongod(_ZN5mongo14AlignedBuilder14growReallocateEj+0x7d) [0x75b2ed]
 /usr/bin/mongod(_ZN5mongo3dur21prepBasicWrite_inlockERNS_14AlignedBuilderEPKNS0_11WriteIntentERNS_12RelativePathE+0x2fa) [0x768b0a]
 /usr/bin/mongod(_ZN5mongo3dur15prepBasicWritesERNS_14AlignedBuilderE+0x6f) [0x76920f]
 /usr/bin/mongod(_ZN5mongo3dur13PREPLOGBUFFERERNS0_11JSectHeaderE+0x75) [0x769445]
 /usr/bin/mongod [0x75f179]
 /usr/bin/mongod(_ZN5mongo9writelockD1Ev+0xba) [0x7640da]
 /usr/bin/mongod(_ZN5mongo14receivedInsertERNS_7MessageERNS_5CurOpE+0x468) [0x888808]
 /usr/bin/mongod(_ZN5mongo16assembleResponseERNS_7MessageERNS_10DbResponseERKNS_11HostAndPortE+0x1155) [0x88ac95]
 /usr/bin/mongod(_ZN5mongo16MyMessageHandler7processERNS_7MessageEPNS_21AbstractMessagingPortEPNS_9LastErrorE+0x76) [0xaa00c6]
 /usr/bin/mongod(_ZN5mongo3pms9threadRunEPNS_13MessagingPortE+0x287) [0x635bb7]
 /lib64/libpthread.so.0 [0x3705e0673d]
 /lib64/libc.so.6(clone+0x6d) [0x37056d44bd]
Tue May 10 22:55:14 [conn4] dbexception in groupCommit causing immediate shutdown: 0 assertion util/alignedbuilder.cpp:91
Tue May 10 22:55:14 gc1
Tue May 10 22:55:14 Got signal: 6 (Aborted).
 
Tue May 10 22:55:14 Backtrace:
0xa8d669 0x37056302d0 0x3705630265 0x3705631d10 0x881427 0x75f4a1 0x7640da 0x888808 0x88ac95 0xaa00c6 0x635bb7 0x3705e0673d 0x37056d44bd 
 /usr/bin/mongod(_ZN5mongo10abruptQuitEi+0x399) [0xa8d669]
 /lib64/libc.so.6 [0x37056302d0]
 /lib64/libc.so.6(gsignal+0x35) [0x3705630265]
 /lib64/libc.so.6(abort+0x110) [0x3705631d10]
 /usr/bin/mongod(_ZN5mongo10mongoAbortEPKc+0x47) [0x881427]
 /usr/bin/mongod [0x75f4a1]
 /usr/bin/mongod(_ZN5mongo9writelockD1Ev+0xba) [0x7640da]
 /usr/bin/mongod(_ZN5mongo14receivedInsertERNS_7MessageERNS_5CurOpE+0x468) [0x888808]
 /usr/bin/mongod(_ZN5mongo16assembleResponseERNS_7MessageERNS_10DbResponseERKNS_11HostAndPortE+0x1155) [0x88ac95]
 /usr/bin/mongod(_ZN5mongo16MyMessageHandler7processERNS_7MessageEPNS_21AbstractMessagingPortEPNS_9LastErrorE+0x76) [0xaa00c6]
 /usr/bin/mongod(_ZN5mongo3pms9threadRunEPNS_13MessagingPortE+0x287) [0x635bb7]
 /lib64/libpthread.so.0 [0x3705e0673d]
 /lib64/libc.so.6(clone+0x6d) [0x37056d44bd]
 
Logstream::get called in uninitialized state
Tue May 10 22:55:14 ERROR: Client::shutdown not called: conn
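
The two assertions in the log bracket the journal write buffer at two sizes: a warning once the requested size exceeds 256 MB (alignedbuilder.cpp line 90) and a fatal assertion once it exceeds 512 MB (line 91), after which mongod aborts. The Java sketch below models only that arithmetic. It assumes the buffer grows by doubling, which the function name growReallocate hints at but the log does not confirm. The class and method names are hypothetical, and this is not the server's code.

```java
/** Toy model of the two size checks seen in the log; not mongod source. */
final class AlignedBufferModel {
    static final long WARN_LIMIT = 256L * 1024 * 1024;   // "a <= 256*1024*1024" (warning, line 90)
    static final long FATAL_LIMIT = 512L * 1024 * 1024;  // "a <= 512*1024*1024" (fatal, line 91)

    enum Outcome { OK, WARNING, FATAL }

    /**
     * Doubles the capacity until it exceeds the needed length (assumed growth policy)
     * and reports the most severe limit crossed on the way.
     */
    static Outcome grow(long capacity, long needed) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        Outcome worst = Outcome.OK;
        long a = capacity;
        while (true) {
            a *= 2;
            if (a > FATAL_LIMIT) {
                return Outcome.FATAL;      // corresponds to the shutdown in the log
            }
            if (a > WARN_LIMIT) {
                worst = Outcome.WARNING;   // corresponds to the "warning assertion failure" lines
            }
            if (needed < a) {
                return worst;
            }
        }
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024;
        System.out.println(grow(64 * mb, 100 * mb));  // prints OK
        System.out.println(grow(256 * mb, 300 * mb)); // prints WARNING
        System.out.println(grow(512 * mb, 600 * mb)); // prints FATAL
    }
}
```

Under this model a single group commit needs more than 512 MB of pending journal data before the server aborts, which is consistent with the warnings appearing some time before the crash.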



 Comments   
Comment by Mathias Stearn [ 11/Apr/13 ]

Has anyone been able to repro with 2.4? I just tested uploading 10,000 files (both large and small) to GridFS and was not able to get this warning to occur. Some of the changes to how journaling is handled in 2.2 for db-level locking should have helped prevent this from occurring.

Comment by Ben Becker [ 17/Aug/12 ]

Hi Maxime,

That should indeed work, but we still need to address the core issue. The GridFS use case you mentioned may help us reproduce; thanks for the info. Please do let me know if you encounter this issue with SafeMode.Journal enabled.

Best Regards,
Ben

Comment by Maxime Beaudry [ 15/Aug/12 ]

Hi Ben,

This issue was encountered when inserting multiple files into GridFS. The files are inserted in a single-threaded tight loop using the Mongo C# driver. The issue was raised when uploading about 56000 files that range from 0 bytes to 514 MB. Note that only 410 files out of 56000 were larger than 256 KB (the GridFS chunk size).
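
For a sense of scale, the number of chunk documents a file produces follows from the sizes quoted above. This is a minimal Java sketch of that arithmetic, assuming the 256 KB chunk size stated in this comment and reading 514 MB as 514 × 1024 × 1024 bytes. The class and method names are illustrative only.

```java
/** Ceiling-division helper for estimating GridFS chunk counts (illustrative only). */
final class GridFsChunkMath {
    // Chunk size as stated in the comment above (256 KB).
    static final long CHUNK_SIZE = 256L * 1024;

    /** Number of chunks needed to hold fileBytes when each chunk holds chunkSize bytes. */
    static long chunkCount(long fileBytes, long chunkSize) {
        if (fileBytes < 0 || chunkSize <= 0) {
            throw new IllegalArgumentException("fileBytes must be >= 0 and chunkSize > 0");
        }
        return (fileBytes + chunkSize - 1) / chunkSize; // round up
    }

    public static void main(String[] args) {
        long largest = 514L * 1024 * 1024;
        System.out.println(chunkCount(largest, CHUNK_SIZE));        // prints 2056
        System.out.println(chunkCount(CHUNK_SIZE + 1, CHUNK_SIZE)); // prints 2
    }
}
```

Each chunk is written as a separate insert into the chunks collection, so the largest file in this test corresponds to roughly two thousand back-to-back inserts.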

I tried to use the workaround suggested by Jose but unfortunately the GetLastError method of the C# driver does not support the "j" option. I therefore changed my SafeMode object used when opening the database to have a similar behavior. Here is the code:

var safeMode = new SafeMode(true);
safeMode.Journal = true;

var connectionString = "mongodb://localhost";
m_mongoServer = MongoServer.Create(connectionString);
m_mongoDatabase = m_mongoServer.GetDatabase("myDb", safeMode);

I am currently trying again with this new `safeMode` and I can see that the journal file is always empty. I will let you know the outcome.

Comment by Jose Sebastian Battig [ 14/Aug/12 ]

Having the same issue. This "solves", from the client side, the problem of mongo running out of memory because the writer does not catch up:

...
    i = 0;
    while( ( n = fread( buffer, 1, READ_WRITE_BUF_SIZE, fd ) ) != 0 ) {
        gridfile_write_buffer( gfile, buffer, n );     
        if(i++ % 10 == 0) { /* Let's check every ten rounds */
          bson_init( &lastErrorCmd );
          bson_append_int( &lastErrorCmd, "getLastError", 1);
          bson_append_int( &lastErrorCmd, "j", 1); /* With this we tell Mongo to wait for Journal writing */
          bson_finish( &lastErrorCmd );
 
          bson_init( &lastError );
          mongo_run_command( conn, "test", &lastErrorCmd, &lastError );
 
          bson_destroy( &lastError );
          bson_destroy( &lastErrorCmd );
        }
    }
...
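
The same idea, pausing the writer every few buffers until the server has acknowledged the journal write, can be sketched in Java without tying it to a particular driver. In the sketch below, `ThrottledCopier`, `copy` and the `barrier` callback are hypothetical names. In real code the callback would be the place to issue a journal-acknowledged getLastError (or the driver's equivalent). The barrier fires after the first buffer and then after every `every`-th buffer, mirroring the `i++ % 10 == 0` check above.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

/** Copies a stream in fixed-size buffers and runs a barrier callback periodically. */
final class ThrottledCopier {
    /**
     * Copies in to out using a buffer of bufSize bytes. The barrier runs after
     * reads 1, every+1, 2*every+1, and so on. Returns the number of barrier calls.
     */
    static int copy(InputStream in, OutputStream out, int bufSize, int every, Runnable barrier)
            throws IOException {
        if (bufSize <= 0 || every <= 0) {
            throw new IllegalArgumentException("bufSize and every must be positive");
        }
        byte[] buffer = new byte[bufSize];
        int barriers = 0;
        long i = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
            if (i++ % every == 0) {
                barrier.run(); // hypothetical hook: wait here for the journal acknowledgement
                barriers++;
            }
        }
        return barriers;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[25 * 1024];
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        int calls = copy(new ByteArrayInputStream(data), sink, 1024, 10, () -> { });
        System.out.println(calls + " barrier calls, " + sink.size() + " bytes copied");
    }
}
```

Keeping the barrier as a callback leaves the choice of acknowledgement mechanism to the caller. The trade-off is the one described above: the writer is slowed down to the pace of the journal rather than being allowed to run ahead of it.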

Comment by Ben Becker [ 13/Aug/12 ]

Hi Maxime,

Was this encountered while inserting a single GridFS file or multiple files (concurrently)? Also, how large were these files?

Thanks,
Ben

Comment by Maxime Beaudry [ 11/Aug/12 ]

Hi Ben and Michel,

I am currently evaluating MongoDB and I had a very similar crash. Here is the output of my mongod.exe process on Windows 2008 R2 (x64) when I try to insert lots and lots of data through the C# GridFS API:

Fri Aug 10 16:24:12 [conn2339] command DIT.$cmd command: { filemd5: ObjectId('50256deb4921f623207b2ecb'), root: "fs" } ntoreturn:1 reslen:94 280ms
Fri Aug 10 16:24:13 [journal] warning assertion failure a <= 256*1024*1024 util\alignedbuilder.cpp 90
Fri Aug 10 16:24:15 [initandlisten] connection accepted from 127.0.0.1:57699 #2405
Fri Aug 10 16:24:15 [conn2405] end connection 127.0.0.1:57699
Fri Aug 10 16:24:22 [journal] warning assertion failure a <= 256*1024*1024 util\alignedbuilder.cpp 90
Fri Aug 10 16:24:25 [initandlisten] connection accepted from 127.0.0.1:57702 #2406
Fri Aug 10 16:24:25 [conn2406] end connection 127.0.0.1:57702
Fri Aug 10 16:24:35 [conn2339] command DIT.$cmd command: { filemd5: ObjectId('50256dec4921f623207b2fb0'), root: "fs" } ntoreturn:1 reslen:94 1778ms
Fri Aug 10 16:24:35 [initandlisten] connection accepted from 127.0.0.1:57706 #2407
Fri Aug 10 16:24:35 [conn2407] end connection 127.0.0.1:57706
Fri Aug 10 16:24:38 [conn2339] warning assertion failure a <= 256*1024*1024 util\alignedbuilder.cpp 90
Fri Aug 10 16:24:45 [initandlisten] connection accepted from 127.0.0.1:57709 #2408
Fri Aug 10 16:24:45 [conn2408] end connection 127.0.0.1:57709
Fri Aug 10 16:24:53 [conn2339] warning assertion failure a <= 256*1024*1024 util\alignedbuilder.cpp 90
Fri Aug 10 16:24:55 [initandlisten] connection accepted from 127.0.0.1:57711 #2409
Fri Aug 10 16:24:55 [conn2409] end connection 127.0.0.1:57711
Fri Aug 10 16:25:02 [conn2339] warning assertion failure a <= 256*1024*1024 util\alignedbuilder.cpp 90
Fri Aug 10 16:25:02 [conn2339] Assertion failure a <= 512*1024*1024 util\alignedbuilder.cpp 91
Fri Aug 10 16:25:02 [conn2339] dbexception in groupCommit causing immediate shutdown: 0 assertion util\alignedbuilder.cpp:91
Fri Aug 10 16:25:02 gc1

Have you guys made any progress on this bug? Can I help in some way? If this is not fixed, is there any workaround that you suggest?

Comment by Eliot Horowitz (Inactive) [ 03/Jul/12 ]

@jose - that sounds different. Can you open a new ticket? Also, what version on Windows?

Comment by Jose Sebastian Battig [ 03/Jul/12 ]

I'm having the exact same problem on a virtualized Windows 2008 R2 64-bit machine. I tried the scenario with limited RAM available to mongod as well as with unlimited ability to allocate RAM. mongod uses up to 5 GB of RAM, and once the DB grows to about 8 GB it crashes. In all attempts so far it crashed at more or less the same place (I can't tell exactly, but it was always running the same process after the DB grew to the mentioned size).
I'm using GridFS, but it doesn't seem to be a GridFS issue per se; rather, it looks like a problem related to the size of the objects or the rate of insertion.
Anyway, this is a showstopper for getting this to be a production-ready system. I'm stress testing it, as another person commented above, and it is very discouraging that Mongo can't handle slowdowns in the IO subsystem, if that is what is causing it to crash so badly.

Comment by Somsak Sriprayoonsakul [ 08/May/12 ]

We have the same problem. We tried to put a 1GB file into GridFS and, from time to time, mongod crashes with a similar error message.

Platform: CentOS 6.2 on X86_64
Server: 24GB Memory where mongo dbpath on ext4
Mongodb version: db version v2.0.4, pdfile version 4.5
Tue May 8 22:54:18 git version: 329f3c47fe8136c03392c8f0e548506cb21f8ebf
(installed from debian 6 repository)
NOTE: This machine also hosts an OpenVZ VM. mongod itself does not run inside the VM; it runs on the physical host that runs the VM.

Here is the error message:

Tue May 8 22:36:55 [conn4] insert ark.objectcontent.chunks 657ms
Tue May 8 22:36:55 [journal] warning assertion failure a <= 256*1024*1024 util/alignedbuilder.cpp 90
0x57a396 0x583d2a 0x75c683 0x769eba 0x76a5bf 0x76a7f5 0x7622bf 0x7629bd 0x762cdd 0x76346b 0xaab3e0 0x367e8077f1 0x367e4e5ccd
/usr/bin/mongod(_ZN5mongo12sayDbContextEPKc+0x96) [0x57a396]
/usr/bin/mongod(_ZN5mongo9wassertedEPKcS1_j+0x11a) [0x583d2a]
/usr/bin/mongod(_ZN5mongo14AlignedBuilder14growReallocateEj+0x63) [0x75c683]
/usr/bin/mongod(_ZN5mongo3dur21prepBasicWrite_inlockERNS_14AlignedBuilderEPKNS0_11WriteIntentERNS_12RelativePathE+0x2fa) [0x769eba]
/usr/bin/mongod(_ZN5mongo3dur15prepBasicWritesERNS_14AlignedBuilderE+0x6f) [0x76a5bf]
/usr/bin/mongod(_ZN5mongo3dur13PREPLOGBUFFERERNS0_11JSectHeaderE+0x75) [0x76a7f5]
/usr/bin/mongod(_ZN5mongo3dur28_groupCommitWithLimitedLocksEv+0x15f) [0x7622bf]
/usr/bin/mongod(_ZN5mongo3dur27groupCommitWithLimitedLocksEv+0x1d) [0x7629bd]
/usr/bin/mongod() [0x762cdd]
/usr/bin/mongod(_ZN5mongo3dur9durThreadEv+0x10b) [0x76346b]
/usr/bin/mongod(thread_proxy+0x80) [0xaab3e0]
/lib64/libpthread.so.0() [0x367e8077f1]
/lib64/libc.so.6(clone+0x6d) [0x367e4e5ccd]
Tue May 8 22:36:58 [conn4] insert ark.objectcontent.chunks 110ms
Tue May 8 22:36:59 [conn4] insert ark.objectcontent.chunks 113ms
Tue May 8 22:37:00 [clientcursormon] mem (MB) res:3838 virt:406702 mapped:202606
Tue May 8 22:37:12 [journal] old journal file will be removed: /vz/mongodb/journal/j._0
Tue May 8 22:37:14 [journal] DR101 latency warning on journal file open 955ms
Tue May 8 22:37:14 [conn4] warning assertion failure a <= 256*1024*1024 util/alignedbuilder.cpp 90
0x57a396 0x583d2a 0x75c683 0x769eba 0x76a5bf 0x76a7f5 0x760529 0x76548a 0x88c168 0x88e895 0xaa33f6 0x637407 0x367e8077f1 0x367e4e5ccd
/usr/bin/mongod(_ZN5mongo12sayDbContextEPKc+0x96) [0x57a396]
/usr/bin/mongod(_ZN5mongo9wassertedEPKcS1_j+0x11a) [0x583d2a]
/usr/bin/mongod(_ZN5mongo14AlignedBuilder14growReallocateEj+0x63) [0x75c683]
/usr/bin/mongod(_ZN5mongo3dur21prepBasicWrite_inlockERNS_14AlignedBuilderEPKNS0_11WriteIntentERNS_12RelativePathE+0x2fa) [0x769eba]
/usr/bin/mongod(_ZN5mongo3dur15prepBasicWritesERNS_14AlignedBuilderE+0x6f) [0x76a5bf]
/usr/bin/mongod(_ZN5mongo3dur13PREPLOGBUFFERERNS0_11JSectHeaderE+0x75) [0x76a7f5]
/usr/bin/mongod() [0x760529]
/usr/bin/mongod(_ZN5mongo9writelockD1Ev+0xba) [0x76548a]
/usr/bin/mongod(_ZN5mongo14receivedInsertERNS_7MessageERNS_5CurOpE+0x468) [0x88c168]
/usr/bin/mongod(_ZN5mongo16assembleResponseERNS_7MessageERNS_10DbResponseERKNS_11HostAndPortE+0x11c5) [0x88e895]
/usr/bin/mongod(_ZN5mongo16MyMessageHandler7processERNS_7MessageEPNS_21AbstractMessagingPortEPNS_9LastErrorE+0x76) [0xaa33f6]
/usr/bin/mongod(_ZN5mongo3pms9threadRunEPNS_13MessagingPortE+0x287) [0x637407]
/lib64/libpthread.so.0() [0x367e8077f1]
/lib64/libc.so.6(clone+0x6d) [0x367e4e5ccd]
Tue May 8 22:37:14 [conn4] rate limiting wassert
Tue May 8 22:37:15 [conn4] Assertion failure a <= 512*1024*1024 util/alignedbuilder.cpp 91
0x57a396 0x5851bb 0x75c69d 0x769eba 0x76a5bf 0x76a7f5 0x760529 0x76548a 0x88c168 0x88e895 0xaa33f6 0x637407 0x367e8077f1 0x367e4e5ccd
/usr/bin/mongod(_ZN5mongo12sayDbContextEPKc+0x96) [0x57a396]
/usr/bin/mongod(_ZN5mongo8assertedEPKcS1_j+0xfb) [0x5851bb]
/usr/bin/mongod(_ZN5mongo14AlignedBuilder14growReallocateEj+0x7d) [0x75c69d]
/usr/bin/mongod(_ZN5mongo3dur21prepBasicWrite_inlockERNS_14AlignedBuilderEPKNS0_11WriteIntentERNS_12RelativePathE+0x2fa) [0x769eba]
/usr/bin/mongod(_ZN5mongo3dur15prepBasicWritesERNS_14AlignedBuilderE+0x6f) [0x76a5bf]
/usr/bin/mongod(_ZN5mongo3dur13PREPLOGBUFFERERNS0_11JSectHeaderE+0x75) [0x76a7f5]
/usr/bin/mongod() [0x760529]
/usr/bin/mongod(_ZN5mongo9writelockD1Ev+0xba) [0x76548a]
/usr/bin/mongod(_ZN5mongo14receivedInsertERNS_7MessageERNS_5CurOpE+0x468) [0x88c168]
/usr/bin/mongod(_ZN5mongo16assembleResponseERNS_7MessageERNS_10DbResponseERKNS_11HostAndPortE+0x11c5) [0x88e895]
/usr/bin/mongod(_ZN5mongo16MyMessageHandler7processERNS_7MessageEPNS_21AbstractMessagingPortEPNS_9LastErrorE+0x76) [0xaa33f6]
/usr/bin/mongod(_ZN5mongo3pms9threadRunEPNS_13MessagingPortE+0x287) [0x637407]
/lib64/libpthread.so.0() [0x367e8077f1]
/lib64/libc.so.6(clone+0x6d) [0x367e4e5ccd]
Tue May 8 22:37:15 [conn4] dbexception in groupCommit causing immediate shutdown: 0 assertion util/alignedbuilder.cpp:91
Tue May 8 22:37:15 gc1
Tue May 8 22:37:15 Got signal: 6 (Aborted).

Tue May 8 22:37:15 Backtrace:
0xa90999 0x367e432900 0x367e432885 0x367e434065 0x884737 0x760851 0x76548a 0x88c168 0x88e895 0xaa33f6 0x637407 0x367e8077f1 0x367e4e5ccd
/usr/bin/mongod(_ZN5mongo10abruptQuitEi+0x399) [0xa90999]
/lib64/libc.so.6() [0x367e432900]
/lib64/libc.so.6(gsignal+0x35) [0x367e432885]
/lib64/libc.so.6(abort+0x175) [0x367e434065]
/usr/bin/mongod(_ZN5mongo10mongoAbortEPKc+0x47) [0x884737]
/usr/bin/mongod() [0x760851]
/usr/bin/mongod(_ZN5mongo9writelockD1Ev+0xba) [0x76548a]
/usr/bin/mongod(_ZN5mongo14receivedInsertERNS_7MessageERNS_5CurOpE+0x468) [0x88c168]
/usr/bin/mongod(_ZN5mongo16assembleResponseERNS_7MessageERNS_10DbResponseERKNS_11HostAndPortE+0x11c5) [0x88e895]
/usr/bin/mongod(_ZN5mongo16MyMessageHandler7processERNS_7MessageEPNS_21AbstractMessagingPortEPNS_9LastErrorE+0x76) [0xaa33f6]
/usr/bin/mongod(_ZN5mongo3pms9threadRunEPNS_13MessagingPortE+0x287) [0x637407]
/lib64/libpthread.so.0() [0x367e8077f1]
/lib64/libc.so.6(clone+0x6d) [0x367e4e5ccd]

Logstream::get called in uninitialized state
Tue May 8 22:37:15 ERROR: Client::shutdown not called: conn

Comment by Michel Brazeau [ 10/Apr/12 ]

Hi Ben,

I set my page file size to "manual settings" with a minimum of 8000MB
and a maximum of 36000MB and I get the same error:

Tue Apr 10 17:01:09 [conn1] warning assertion failure a <= 256*1024*1024 util\alignedbuilder.cpp 90
Tue Apr 10 17:01:09 [conn1] rate limiting wassert

Then the server shuts down. The log seems identical to before. I'd be very
surprised if it's a resource problem.

Michel



Michel Brazeau, B.Eng
Chief Technical Officer
DocuData Software Inc.
www.docudatasoft.com
Phone: 514-789-2789 Ext. 239
Fax: 514-271-4012
Toll Free: 1-866-789-2789 Ext. 239

Comment by Michel Brazeau [ 10/Apr/12 ]

Hi Ben,

I checked all the system logs in the Windows Event Viewer for the date/time
Mon Apr 09 11:40:42, when the fatal error occurred. I don't see any memory
or disk related events and really nothing at that exact time.

As far as the system is concerned, it's a Core i7 with 8GB of RAM and 337GB
free on the hard disk. It would be a bit surprising if it were a resource problem. And
this error is reproducible all the time, even after a fresh reboot. My
virtual memory page file size is indicated as 8183MB in the System Properties |
Advanced | Performance | Settings | Advanced options.

Later today I'll try to run a test with something like 4 times the current
page file size to see if that makes a difference.

It is obviously a large stress test. Large records (128KB) are inserted
very quickly and the database grows from 0 bytes to more than 18GB in a few
minutes. This is not a production system, just a stress test on a single
server instance.

Michel

Comment by Ben Becker [ 09/Apr/12 ]

Hi Michel,

Could you check your system event logs for any memory or disk related events? Also, how large is your page file?

In your case it looks like the journal is growing faster than the OS will allow (due to VM or disk contention). This might be indicative of a lack of available resources, in which case I would recommend adding more shards and/or upgrading the server. That being said, we still need to address how mongod.exe encounters and handles this case. I'm still tracking that down and will update this issue as soon as I know more.

Comment by Michel Brazeau [ 09/Apr/12 ]

Hi Ben,

Thanks for the reply.

No, I am not using GridFS, just normal inserts, but I am implementing my own
scheme similar to GridFS.

I'm using Delphi with a modified version of
http://code.google.com/p/pebongo/

My test suite has too many internal dependencies to send out as a whole.

I changed my test case since then by reducing the size of data. But this
morning I did revert my code base and ran again. The failure is
reproducible. I attached my full verbose log that leads to the failure.

The error seems to occur twice. On line 159302 it doesn't seem fatal, but
on line 183644 the server shuts down.

I hope this helps find the problem. We plan on using MongoDB in production
in a few months for our online document management web app,

http://www.myactiverecords.com

to handle the storage and retrieval of electronic records.

Best regards,

Michel


Comment by Ben Becker [ 06/Apr/12 ]

Hi Michel,

When you encountered this, were you writing a GridFS file or executing normal inserts? Is this reproducible? Could you attach the logs?

Thanks!

Comment by Michel Brazeau [ 04/Apr/12 ]

Hi Ben,

I'm getting the same error, and the server crashes on 64-bit Windows in a test case that inserts many records of about 128KB each into a large 18GB database.

mongod -version gives:

C:\mongodb\bin>mongod -version
db version v2.0.1, pdfile version 4.5
Wed Apr 04 15:08:33 git version: 3a5cf0e2134a830d38d2d1aae7e88cac31bdd684

Thanks,

Michel

Comment by Ben Becker [ 09/Feb/12 ]

Thanks Ilya; will look into this issue soon.

Comment by Ilya Katsov [ 08/Feb/12 ]

Hi Ben,

Driver version is 2.7.2, mongod version is 2.0.2, 64-bit Linux host.

Comment by Ben Becker [ 07/Feb/12 ]

Hi Ilya,

Can you confirm which versions of mongod and the Java driver you are using? Also, is this a 64-bit Linux host?
