Loading...

Type: Question
Resolution: Done
Priority: Major - P3
Fix Version/s: None
Affects Version/s: 2.6.11
Component/s: GridFS, MMAPv1
Labels:
None
Environment:
CentOS 6.7
Data stored on XFS
17TB disk space allocated, only 84% full
Server has 24GB ram

Confidence Status:
None
Work Order:
3
CAR Domain/s:
None

Aha! Reference:
None
Tracking Level:
None
Risk Status:
None
Exec Notes:
None
Goal Name(s):
None
Goal Link:
None

I'm having trouble deciphering the reason for this crash.

2016-05-28T02:48:01.609-0700 [FileAllocator] allocating new datafile /srv/db/mongo/data/loadable_store.6760, filling 
with zeroes...
2016-05-28T02:48:01.633-0700 [FileAllocator] done allocating datafile /srv/db/mongo/data/loadable_store.6760, size: 2
047MB,  took 0.023 secs
2016-05-28T02:48:01.636-0700 [conn56195] insert loadable_store.fs.chunks query: { files_id: ObjectId('574968e3ca6d39a3d5125868'), n: 20, data: BinData(0, C639B837E2C6FD1E1F2C2CF7B581106CF062C22F2F031566A47725A393588F36DA2D342159A5F601E2B449494D9C39E9F487E806BA11322E55A407121DD77A00E7F01CCA9B2C...), _id: ObjectId('574968e3ca6d39a3d512587d') } ninserted:1 keyUpdates:0 numYields:0 locks(micros) w:110316411 110316ms
2016-05-28T02:48:01.636-0700 [conn56195] command loadable_store.$cmd command: insert { insert: "fs.chunks", documents: [ { files_id: ObjectId('574968e3ca6d39a3d5125868'), n: 20, data: BinData(0, C639B837E2C6FD1E1F2C2CF7B581106CF062C22F2F031566A47725A393588F36DA2D342159A5F601E2B449494D9C39E9F487E806BA11322E55A407121DD77A00E7F01CCA9B2C...), _id: ObjectId('574968e3ca6d39a3d512587d') } ], ordered: true, writeConcern: { w: 1 } } keyUpdates:0 numYields:0 locks(micros) w:67 reslen:40 110316ms
2016-05-28T02:48:09.457-0700 [journal] warning assertion failure a <= 256*1024*1024 src/mongo/util/alignedbuilder.cpp 103
2016-05-28T02:48:09.556-0700 [journal] 0x121d451 0x11bcbb9 0x119f8fe 0x119f063 0xa794bf 0xa79774 0xa694d3 0xa6bc39 0xa6bf77 0x1262219 0x7f1898b00a51 0x7f1897ea593d 
 /usr/bin/mongod(_ZN5mongo15printStackTraceERSo+0x21) [0x121d451]
 /usr/bin/mongod(_ZN5mongo10logContextEPKc+0x159) [0x11bcbb9]
 /usr/bin/mongod(_ZN5mongo9wassertedEPKcS1_j+0x16e) [0x119f8fe]
 /usr/bin/mongod(_ZN5mongo14AlignedBuilder14growReallocateEj+0x63) [0x119f063]
 /usr/bin/mongod() [0xa794bf]
 /usr/bin/mongod(_ZN5mongo3dur13PREPLOGBUFFERERNS0_11JSectHeaderERNS_14AlignedBuilderE+0x214) [0xa79774]
 /usr/bin/mongod(_ZN5mongo3dur27groupCommitWithLimitedLocksEv+0xa3) [0xa694d3]
 /usr/bin/mongod() [0xa6bc39]
 /usr/bin/mongod(_ZN5mongo3dur9durThreadEv+0x297) [0xa6bf77]
 /usr/bin/mongod() [0x1262219]
 /lib64/libpthread.so.0(+0x7a51) [0x7f1898b00a51]
 /lib64/libc.so.6(clone+0x6d) [0x7f1897ea593d]
2016-05-28T02:48:10.206-0700 [journal] rate limiting wassert
2016-05-28T02:48:13.558-0700 [journal] Assertion failure a <= 512*1024*1024 src/mongo/util/alignedbuilder.cpp 104
2016-05-28T02:48:13.574-0700 [journal] 0x121d451 0x11bcbb9 0x11a1124 0x119f07d 0xa794bf 0xa79774 0xa694d3 0xa6bc39 0xa6bf77 0x1262219 0x7f1898b00a51 0x7f1897ea593d 
 /usr/bin/mongod(_ZN5mongo15printStackTraceERSo+0x21) [0x121d451]
 /usr/bin/mongod(_ZN5mongo10logContextEPKc+0x159) [0x11bcbb9]
 /usr/bin/mongod(_ZN5mongo12verifyFailedEPKcS1_j+0x194) [0x11a1124]
 /usr/bin/mongod(_ZN5mongo14AlignedBuilder14growReallocateEj+0x7d) [0x119f07d]
 /usr/bin/mongod() [0xa794bf]
 /usr/bin/mongod(_ZN5mongo3dur13PREPLOGBUFFERERNS0_11JSectHeaderERNS_14AlignedBuilderE+0x214) [0xa79774]
 /usr/bin/mongod(_ZN5mongo3dur27groupCommitWithLimitedLocksEv+0xa3) [0xa694d3]
 /usr/bin/mongod() [0xa6bc39]
 /usr/bin/mongod(_ZN5mongo3dur9durThreadEv+0x297) [0xa6bf77]
 /usr/bin/mongod() [0x1262219]
 /lib64/libpthread.so.0(+0x7a51) [0x7f1898b00a51]
 /lib64/libc.so.6(clone+0x6d) [0x7f1897ea593d]
2016-05-28T02:48:13.584-0700 [journal] dbexception in groupCommitLL causing immediate shutdown: 0 assertion src/mongo/util/alignedbuilder.cpp:104
2016-05-28T02:48:13.584-0700 [journal] SEVERE: dur1
2016-05-28T02:48:13.584-0700 [conn56192] remove loadable_store.fs.chunks query: { _id: ObjectId('574968e3ca6d39a3d512587e') } ndeleted:0 keyUpdates:0 numYields:0 locks(micros) w:69 11945ms
2016-05-28T02:48:13.584-0700 [conn56192] command loadable_store.$cmd command: delete { delete: "fs.chunks", deletes: [ { q: { _id: ObjectId('574968e3ca6d39a3d512587e') }, limit: 0 } ], ordered: true, writeConcern: { w: 1 } } keyUpdates:0 numYields:0  reslen:40 11945ms
2016-05-28T02:48:13.594-0700 [conn56221] update wf_file_storage.fs.chunks query: { _id: ObjectId('57496953671615d0d649e699') } update: { files_id: ObjectId('57496953671615d0d649e698'), n: 0, data: BinData(0, 1F8B0800000000000003ECB8075453DDB72F8A20A028161050A409D2114808094D308400A1871E3AA184164A802420458A2220D24191A280F42A20557A9122BD8A3401E9BDF7...), _id: ObjectId('57496953671615d0d649e699') } nscanned:0 nscannedObjects:0 nMatched:1 nModified:0 upsert:1 keyUpdates:0 numYields:0 locks(micros) w:9152 10019ms
2016-05-28T02:48:13.594-0700 [conn56221] command wf_file_storage.$cmd command: update { update: "fs.chunks", updates: [ { q: { _id: ObjectId('57496953671615d0d649e699') }, u: { files_id: ObjectId('57496953671615d0d649e698'), n: 0, data: BinData(0, 1F8B0800000000000003ECB8075453DDB72F8A20A028161050A409D2114808094D308400A1871E3AA184164A802420458A2220D24191A280F42A20557A9122BD8A3401E9BDF7...), _id: ObjectId('57496953671615d0d649e699') }, upsert: true, multi: false } ], ordered: true, writeConcern: { w: 1 } } keyUpdates:0 numYields:0  reslen:106 10019ms
2016-05-28T02:48:13.596-0700 [journal] SEVERE: Got signal: 6 (Aborted).
Backtrace:0x121d451 0x121c82e 0x7f1897def6a0 0x7f1897def625 0x7f1897df0e05 0xba749b 0xa69a16 0xa6bc39 0xa6bf77 0x1262219 0x7f1898b00a51 0x7f1897ea593d 
 /usr/bin/mongod(_ZN5mongo15printStackTraceERSo+0x21) [0x121d451]
 /usr/bin/mongod() [0x121c82e]
 /lib64/libc.so.6(+0x326a0) [0x7f1897def6a0]
 /lib64/libc.so.6(gsignal+0x35) [0x7f1897def625]
 /lib64/libc.so.6(abort+0x175) [0x7f1897df0e05]
 /usr/bin/mongod(_ZN5mongo10mongoAbortEPKc+0x6b) [0xba749b]
 /usr/bin/mongod(_ZN5mongo3dur27groupCommitWithLimitedLocksEv+0x5e6) [0xa69a16]
 /usr/bin/mongod() [0xa6bc39]
 /usr/bin/mongod(_ZN5mongo3dur9durThreadEv+0x297) [0xa6bf77]
 /usr/bin/mongod() [0x1262219]
 /lib64/libpthread.so.0(+0x7a51) [0x7f1898b00a51]
 /lib64/libc.so.6(clone+0x6d) [0x7f1897ea593d]
2016-05-28T03:04:58.207-0700 ***** SERVER RESTARTED *****

Doing a little searching, i see this error occur frequently due to the journal getting backed up, where the journal has too much data that is unflushed to the datafiles. I don't see that DR102 error in my log, so I'm curious why I got this error and what I could do to prevent it.

As far as my configuration, there is only 1 server, no replication, no sharding, that we use for storing large files using gridfs. The database size is somewhere in the 14TB range. We use node.js for the bulk of the code we write that communicates with the mongo server.

The events leading up to this crash are pretty simple. It crashed during a time where typically nothing is happening to the server, however, we have a build process that ended up sending 16 150MB files to be stored in mongo each about 30 seconds apart. I don't typically have issues with this kind of load, but on this occasion I did.

So what causes it? Memory allocation? Fragmentation? Is there something I can do to prevent it?

Details

Description

Attachments

Forms

Activity

People

Dates