[SERVER-9235] Better messaging/logging when disks are inaccessible Created: 04/Apr/13  Updated: 10/Dec/14  Resolved: 04/Apr/13

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: 2.2.3
Fix Version/s: None

Type: Improvement Priority: Major - P3
Reporter: Jason Zucchetto Assignee: Unassigned
Resolution: Duplicate Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Ubuntu/AWS/EBS


Issue Links:
Duplicate
duplicates SERVER-7591 MongoDB cannot write to disk, it shou... Closed
Related
Participants:

 Description   

We ran into a serious problem with EBS volumes becoming intermittently inaccessible. When a volume becomes inaccessible, MongoDB logs the stack trace below and shuts down. Clearer messaging in the logs would help users diagnose this problem faster.

I believe this can be recreated by unmounting a drive while MongoDB is running:

dbexception in groupCommit causing immediate shutdown: 0 assertion src/mongo/util/alignedbuilder.cpp:91
 
Assertion failure a <= 512*1024*1024 src/mongo/util/alignedbuilder.cpp 91
0xb07561 0xacdb5d 0xacc36d 0x73dda5 0x73e0b4 0x731024 0x73157a 0x72ef8c 0x72f09c 0x9ac179 0x9ab9ef 0xadab5d 0xb4d3d9 0x7f99bbfa6d8c 0x7f99bb347fdd
/usr/bin/mongod(_ZN5mongo15printStackTraceERSo+0x21) [0xb07561]
/usr/bin/mongod(_ZN5mongo12verifyFailedEPKcS1_j+0xfd) [0xacdb5d]
/usr/bin/mongod(_ZN5mongo14AlignedBuilder14growReallocateEj+0x7d) [0xacc36d]
/usr/bin/mongod() [0x73dda5]
/usr/bin/mongod(_ZN5mongo3dur13PREPLOGBUFFERERNS0_11JSectHeaderERNS_14AlignedBuilderE+0x214) [0x73e0b4]
/usr/bin/mongod() [0x731024]
/usr/bin/mongod(_ZN5mongo3dur11DurableImpl9commitNowEv+0x1a) [0x73157a]
/usr/bin/mongod(_ZN5mongo3dur11DurableImpl16_aCommitIsNeededEv+0x14c) [0x72ef8c]
/usr/bin/mongod(_ZN5mongo3dur11DurableImpl14commitIfNeededEb+0x4c) [0x72f09c]
/usr/bin/mongod(_ZN5mongo7replset8SyncTail9syncApplyERKNS_7BSONObjEb+0x2a9) [0x9ac179]
/usr/bin/mongod(_ZN5mongo7replset14multiSyncApplyERKSt6vectorINS_7BSONObjESaIS2_EEPNS0_8SyncTailE+0x4f) [0x9ab9ef]
/usr/bin/mongod(_ZN5mongo10threadpool6Worker4loopEv+0x26d) [0xadab5d]
/usr/bin/mongod() [0xb4d3d9]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x6d8c)
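
To illustrate the kind of message requested above, here is a minimal sketch of a standalone probe that reports in plain language whether a data directory can be written to. It is not MongoDB code: the helper name check_dbpath and the message wording are hypothetical, and it assumes that an unmounted or inaccessible volume shows up either as a missing directory or as an OSError on write.

```python
import os
import tempfile


def check_dbpath(dbpath):
    """Return (ok, message) describing whether dbpath can be written to.

    Hypothetical helper for illustration only; not part of MongoDB.
    """
    if not os.path.isdir(dbpath):
        return False, (
            f"data directory {dbpath!r} is missing or not a directory "
            "(has the volume been unmounted?)"
        )
    try:
        # Write and fsync a small temporary file so the check exercises
        # an actual disk write rather than only reading permission bits.
        with tempfile.NamedTemporaryFile(dir=dbpath, prefix=".probe-") as probe:
            probe.write(b"ok")
            probe.flush()
            os.fsync(probe.fileno())
    except OSError as exc:
        return False, (
            f"data directory {dbpath!r} is not writable: "
            f"{exc.strerror or exc}"
        )
    return True, f"data directory {dbpath!r} is accessible and writable"
```

Calling check_dbpath on the directory passed to mongod as its dbpath and logging the returned message would give an operator a direct statement about disk accessibility next to the assertion output.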



 Comments   
Comment by Jonathan N [ 26/Sep/13 ]

The Bad News:
--------------------
I am having the same issue while running a map reduce operation on a single node.
With a smaller data set, the operation finishes as expected. However, once the data set exceeds 2 million records, the server crashes at the end of the map reduce operation.
Running on a VM with:
OS: CentOS release 6.4 (Final)
RAM: 8GB
Disk Space: 200GB
MongoDB Version: v2.4.6 (b9925db5eac369d77a3a5f5d98a145eaaacd9673)
Here is the output log: https://gist.github.com/jnonon/a0c7f6301b14f3ca4fea

The Good News:
----------------------
After setting the option nojournal=true in the configuration file, the map reduce task completed successfully.

The Question:
------------------
Is there any way to fix this issue without disabling journaling? Thanks in advance for your help.
