[SERVER-23801] Server Crash after error Created: 19/Apr/16  Updated: 22/Apr/16  Resolved: 22/Apr/16

Status: Closed
Project: Core Server
Component/s: Admin
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Alexandre [X] Assignee: Unassigned
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: HTML File errormongo    
Operating System: ALL
Participants:

 Description   

I don't know why my mongo server crashes:

2016-04-19T16:16:01.884+0200 D WRITE    [conn7568] Caught WriteConflictException doing plan execution on myDB.myCollection, attempt: 18 retrying
2016-04-19T16:16:01.884+0200 D WRITE    [conn5684] Caught WriteConflictException doing plan execution on myDB.myCollection, attempt: 18 retrying
2016-04-19T16:16:01.886+0200 I CONTROL  [conn6294] 
 0x1315022 0x12b32f8 0x12a01dd 0x1095044 0x10939c0 0x107846b 0xcb2415 0xbf9b6a 0xc2e9b4 0xe2af85 0xe2b649 0xe2b745 0xba7ab9 0xba9be5 0xba9fad 0xbad0d8 0xbc5a23 0xbc6894 0xb228e0 0xcd1c05 0xcd4496 0x9b74ac 0x12c29dd 0x7f45ac25c182 0x7f45abf8947d
----- BEGIN BACKTRACE -----
{"backtrace":[{"b":"400000","o":"F15022","s":"_ZN5mongo15printStackTraceERSo"},{"b":"400000","o":"EB32F8","s":"_ZN5mongo10logContextEPKc"},{"b":"400000","o":"EA01DD","s":"_ZN5mongo17invariantOKFailedEPKcRKNS_6StatusES1_j"},{"b":"400000","o":"C95044","s":"_ZN5mongo17WiredTigerSession9getCursorERKSsmb"},{"b":"400000","o":"C939C0","s":"_ZN5mongo16WiredTigerCursorC1ERKSsmbPNS_16OperationContextE"},{"b":"400000","o":"C7846B","s":"_ZNK5mongo21WiredTigerIndexUnique9newCursorEPNS_16OperationContextEb"},{"b":"400000","o":"8B2415","s":"_ZNK5mongo17IndexAccessMethod10findSingleEPNS_16OperationContextERKNS_7BSONObjE"},{"b":"400000","o":"7F9B6A","s":"_ZN5mongo11IDHackStage4workEPm"},{"b":"400000","o":"82E9B4","s":"_ZN5mongo11UpdateStage4workEPm"},{"b":"400000","o":"A2AF85","s":"_ZN5mongo12PlanExecutor11getNextImplEPNS_11SnapshottedINS_7BSONObjEEEPNS_8RecordIdE"},{"b":"400000","o":"A2B649","s":"_ZN5mongo12PlanExecutor7getNextEPNS_7BSONObjEPNS_8RecordIdE"},{"b":"400000","o":"A2B745","s":"_ZN5mongo12PlanExecutor11executePlanEv"},{"b":"400000","o":"7A7AB9","s":"_ZN5mongo18WriteBatchExecutor10execUpdateERKNS_12BatchItemRefEPNS_7BSONObjEPPNS_16WriteErrorDetailE"},{"b":"400000","o":"7A9BE5","s":"_ZN5mongo18WriteBatchExecutor11bulkExecuteERKNS_21BatchedCommandRequestEPSt6vectorIPNS_19BatchedUpsertDetailESaIS6_EEPS4_IPNS_16WriteErrorDetailESaISB_EE"},{"b":"400000","o":"7A9FAD","s":"_ZN5mongo18WriteBatchExecutor12executeBatchERKNS_21BatchedCommandRequestEPNS_22BatchedCommandResponseE"},{"b":"400000","o":"7AD0D8","s":"_ZN5mongo8WriteCmd3runEPNS_16OperationContextERKSsRNS_7BSONObjEiRSsRNS_14BSONObjBuilderE"},{"b":"400000","o":"7C5A23","s":"_ZN5mongo7Command3runEPNS_16OperationContextERKNS_3rpc16RequestInterfaceEPNS3_21ReplyBuilderInterfaceE"},{"b":"400000","o":"7C6894","s":"_ZN5mongo7Command11execCommandEPNS_16OperationContextEPS0_RKNS_3rpc16RequestInterfaceEPNS4_21ReplyBuilderInterfaceE"},{"b":"400000","o":"7228E0","s":"_ZN5mongo11runCommandsEPNS_16OperationContextERKNS_3rpc16RequestInterf
aceEPNS2_21ReplyBuilderInterfaceE"},{"b":"400000","o":"8D1C05"},{"b":"400000","o":"8D4496","s":"_ZN5mongo16assembleResponseEPNS_16OperationContextERNS_7MessageERNS_10DbResponseERKNS_11HostAndPortE"},{"b":"400000","o":"5B74AC","s":"_ZN5mongo16MyMessageHandler7processERNS_7MessageEPNS_21AbstractMessagingPortE"},{"b":"400000","o":"EC29DD","s":"_ZN5mongo17PortMessageServer17handleIncomingMsgEPv"},{"b":"7F45AC254000","o":"8182"},{"b":"7F45ABE8F000","o":"FA47D","s":"clone"}],"processInfo":{ "mongodbVersion" : "3.2.5", "gitVersion" : "34e65e5383f7ea1726332cb175b73077ec4a1b02", "compiledModules" : [], "uname" : { "sysname" : "Linux", "release" : "3.13.0-74-generic", "version" : "#118-Ubuntu SMP Thu Dec 17 22:52:10 UTC 2015", "machine" : "x86_64" }, "somap" : [ { "elfType" : 2, "b" : "400000", "buildId" : "8BD0E2ADD4592C91BBADCA1EEBC2B002DF5555A6" }, { "b" : "7FFF0F7F7000", "elfType" : 3, "buildId" : "DC075B751E9FB361F14CD59BD81300A6BB5CB377" }, { "b" : "7F45AD176000", "path" : "/lib/x86_64-linux-gnu/libssl.so.1.0.0", "elfType" : 3, "buildId" : "E21720F2804EF30440F2B39CD409252C26F58F73" }, { "b" : "7F45ACD9A000", "path" : "/lib/x86_64-linux-gnu/libcrypto.so.1.0.0", "elfType" : 3, "buildId" : "9BC22F9457E3D7E9CF8DDC135C0DAC8F7742135D" }, { "b" : "7F45ACB92000", "path" : "/lib/x86_64-linux-gnu/librt.so.1", "elfType" : 3, "buildId" : "92FCF41EFE012D6186E31A59AD05BDBB487769AB" }, { "b" : "7F45AC98E000", "path" : "/lib/x86_64-linux-gnu/libdl.so.2", "elfType" : 3, "buildId" : "C1AE4CB7195D337A77A3C689051DABAA3980CA0C" }, { "b" : "7F45AC688000", "path" : "/lib/x86_64-linux-gnu/libm.so.6", "elfType" : 3, "buildId" : "1D76B71E905CB867B27CEF230FCB20F01A3178F5" }, { "b" : "7F45AC472000", "path" : "/lib/x86_64-linux-gnu/libgcc_s.so.1", "elfType" : 3, "buildId" : "8D0AA71411580EE6C08809695C3984769F25725B" }, { "b" : "7F45AC254000", "path" : "/lib/x86_64-linux-gnu/libpthread.so.0", "elfType" : 3, "buildId" : "9318E8AF0BFBE444731BB0461202EF57F7C39542" }, { "b" : "7F45ABE8F000", "path" : 
"/lib/x86_64-linux-gnu/libc.so.6", "elfType" : 3, "buildId" : "30C94DC66A1FE95180C3D68D2B89E576D5AE213C" }, { "b" : "7F45AD3D5000", "path" : "/lib64/ld-linux-x86-64.so.2", "elfType" : 3, "buildId" : "9F00581AB3C73E3AEA35995A0C50D24D59A01D47" } ] }}
 mongod(_ZN5mongo15printStackTraceERSo+0x32) [0x1315022]
 mongod(_ZN5mongo10logContextEPKc+0x138) [0x12b32f8]
 mongod(_ZN5mongo17invariantOKFailedEPKcRKNS_6StatusES1_j+0xAD) [0x12a01dd]
 mongod(_ZN5mongo17WiredTigerSession9getCursorERKSsmb+0xE4) [0x1095044]
 mongod(_ZN5mongo16WiredTigerCursorC1ERKSsmbPNS_16OperationContextE+0x50) [0x10939c0]
 mongod(_ZNK5mongo21WiredTigerIndexUnique9newCursorEPNS_16OperationContextEb+0x15B) [0x107846b]
 mongod(_ZNK5mongo17IndexAccessMethod10findSingleEPNS_16OperationContextERKNS_7BSONObjE+0x25) [0xcb2415]
 mongod(_ZN5mongo11IDHackStage4workEPm+0x11A) [0xbf9b6a]
 mongod(_ZN5mongo11UpdateStage4workEPm+0x394) [0xc2e9b4]
 mongod(_ZN5mongo12PlanExecutor11getNextImplEPNS_11SnapshottedINS_7BSONObjEEEPNS_8RecordIdE+0x275) [0xe2af85]
 mongod(_ZN5mongo12PlanExecutor7getNextEPNS_7BSONObjEPNS_8RecordIdE+0x39) [0xe2b649]
 mongod(_ZN5mongo12PlanExecutor11executePlanEv+0x55) [0xe2b745]
 mongod(_ZN5mongo18WriteBatchExecutor10execUpdateERKNS_12BatchItemRefEPNS_7BSONObjEPPNS_16WriteErrorDetailE+0x6F9) [0xba7ab9]
 mongod(_ZN5mongo18WriteBatchExecutor11bulkExecuteERKNS_21BatchedCommandRequestEPSt6vectorIPNS_19BatchedUpsertDetailESaIS6_EEPS4_IPNS_16WriteErrorDetailESaISB_EE+0x2B5) [0xba9be5]
 mongod(_ZN5mongo18WriteBatchExecutor12executeBatchERKNS_21BatchedCommandRequestEPNS_22BatchedCommandResponseE+0x1DD) [0xba9fad]
 mongod(_ZN5mongo8WriteCmd3runEPNS_16OperationContextERKSsRNS_7BSONObjEiRSsRNS_14BSONObjBuilderE+0x248) [0xbad0d8]
 mongod(_ZN5mongo7Command3runEPNS_16OperationContextERKNS_3rpc16RequestInterfaceEPNS3_21ReplyBuilderInterfaceE+0x463) [0xbc5a23]
 mongod(_ZN5mongo7Command11execCommandEPNS_16OperationContextEPS0_RKNS_3rpc16RequestInterfaceEPNS4_21ReplyBuilderInterfaceE+0x404) [0xbc6894]
 mongod(_ZN5mongo11runCommandsEPNS_16OperationContextERKNS_3rpc16RequestInterfaceEPNS2_21ReplyBuilderInterfaceE+0x1F0) [0xb228e0]
 mongod(+0x8D1C05) [0xcd1c05]
 mongod(_ZN5mongo16assembleResponseEPNS_16OperationContextERNS_7MessageERNS_10DbResponseERKNS_11HostAndPortE+0x696) [0xcd4496]
 mongod(_ZN5mongo16MyMessageHandler7processERNS_7MessageEPNS_21AbstractMessagingPortE+0xEC) [0x9b74ac]
 mongod(_ZN5mongo17PortMessageServer17handleIncomingMsgEPv+0x26D) [0x12c29dd]
 libpthread.so.0(+0x8182) [0x7f45ac25c182]
 libc.so.6(clone+0x6D) [0x7f45abf8947d]
-----  END BACKTRACE  -----
2016-04-19T16:16:01.886+0200 I -        [conn6294] 
 
***aborting after invariant() failure
 
 
2016-04-19T16:36:11.852+0200 I CONTROL  [main] ***** SERVER RESTARTED *****
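
For readers matching the raw address list against the JSON record: each address printed before BEGIN BACKTRACE appears to be the frame's load base ("b") plus its offset ("o"), both in hexadecimal. The sketch below only illustrates that arithmetic on the first two frames from the log; the helper name `frame_address` is hypothetical and not part of any MongoDB tool.

```python
def frame_address(frame: dict) -> str:
    """Return the absolute address of a backtrace frame as a hex string."""
    # "b" is the module's load base, "o" the offset inside it (both hex strings).
    return hex(int(frame["b"], 16) + int(frame["o"], 16))


# First two frames copied from the backtrace JSON in the log.
frames = [
    {"b": "400000", "o": "F15022", "s": "_ZN5mongo15printStackTraceERSo"},
    {"b": "400000", "o": "EB32F8", "s": "_ZN5mongo10logContextEPKc"},
]
addresses = [frame_address(f) for f in frames]
print(addresses)
```

The results correspond to the first two entries of the address list (0x1315022 and 0x12b32f8).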



 Comments   
Comment by Ramon Fernandez Marina [ 22/Apr/16 ]

Ange7, WiredTiger needs at least two files per collection (one for the collection data and one for the _id index), plus one file per additional index in a collection. If your total count of collections and indexes is large you'll need to adjust your open files limit accordingly.
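
As a rough illustration of the rule above (one file for the collection data plus one file per index, the _id index included), the following sketch estimates a lower bound on the files WiredTiger keeps for a deployment. The function name and the example collections are hypothetical. The real requirement is likely higher, since the mongod process also needs descriptors for client connections and its own internal files.

```python
from typing import Dict


def min_wiredtiger_files(index_counts: Dict[str, int]) -> int:
    """Lower bound on data files for the given collections.

    index_counts maps a collection name to its total number of indexes,
    including the mandatory _id index (so every value is at least 1).
    """
    total = 0
    for name, n_indexes in index_counts.items():
        if n_indexes < 1:
            raise ValueError(f"{name}: a collection always has the _id index")
        total += 1 + n_indexes  # one data file + one file per index
    return total


# Hypothetical deployment: three collections with 1, 2 and 4 indexes.
example = {"users": 1, "orders": 2, "events": 4}
needed = min_wiredtiger_files(example)
print(needed)
```

Comparing this estimate, plus the expected number of connections, against the open files limit gives a first sanity check on whether the limit is plausible.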

I see it's now set to 500000 above, which should be sufficient for most deployments. Please make sure that you set a high value for this limit at the system level. I'm going to close this ticket, but if after increasing this limit the problem persists please let us know so we can investigate further.

Thanks,
Ramón.

Comment by Alexandre [X] [ 20/Apr/16 ]

Hi Thomas,

As root (I think it's the user that runs MongoDB):

ulimit -a
core file size (blocks, -c) unlimited
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 1030501
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 500000
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 1030501
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited

Comment by Kelsey Schubert [ 20/Apr/16 ]

Hi Ange7,

Can you please provide output of ulimit -a as the user that runs MongoDB?

Thank you,
Thomas
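
The limits reported by `ulimit -a` in an interactive shell may differ from the limits applied to a service started by the init system, so it can help to inspect the running process directly. On Linux, `/proc/<pid>/limits` lists the limits in effect for a given process. The sketch below parses the "Max open files" row from that file's text; the function name is hypothetical and the sample values are illustrative, not taken from this ticket.

```python
from typing import Optional, Tuple


def parse_open_files_limit(limits_text: str) -> Optional[Tuple[str, str]]:
    """Extract (soft, hard) 'Max open files' values from /proc/<pid>/limits text."""
    label = "Max open files"
    for line in limits_text.splitlines():
        if line.startswith(label):
            # Remaining columns are: soft limit, hard limit, units.
            fields = line[len(label):].split()
            return fields[0], fields[1]
    return None


# Sample text in the layout Linux uses for /proc/<pid>/limits.
sample = (
    "Limit                     Soft Limit           Hard Limit           Units\n"
    "Max cpu time              unlimited            unlimited            seconds\n"
    "Max open files            1024                 4096                 files\n"
)
limits = parse_open_files_limit(sample)
print(limits)
```

To check a live server, pass the contents of `/proc/<pid>/limits` for the mongod process ID. The values are kept as strings because a limit can also read `unlimited`.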

Comment by Alexandre [X] [ 20/Apr/16 ]

Hey Ramon.

I'm using MongoDB 3.2.5 on Ubuntu Server 14.04 LTS.

I uploaded my log file.

It's a repeating incident. I don't know why my mongo server crashes sometimes (2-3 times a day), but I think I found the cause:

« 2016-04-19T19:23:10.166+0200 I - [conn97] Invariant failure: ret resulted in status UnknownError: 24: Too many open files at src/mongo/db/storage/wiredtiger/wiredtiger_session_cache.cpp 74 »

Thank you
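
A small sketch for locating this kind of failure in a mongod log: it filters lines that contain the invariant message quoted above (error 24 is EMFILE, "Too many open files", on Linux). The function name and the pattern are assumptions made for illustration; the exact message format may vary between MongoDB versions.

```python
import re

# Matches the invariant message quoted in this comment.
PATTERN = re.compile(r"Invariant failure: .*?24: Too many open files")


def find_fd_exhaustion(log_lines):
    """Return the log lines that report the 'Too many open files' invariant failure."""
    return [line for line in log_lines if PATTERN.search(line)]


# Two lines taken from this ticket: the failure and the later restart notice.
sample_log = [
    "2016-04-19T19:23:10.166+0200 I - [conn97] Invariant failure: ret resulted in "
    "status UnknownError: 24: Too many open files at "
    "src/mongo/db/storage/wiredtiger/wiredtiger_session_cache.cpp 74",
    "2016-04-19T16:36:11.852+0200 I CONTROL  [main] ***** SERVER RESTARTED *****",
]
hits = find_fd_exhaustion(sample_log)
print(len(hits))
```

Run over a full log file (for example, `find_fd_exhaustion(open(path))` with `path` pointing at the log), this shows how often and when descriptor exhaustion triggered the abort.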

Comment by Ramon Fernandez Marina [ 19/Apr/16 ]

Ange7, can you please upload the full logs from the last restart until you get the invariant failure above? Also, what version of MongoDB is this happening on? Is this an isolated incident or is it repeating?

Thanks,
Ramón.

Generated at Thu Feb 08 04:04:31 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.