[SERVER-18856] can not allocate memory Created: 07/Jun/15  Updated: 01/Jul/15  Resolved: 01/Jul/15

Status: Closed
Project: Core Server
Component/s: WiredTiger
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: white_li Assignee: Ramon Fernandez Marina
Resolution: Incomplete Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Operating System: ALL
Participants:

 Description   

hello

i am making persistent insertion operations to mongodb using wiredtiger engine, no constraint abount cache_size, using default 50% of system RAM
,abount 62GB

mongodb panic after running for 2 days ,average insertion speed of 20~30 docs per seconds, maybe some docs larger than 1MB, i get the following error information in logfile context:

2015-06-07T06:07:47.183Z I WRITE    [conn94] insert ipattern.dex_string ninserted:1 keyUpdates:0 writeConflicts:0 numYields:0 locks:{ Global: { acquireCount: { w: 2 } }, Database: { acquireCount: { w: 2 } }, Collection: { acquireCount: { w: 1 } }, oplog: { acquireCount: { w: 1 } } } 141ms
2015-06-07T06:07:47.192Z I WRITE    [conn53] insert ipattern.dex_string ninserted:1 keyUpdates:0 writeConflicts:0 numYields:0 locks:{ Global: { acquireCount: { w: 2 } }, Database: { acquireCount: { w: 2 } }, Collection: { acquireCount: { w: 1 } }, oplog: { acquireCount: { w: 1 } } } 190ms
2015-06-07T06:07:47.359Z I WRITE    [conn102] insert ipattern.dex_string ninserted:1 keyUpdates:0 writeConflicts:0 numYields:0 locks:{ Global: { acquireCount: { w: 2 } }, Database: { acquireCount: { w: 2 } }, Collection: { acquireCount: { w: 1 } }, oplog: { acquireCount: { w: 1 } } } 134ms
2015-06-07T06:07:47.795Z E STORAGE  [conn63] WiredTiger (12) [1433657267:795181][23591:0x7fc6db74f700], file:local/collection/6-7285080005515704958.wt, cursor.insert: memory allocation: Cannot allocate memory
2015-06-07T06:07:47.809Z I -        [conn63] Assertion: 17322:write to oplog failed: UnknownError WiredTigerRecordStore::insertRecord 12: Cannot allocate memory
2015-06-07T06:07:47.853Z E STORAGE  [conn104] WiredTiger (12) [1433657267:853125][23591:0x7fc6d8e26700], file:ipattern/index/3--2932365438447056541.wt, cursor.insert: memory allocation: Cannot allocate memory
2015-06-07T06:07:47.858Z E STORAGE  [conn51] WiredTiger (12) [1433657267:858968][23591:0x7fc6dc35b700], file:ipattern/collection/2--2932365438447056541.wt, cursor.insert: memory allocation: Cannot allocate memory
2015-06-07T06:07:47.859Z E STORAGE  WiredTiger (12) [1433657267:859851][23591:0x7fc6fd4dc700], file:local/collection/6-7285080005515704958.wt, session.checkpoint: memory allocation: Cannot allocate memory
2015-06-07T06:07:47.859Z E STORAGE  WiredTiger (12) [1433657267:859912][23591:0x7fc6fd4dc700], file:local/collection/6-7285080005515704958.wt, session.checkpoint: session unable to allocate a scratch buffer: Cannot allocate memory
2015-06-07T06:07:47.860Z E STORAGE  WiredTiger (12) [1433657267:860094][23591:0x7fc6fd4dc700], checkpoint-server: checkpoint server error: Cannot allocate memory
2015-06-07T06:07:47.860Z E STORAGE  WiredTiger (-31804) [1433657267:860116][23591:0x7fc6fd4dc700], checkpoint-server: the process must exit and restart: WT_PANIC: WiredTiger library panic
2015-06-07T06:07:47.860Z I -        Fatal Assertion 28558
2015-06-07T06:07:47.874Z I -        [conn102] Fatal Assertion 28559
2015-06-07T06:07:47.878Z I -        [conn51] Fatal Assertion 28559
2015-06-07T06:07:47.878Z I -        [conn104] Fatal Assertion 28559
2015-06-07T06:07:47.879Z I -        [conn60] Fatal Assertion 28559
2015-06-07T06:07:47.882Z I -        [conn61] Fatal Assertion 28559
2015-06-07T06:07:47.885Z I -        [conn110] Fatal Assertion 28559
2015-06-07T06:07:47.899Z I -        [WiredTigerRecordStoreThread for local.oplog.rs] Fatal Assertion 28559
2015-06-07T06:07:47.903Z I -        [conn77] Fatal Assertion 28559
2015-06-07T06:07:47.906Z I -        [conn78] Fatal Assertion 28559
2015-06-07T06:07:47.908Z I CONTROL  
 0xf6a889 0xf08321 0xeec161 0xd88caa 0x139bc59 0x139be15 0x139c2b4 0x1331663 0x7fc709e60846 0x7fc7089a23cd
----- BEGIN BACKTRACE -----
{"backtrace":[{"b":"400000","o":"B6A889"},{"b":"400000","o":"B08321"},{"b":"400000","o":"AEC161"},{"b":"400000","o":"988CAA"},{"b":"400000","o":"F9BC59"},{"b":"400000","o":"F9BE15"},{"b":"400000","o":"F9C2B4"},{"b":"400000","o":"F31663"},{"b":"7FC709E59000","o":"7846"},{"b":"7FC7088BB000","o":"E73CD"}],"processInfo":{ "mongodbVersion" : "3.0.3", "gitVersion" : "b40106b36eecd1b4407eb1ad1af6bc60593c6105", "uname" : { "sysname" : "Linux", "release" : "2.6.32-431.el6.x86_64", "version" : "#1 SMP Fri Nov 22 03:15:09 UTC 2013", "machine" : "x86_64" }, "somap" : [ { "elfType" : 2, "b" : "400000", "buildId" : "108A63CA14A4BD5E599BAC10885DBD3A85DA5439" }, { "b" : "7FFFFBE26000", "elfType" : 3, "buildId" : "81A81BE2E44C93640ADEDB62ADC93A47F4A09DD1" }, { "b" : "7FC709E59000", "path" : "/lib64/libpthread.so.0", "elfType" : 3, "buildId" : "4E32EC9F935CEA5A74F7AFF8CDB6A92B823634EC" }, { "b" : "7FC709BED000", "path" : "/usr/lib64/libssl.so.10", "elfType" : 3, "buildId" : "DAF114120DA5C9DBEB1E5A704CE83ACB9B8B7B54" }, { "b" : "7FC70980A000", "path" : "/usr/lib64/libcrypto.so.10", "elfType" : 3, "buildId" : "F523EAC46D068A8E0869CF93BCD84B414937993A" }, { "b" : "7FC709602000", "path" : "/lib64/librt.so.1", "elfType" : 3, "buildId" : "FA33B7ED6CC07555C335C237AD2048AE7E756B3F" }, { "b" : "7FC7093FE000", "path" : "/lib64/libdl.so.2", "elfType" : 3, "buildId" : "55DB1A52AF46A030846823B5BC73D4CD7E4AFB62" }, { "b" : "7FC7090F8000", "path" : "/usr/lib64/libstdc++.so.6", "elfType" : 3, "buildId" : "F07F2E7CF4BFB393CC9BBE8CDC6463652E14DB07" }, { "b" : "7FC708E73000", "path" : "/lib64/libm.so.6", "elfType" : 3, "buildId" : "CB0450E602989E3A4436577E2146F3A849E619C9" }, { "b" : "7FC708C5D000", "path" : "/lib64/libgcc_s.so.1", "elfType" : 3, "buildId" : "246C3BAB0AB093AFD59D34C8CBF29E786DE4BE97" }, { "b" : "7FC7088BB000", "path" : "/lib64/libc.so.6", "elfType" : 3, "buildId" : "CC5049764632DF2247ED7AB10F80D04673772834" }, { "b" : "7FC70A075000", "path" : "/lib64/ld-linux-x86-64.so.2", "elfType" : 3, "buildId" : "D5B026B9F73EEFDCBE957C6F6610CD37AB91F180" }, { "b" : "7FC708677000", "path" : "/lib64/libgssapi_krb5.so.2", "elfType" : 3, "buildId" : "B7F7FF323B3A4A12310A6285412F01ACE8C74E47" }, { "b" : "7FC708391000", "path" : "/lib64/libkrb5.so.3", "elfType" : 3, "buildId" : "7920917F74AFAD0B8CB197CABBE472AF39D94C34" }, { "b" : "7FC70818D000", "path" : "/lib64/libcom_err.so.2", "elfType" : 3, "buildId" : "8CE28F280150E62296240E70ECAC64E4A57AB826" }, { "b" : "7FC707F61000", "path" : "/lib64/libk5crypto.so.3", "elfType" : 3, "buildId" : "05733977F4E41652B86070B27A0CFC2C1EA7719D" }, { "b" : "7FC707D4B000", "path" : "/lib64/libz.so.1", "elfType" : 3, "buildId" : "5FA8E5038EC04A774AF72A9BB62DC86E1049C4D6" }, { "b" : "7FC707B40000", "path" : "/lib64/libkrb5support.so.0", "elfType" : 3, "buildId" : "C8D01C2839F6950988CE32B4266A8F89C521ACB0" }, { "b" : "7FC70793D000", "path" : "/lib64/libkeyutils.so.1", "elfType" : 3, "buildId" : "AF374BAFB7F5B139A0B431D3F06D82014AFF3251" }, { "b" : "7FC707724000", "path" : "/lib64/libresolv.so.2", "elfType" : 3, "buildId" : "2F19213416DEA01BFBE2DDB807B5E4F52230B309" }, { "b" : "7FC707505000", "path" : "/lib64/libselinux.so.1", "elfType" : 3, "buildId" : "E6798A06BEE17CF102BBA44FD512FF8B805CEAF1" } ] }}
 mongod(_ZN5mongo15printStackTraceERSo+0x29) [0xf6a889]
 mongod(_ZN5mongo10logContextEPKc+0xE1) [0xf08321]
 mongod(_ZN5mongo13fassertFailedEi+0x61) [0xeec161]
 mongod(+0x988CAA) [0xd88caa]
 mongod(__wt_eventv+0x489) [0x139bc59]
 mongod(__wt_err+0x95) [0x139be15]
 mongod(__wt_panic+0x24) [0x139c2b4]
 mongod(+0xF31663) [0x1331663]
 libpthread.so.0(+0x7846) [0x7fc709e60846]
 libc.so.6(clone+0x6D) [0x7fc7089a23cd]
-----  END BACKTRACE  -----
2015-06-07T06:07:47.909Z I -        
 
***aborting after fassert() failure



 Comments   
Comment by Ramon Fernandez Marina [ 15/Jun/15 ]

swordsmanli, have you been able to test the change to the cache size above?

Thanks,
Ramón.

Comment by Ramon Fernandez Marina [ 08/Jun/15 ]

swordsmanli, you may be running into SERVER-17424. Can you please lower the cache size and see if the issue still persists?

It would also be helpful to monitor OS memory consumption when this happens to get a better understanding of the impact. The ideal scenario would for you to add this system to MMS for monitoring; alternatives include the use of OS tools like vmstat.

Thanks,
Ramón.

Comment by white_li [ 07/Jun/15 ]

configuration
storage:
dbPath: "/data/mongodb/db"
directoryPerDB: true
journal:
enabled: true
engine: "wiredTiger"
wiredTiger:
engineConfig:
journalCompressor: zlib
directoryForIndexes: true
collectionConfig:
blockCompressor: zlib
indexConfig:
prefixCompression: true
systemLog:
logRotate: rename
destination: file
path: "/var/log/mongodb/mongod.log"
logAppend: true
timeStampFormat: iso8601-utc
processManagement:
pidFilePath: "/var/run/mongodb/mongod.pid"
fork: true
net:
port: 27017
unixDomainSocket:
enabled : true
maxIncomingConnections: 10240
http:
enabled: false
RESTInterfaceEnabled: false
JSONPEnabled: false
replication:
replSetName: "RS1"
oplogSizeMB: 10240
security:
authorization: "enabled"
keyFile: "/data/keyfile"
starting log
I STORAGE [initandlisten] wiredtiger_open config: create,cache_size=62G,session_max=20000,eviction=(threads_max=4),statistics=(fast),log=(enabled=true,archive=true,path=journal,compressor=zlib),checkpoint=(wait=60,log_size=2GB),statistics_log=(wait=0),

Generated at Thu Feb 08 03:48:57 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.