[SERVER-19754] mongod aborts on memory allocation failure after upgrade from 2.6.6 to 3.0.5 Created: 04/Aug/15  Updated: 07/Apr/23  Resolved: 25/Aug/15

Status: Closed
Project: Core Server
Component/s: Internal Code, WiredTiger
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Deepak Shivamurthy Assignee: Ramon Fernandez Marina
Resolution: Done Votes: 0
Labels: RF, WTmem
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: PNG File nmonCPU.png    
Operating System: Linux
Participants:

 Description   

Hi Team,

I recently upgraded MongoDB from 2.6.6 to 3.0.5 (and also moved from the default storage engine to WiredTiger). I had not experienced any issues until now, and performance improved a lot.

But today one member of the replica set crashed with the errors below (taken from the log) while doing write operations (which create temporary collections dynamically). I am not sure whether it is an issue with WiredTiger.

2015-08-04T12:45:27.639+0000 F -        [conn1008] Got signal: 6 (Aborted).
 
0xf5ba59 0xf5b322 0xf5b6d6 0x7f1936ca0ff0 0x7f1936ca0f79 0x7f1936ca4388 0xd9d7f9 0x8fa7c2 0x8fb321 0x8d17b7 0x8ef4a7 0x8f0f66 0x9d5f64 0x9d6eed 0x9d7bfb 0xb9c776 0xab2b50 0x80e06d 0xf0e9fb 0x7f193829c182 0x7f1936d6530d
----- BEGIN BACKTRACE -----
{"backtrace":[{"b":"400000","o":"B5BA59"},{"b":"400000","o":"B5B322"},{"b":"400000","o":"B5B6D6"},{"b":"7F1936C6A000","o":"36FF0"},{"b":"7F1936C6A000","o":"36F79"},{"b":"7F1936C6A000","o":"3A388"},{"b":"400000","o":"99D7F9"},{"b":"400000","o":"4FA7C2"},{"b":"400000","o":"4FB321"},{"b":"400000","o":"4D17B7"},{"b":"400000","o":"4EF4A7"},{"b":"400000","o":"4F0F66"},{"b":"400000","o":"5D5F64"},{"b":"400000","o":"5D6EED"},{"b":"400000","o":"5D7BFB"},{"b":"400000","o":"79C776"},{"b":"400000","o":"6B2B50"},{"b":"400000","o":"40E06D"},{"b":"400000","o":"B0E9FB"},{"b":"7F1938294000","o":"8182"},{"b":"7F1936C6A000","o":"FB30D"}],"processInfo":{ "mongodbVersion" : "3.0.5", "gitVersion" : "8bc4ae20708dbb493cb09338d9e7be6698e4a3a3", "uname" : { "sysname" : "Linux", "release" : "3.13.0-29-generic", "version" : "#53-Ubuntu SMP Wed Jun 4 21:00:20 UTC 2014", "machine" : "x86_64" }, "somap" : [ { "elfType" : 2, "b" : "400000", "buildId" : "695FC6828398A9DB1F99718671147885B5ED116D" }, { "b" : "7FFFEC0FE000", "elfType" : 3, "buildId" : "3D068D088E7EAC15D9DA7C3AC912E783C0897EE7" }, { "b" : "7F1938294000", "path" : "/lib/x86_64-linux-gnu/libpthread.so.0", "elfType" : 3, "buildId" : "FE662C4D7B14EE804E0C1902FB55218A106BC5CB" }, { "b" : "7F1938036000", "path" : "/lib/x86_64-linux-gnu/libssl.so.1.0.0", "elfType" : 3, "buildId" : "6C7AE380840DB9034D7763771B55E51B31BCAF14" }, { "b" : "7F1937C5C000", "path" : "/lib/x86_64-linux-gnu/libcrypto.so.1.0.0", "elfType" : 3, "buildId" : "3D522D8E04F5FD7904AE69B50CA8835A71024490" }, { "b" : "7F1937A54000", "path" : "/lib/x86_64-linux-gnu/librt.so.1", "elfType" : 3, "buildId" : "92FCF41EFE012D6186E31A59AD05BDBB487769AB" }, { "b" : "7F1937850000", "path" : "/lib/x86_64-linux-gnu/libdl.so.2", "elfType" : 3, "buildId" : "C1AE4CB7195D337A77A3C689051DABAA3980CA0C" }, { "b" : "7F193754C000", "path" : "/usr/lib/x86_64-linux-gnu/libstdc++.so.6", "elfType" : 3, "buildId" : "19EFDDAB11B3BF5C71570078C59F91CF6592CE9E" }, { "b" : "7F1937246000", "path" : "/lib/x86_64-linux-gnu/libm.so.6", "elfType" : 3, "buildId" : "574C6350381DA194C00FF555E0C1784618C05569" }, { "b" : "7F1937030000", "path" : "/lib/x86_64-linux-gnu/libgcc_s.so.1", "elfType" : 3, "buildId" : "CC0D578C2E0D86237CA7B0CE8913261C506A629A" }, { "b" : "7F1936C6A000", "path" : "/lib/x86_64-linux-gnu/libc.so.6", "elfType" : 3, "buildId" : "B571F83A8A6F5BB22D3558CDDDA9F943A2A67FD1" }, { "b" : "7F19384B2000", "path" : "/lib64/ld-linux-x86-64.so.2", "elfType" : 3, "buildId" : "9F00581AB3C73E3AEA35995A0C50D24D59A01D47" } ] }}
mongod(_ZN5mongo15printStackTraceERSo+0x29) [0xf5ba59]
mongod(+0xB5B322) [0xf5b322]
mongod(+0xB5B6D6) [0xf5b6d6]
libc.so.6(+0x36FF0) [0x7f1936ca0ff0]
libc.so.6(gsignal+0x39) [0x7f1936ca0f79]
libc.so.6(abort+0x148) [0x7f1936ca4388]
mongod(_ZN5mongo12SecureRandom6createEv+0x1B9) [0xd9d7f9]
mongod(_ZN5mongo31SaslSCRAMSHA1ServerConversation10_firstStepERSt6vectorISsSaISsEEPSs+0x16F2) [0x8fa7c2]
mongod(_ZN5mongo31SaslSCRAMSHA1ServerConversation4stepERKNS_10StringDataEPSs+0x2F1) [0x8fb321]
mongod(_ZN5mongo31NativeSaslAuthenticationSession4stepERKNS_10StringDataEPSs+0x27) [0x8d17b7]
mongod(+0x4EF4A7) [0x8ef4a7]
mongod(+0x4F0F66) [0x8f0f66]
mongod(_ZN5mongo12_execCommandEPNS_16OperationContextEPNS_7CommandERKSsRNS_7BSONObjEiRSsRNS_14BSONObjBuilderEb+0x34) [0x9d5f64]
mongod(_ZN5mongo7Command11execCommandEPNS_16OperationContextEPS0_iPKcRNS_7BSONObjERNS_14BSONObjBuilderEb+0xC1D) [0x9d6eed]
mongod(_ZN5mongo12_runCommandsEPNS_16OperationContextEPKcRNS_7BSONObjERNS_11_BufBuilderINS_16TrivialAllocatorEEERNS_14BSONObjBuilderEbi+0x28B) [0x9d7bfb]
mongod(_ZN5mongo8runQueryEPNS_16OperationContextERNS_7MessageERNS_12QueryMessageERKNS_15NamespaceStringERNS_5CurOpES3_+0x746) [0xb9c776]
mongod(_ZN5mongo16assembleResponseEPNS_16OperationContextERNS_7MessageERNS_10DbResponseERKNS_11HostAndPortE+0xB10) [0xab2b50]
mongod(_ZN5mongo16MyMessageHandler7processERNS_7MessageEPNS_21AbstractMessagingPortEPNS_9LastErrorE+0xDD) [0x80e06d]
mongod(_ZN5mongo17PortMessageServer17handleIncomingMsgEPv+0x34B) [0xf0e9fb]
libpthread.so.0(+0x8182) [0x7f193829c182]
libc.so.6(clone+0x6D) [0x7f1936d6530d]
-----  END BACKTRACE  -----



 Comments   
Comment by Ramon Fernandez Marina [ 24/Nov/15 ]

ych.tiger@gmail.com, please take a look at this comment on SERVER-21323, as this error is most likely related to low limits.

If the issue persists after increasing the limit on the number of open files, please open a new ticket.

Thanks,
Ramón.
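
For readers following up on the open-files suggestion above, here is a minimal sketch for checking the limit a running process actually has. It assumes a Linux host where /proc/<pid>/limits is available. The helper name proc_limit is hypothetical, and the example inspects the current shell only as a stand-in for the mongod process.

```shell
#!/bin/sh
# Print one row of /proc/<pid>/limits (Linux only).
# Usage: proc_limit PID "Max open files"
# proc_limit is an illustrative helper, not part of MongoDB or any standard tool.
proc_limit() {
    pid=$1
    name=$2
    grep "^$name" "/proc/$pid/limits"
}

# Example: inspect the current shell. On a real server, replace $$ with the
# mongod PID (for instance the output of `pgrep -x mongod`).
proc_limit "$$" "Max open files"
```

The row shows the soft and hard limits in that order. The value reported by `ulimit -n` in an interactive shell can differ from what the mongod process was started with, which is why the sketch reads the per-process file instead.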

Comment by YANG Chenghu [ 24/Nov/15 ]

3.0.6 also crashed

----- BEGIN BACKTRACE -----
{"backtrace":[{"b":"400000","o":"D9303B"},{"b":"400000","o":"D92612"},{"b":"400000","o":"D92A06"},{"b":"7F218730F000","o":"32920"},{"b":"7F218730F000","o":"328A5"},{"b":"7F218730F000","o":"34085"},{"b":"400000","o":"B7FFA9"},{"b":"400000","o":"5258CE"},{"b":"400000","o":"5265F2"},{"b":"400000","o":"4F1976"},{"b":"400000","o":"51774C"},{"b":"400000","o":"519B23"},{"b":"400000","o":"69421C"},{"b":"400000","o":"69539E"},{"b":"400000","o":"695F80"},{"b":"400000","o":"8EDC8F"},{"b":"400000","o":"7BEF0A"},{"b":"400000","o":"40A56A"},{"b":"400000","o":"D3D0B3"},{"b":"7F2188382000","o":"7851"},{"b":"7F218730F000","o":"E767D"}],"processInfo":{ "mongodbVersion" : "3.0.6", "gitVersion" : "nogitversion", "uname" : { "sysname" : "Linux", "release" : "2.6.32-220.23.2.ali878.el6.x86_64", "version" : "#1 SMP Mon Jan 28 17:12:52 CST 2013", "machine" : "x86_64" }, "somap" : [ { "elfType" : 2, "b" : "400000" }, { "b" : "7FFF3B5EB000", "elfType" : 3, "buildId" : "505D1DB16903CBB2ECAC8DD8137C641A63080C97" }, { "b" : "7F2188BB7000", "path" : "/opt/aegis/lib64/aegis_monitor_connect.so", "elfType" : 3, "buildId" : "EC94FE933E5D536402786CB3880478EA08D86C27" }, { "b" : "7F21889B3000", "path" : "/opt/aegis/lib64/aegis_monitor_exec.so", "elfType" : 3, "buildId" : "827E53DF33C35E7678179FBE0C4CBEB130B53BC8" }, { "b" : "7F21887B1000", "path" : "/opt/aegis/lib64/aegis_monitor_dns.so", "elfType" : 3, "buildId" : "CC376E884DF4CC9161E72387D01326C6F45A5148" }, { "b" : "7F21885AF000", "path" : "/opt/aegis/lib64/aegis_monitor_kill.so", "elfType" : 3, "buildId" : "29370547A889A63241BE47861A62EB88D852153D" }, { "b" : "7F2188382000", "path" : "/lib64/libpthread.so.0", "elfType" : 3, "buildId" : "48A9F8600F0A15F6418EDE25846C324EC8891DD4" }, { "b" : "7F2188116000", "path" : "/usr/lib64/libssl.so.10", "elfType" : 3, "buildId" : "9B484BE2BA6DAE22B9211C59E445D328FA2CA5C1" }, { "b" : "7F2187D33000", "path" : "/usr/lib64/libcrypto.so.10", "elfType" : 3, "buildId" : "58B14478BCA1E4EDBA9EFC82721399DB6DF8434C" }, { 
"b" : "7F2187B2A000", "path" : "/lib64/librt.so.1", "elfType" : 3, "buildId" : "1574F9C2DDAECEE537C45143BB79E8D61BED98FE" }, { "b" : "7F2187926000", "path" : "/lib64/libdl.so.2", "elfType" : 3, "buildId" : "0B4FE52FE93C0B9894775AFDD53E2DF9D3C2839A" }, { "b" : "7F21876A2000", "path" : "/lib64/libm.so.6", "elfType" : 3, "buildId" : "BF14593D7915402AA62C2573FCCDB252AEEBF754" }, { "b" : "7F218730F000", "path" : "/lib64/libc.so.6", "elfType" : 3, "buildId" : "3AC348A69F62BFC2280DA1A8188173961BB2E9BE" }, { "path" : "/lib64/ld-linux-x86-64.so.2", "elfType" : 3, "buildId" : "42AEFAFC23375DC250C49C420C37EDC4515B9C02" }, { "b" : "7F21870CD000", "path" : "/lib64/libgssapi_krb5.so.2", "elfType" : 3, "buildId" : "87F1C23045216178FEA1723D80392CE25253F5E3" }, { "b" : "7F2186DEE000", "path" : "/lib64/libkrb5.so.3", "elfType" : 3, "buildId" : "AC50B0607A56BFDB0F28B2059ADBC6C14E661281" }, { "b" : "7F2186BE9000", "path" : "/lib64/libcom_err.so.2", "elfType" : 3, "buildId" : "4623A78918C882770E81AE7B5EE9DDF8DD2B6674" }, { "b" : "7F21869BD000", "path" : "/lib64/libk5crypto.so.3", "elfType" : 3, "buildId" : "355D765A262E1D19A84D4B2707B8800BDE875AED" }, { "b" : "7F21867A7000", "path" : "/lib64/libz.so.1", "elfType" : 3, "buildId" : "5FA8E5038EC04A774AF72A9BB62DC86E1049C4D6" }, { "b" : "7F218659B000", "path" : "/lib64/libkrb5support.so.0", "elfType" : 3, "buildId" : "DFD32F0808469788E66BF4C97FA12E8EEFEE70F9" }, { "b" : "7F2186398000", "path" : "/lib64/libkeyutils.so.1", "elfType" : 3, "buildId" : "8A8734DC37305D8CC2EF8F8C3E5EA03171DB07EC" }, { "b" : "7F218617E000", "path" : "/lib64/libresolv.so.2", "elfType" : 3, "buildId" : "8217E68E5C9D964CDF500F488B2A183F870F36B2" }, { "b" : "7F2185F5E000", "path" : "/lib64/libselinux.so.1", "elfType" : 3, "buildId" : "C6D22E92109645646945630DE92507C2BB264E8F" } ] }}
 mongod(_ZN5mongo15printStackTraceERSo+0x2B) [0x119303b]
 mongod(+0xD92612) [0x1192612]
 mongod(+0xD92A06) [0x1192a06]
 libc.so.6(+0x32920) [0x7f2187341920]
 libc.so.6(gsignal+0x35) [0x7f21873418a5]
 libc.so.6(abort+0x175) [0x7f2187343085]
 mongod(_ZN5mongo12SecureRandom6createEv+0x1B9) [0xf7ffa9]
 mongod(_ZN5mongo31SaslSCRAMSHA1ServerConversation10_firstStepERSt6vectorISsSaISsEEPSs+0x1A4E) [0x9258ce]
 mongod(_ZN5mongo31SaslSCRAMSHA1ServerConversation4stepERKNS_10StringDataEPSs+0x3A2) [0x9265f2]
 mongod(_ZN5mongo31NativeSaslAuthenticationSession4stepERKNS_10StringDataEPSs+0x26) [0x8f1976]
 mongod(+0x51774C) [0x91774c]
 mongod(+0x519B23) [0x919b23]
 mongod(_ZN5mongo12_execCommandEPNS_16OperationContextEPNS_7CommandERKSsRNS_7BSONObjEiRSsRNS_14BSONObjBuilderEb+0x2C) [0xa9421c]
 mongod(_ZN5mongo7Command11execCommandEPNS_16OperationContextEPS0_iPKcRNS_7BSONObjERNS_14BSONObjBuilderEb+0xEFE) [0xa9539e]
 mongod(_ZN5mongo12_runCommandsEPNS_16OperationContextEPKcRNS_7BSONObjERNS_11_BufBuilderINS_16TrivialAllocatorEEERNS_14BSONObjBuilderEbi+0x2F0) [0xa95f80]
 mongod(_ZN5mongo8runQueryEPNS_16OperationContextERNS_7MessageERNS_12QueryMessageERKNS_15NamespaceStringERNS_5CurOpES3_+0x20DF) [0xcedc8f]
 mongod(_ZN5mongo16assembleResponseEPNS_16OperationContextERNS_7MessageERNS_10DbResponseERKNS_11HostAndPortE+0xADA) [0xbbef0a]
 mongod(_ZN5mongo16MyMessageHandler7processERNS_7MessageEPNS_21AbstractMessagingPortEPNS_9LastErrorE+0xDA) [0x80a56a]
 mongod(_ZN5mongo17PortMessageServer17handleIncomingMsgEPv+0x2F3) [0x113d0b3]
 libpthread.so.0(+0x7851) [0x7f2188389851]
 libc.so.6(clone+0x6D) [0x7f21873f667d]

Comment by Deepak Shivamurthy [ 12/Aug/15 ]

I have upgraded from 3.0.5 to 3.0.6-rc0, and the mongo services are no longer crashing. It seems to be resolved. I will let you know if I face any issues.

Thanks for all your help, Ramon!!

Comment by Ramon Fernandez Marina [ 12/Aug/15 ]

deepak.shivamurth, I forgot to add that a binary replacement should be sufficient, but please take a look at the full documentation on upgrading to the latest revision of MongoDB, as it contains detailed information about sharded clusters, etc.

Thanks,
Ramón.

Comment by Deepak Shivamurthy [ 12/Aug/15 ]

Thanks Ramon, I will have a look at it. I will update you if I find any issues.

Comment by Ramon Fernandez Marina [ 12/Aug/15 ]

deepak.shivamurth, this is to let you know that we've published the 3.0.6-rc0 release candidate containing a fix for SERVER-19673, which we believe is the underlying root cause for this issue.

If you're able to reproduce the problem, would you be able to test with 3.0.6-rc0 and let us know if the problem persists?

Thanks,
Ramón.

Comment by Ramon Fernandez Marina [ 07/Aug/15 ]

deepak.shivamurth, if you're able to reproduce this problem (or if you've seen it more than once and expect it again) it would be very useful if you could collect server statistics that can help us find the root cause of the problem. To collect these statistics you'd need to run the following from a shell:

mongo --eval "while(true) {print(JSON.stringify(db.serverStatus({tcmalloc:1}))); sleep(10000)}" >ss.log &
iostat -k -t -x 10 >iostat.log &

These commands will create two files, iostat.log and ss.log, containing statistics collected at 10 second intervals. This interval will allow you to leave these commands running for a while until the problem reproduces.

If the problem doesn't reproduce while you are collecting this data, it may still be useful for us to see the data: a pattern of increasing memory usage may allow us to pinpoint which part of the system is responsible for the behavior you're seeing.

In addition to the iostat.log and ss.log files we'd need the server logs produced during the statistics collection period. You can upload files to this ticket at any time, or if you prefer you can use this private upload portal.
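
To look for such a pattern in ss.log before uploading it, a rough sketch along the following lines may help. It assumes each sample is the compact JSON.stringify output of serverStatus, in which resident memory (in MB) appears as "resident":<integer>. The helper name resident_series is hypothetical.

```shell
#!/bin/sh
# Print the resident memory figure (MB) from every serverStatus sample in a
# log file, one value per line, in the order the samples were written.
# Assumption: the field appears as "resident":<integer> with no spaces.
resident_series() {
    grep -o '"resident":[0-9]*' "$1" | cut -d: -f2
}

# Only run against ss.log if it exists in the current directory.
if [ -f ss.log ]; then
    resident_series ss.log
fi
```

A steadily growing column of numbers would suggest rising memory use over the collection period. Lines that are not serverStatus samples, such as the shell startup banner, contain no match and are skipped.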

Would it be possible for you to collect data about your server as described above?

Thanks,
Ramón.

Comment by Deepak Shivamurthy [ 05/Aug/15 ]

It's a dedicated MongoDB server and only one mongod instance is running.

Comment by Deepak Shivamurthy [ 05/Aug/15 ]

Hi Ramon,

Please find the vmstat output below, along with the attached screenshot of CPU utilization.

FYI, our mongod services are currently running with the following configuration:

RAM - 32 GB
Cores - 8

procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
1  0      0 4789564 186828 27055368    0    0     1     4    0    1  0  0 100  0  0
0  0      0 4789564 186828 27055368    0    0     0     0  263  497  0  0 100  0  0
0  0      0 4789588 186828 27055368    0    0     0     0  309  546  1  0 99  0  0
0  0      0 4789588 186828 27055368    0    0     0     0  262  503  0  0 100  0  0
0  0      0 4789580 186828 27055368    0    0     0     0  293  539  0  0 100  0  0
0  0      0 4789644 186828 27055368    0    0     0     0  299  533  0  0 100  0  0
0  0      0 4789708 186832 27055364    0    0     0    12  276  517  0  0 100  0  0
0  0      0 4789516 186832 27055368    0    0     0     0  258  491  0  0 100  0  0
0  0      0 4789516 186832 27055368    0    0     0     0  298  520  1  0 99  0  0
0  0      0 4789548 186832 27055368    0    0     0     4  260  505  0  0 100  0  0
0  0      0 4789548 186832 27055368    0    0     0     0  265  508  0  0 100  0  0
0  0      0 4789548 186832 27055368    0    0     0    12  258  503  0  0 100  0  0
1  0      0 4789548 186836 27055364    0    0     0    12  258  494  0  0 100  0  0
3  1      0 4789312 186836 27055368    0    0     0    60  408  693  0  0 99  0  0
0  1      0 4776164 186836 27055484    0    0     0  1260 3600 5481  8  5 77 10  0
0  0      0 4905800 186836 27055528    0    0     0   560  642 1113  1  1 95  4  0
0  0      0 4906008 186836 27055528    0    0     0     0   37   12  0  0 100  0  0
0  0      0 4906008 186836 27055528    0    0     0     0   15   12  0  0 100  0  0
0  0      0 4906420 186840 27055572    0    0     0    12   30   33  0  0 100  0  0
0  0      0 4906564 186840 27055580    0    0     0     0   21   27  0  0 100  0  0
0  0      0 4906628 186840 27055580    0    0     0    12   18   25  0  0 100  0  0
 
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
1  0      0 4906628 186840 27055580    0    0     0     0   12   16  0  0 100  0  0
0  0      0 4906628 186840 27055580    0    0     0     0   21   35  0  0 100  0  0
0  0      0 4906660 186840 27055580    0    0     0     0   17   21  0  0 100  0  0
0  0      0 4906660 186840 27055580    0    0     0   116   21   26  0  0 100  0  0

Regards,
Deepak S

Comment by Ramon Fernandez Marina [ 04/Aug/15 ]

deepak.shivamurth, it looks like mongod ran out of memory. Can you please provide details on the memory configuration and usage on this node? Has the amount of memory available to mongod been limited in some way?

Thanks,
Ramón.
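
A rough sketch of how the memory details requested above could be gathered on Linux, assuming /proc/meminfo is readable. mem_summary is a hypothetical helper name. Note that `ulimit -v` only reports the cap for the shell it runs in, so it is most meaningful when run as the user and from the environment that starts mongod.

```shell
#!/bin/sh
# Print total and free physical memory in MB, read from /proc/meminfo
# (values there are in kB). Linux only; mem_summary is an illustrative name.
mem_summary() {
    awk '/^MemTotal:/ {t=$2} /^MemFree:/ {f=$2}
         END {printf "total_mb=%d free_mb=%d\n", t/1024, f/1024}' /proc/meminfo
}

mem_summary

# Per-process virtual memory cap for this shell, in kB
# ("unlimited" if no cap is set).
ulimit -v
```

MemFree excludes the page cache, so a low figure on its own does not mean the host is short of memory; the vmstat `cache` column gives the complementary view.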
