[SERVER-46271] Crash MongoDB: Invalid access at address Created: 20/Feb/20  Updated: 27/Oct/23  Resolved: 16/Mar/20

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: 4.2.1
Fix Version/s: None

Type: Bug
Priority: Major - P3
Reporter: Mikhael Grefon
Assignee: Ian Boros
Resolution: Gone away
Votes: 0
Labels: qexec-team
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: logs.txt, mongod.log
Operating System: ALL
Sprint: Query 2020-03-23
Participants:

 Description   

MongoDB has crashed and shut down several times.

Logs:

2020-02-20T13:56:02.037+0200 I NETWORK [conn2391] end connection 127.0.0.1:42636 (9 connections now open)
2020-02-20T13:56:02.037+0200 I NETWORK [conn2392] end connection 127.0.0.1:42640 (8 connections now open)
2020-02-20T13:56:10.097+0200 F - [conn4] Invalid access at address: 0
2020-02-20T13:56:10.221+0200 F - [conn4] Got signal: 11 (Segmentation fault).
 0x5565aff8d131 0x5565aff8c92e 0x5565aff8cb0c 0x7f32e72a55f0 0x5565afc8c3d4 0x5565afc49106 0x5565afb6589e 0x5565afb65bf2 0x5565afb68d94 0x5565afb601a1 0x5565af8ecb8b 0x5565af8fbc4d 0x5565af8a4a76 0x5565af8a583c 0x5565af8ecb38 0x5565af8fe5dd 0x5565aee95ff5 0x5565aee95e46 0x5565aee965a8 0x5565aeede190 0x5565aeede91d 0x5565aebe56fc 0x5565aebea2af 0x5565aebdd6e5 0x5565ae90143a 0x5565ae903414 0x5565ae90417a 0x5565ae8f248c 0x5565ae8fe32c 0x5565ae8f9b4f 0x5565ae8fcf2c 0x5565af6e4f42 0x5565ae8f74ad 0x5565ae8fa803 0x5565ae8f8b77 0x5565ae8f9aab 0x5565ae8fcf2c 0x5565af6e53ab 0x5565afd1fb04 0x7f32e729de65 0x7f32e6fc688d
----- BEGIN BACKTRACE -----
{"backtrace":[{"b":"5565AD7EF000","o":"279E131","s":"_ZN5mongo15printStackTraceERSo"},{"b":"5565AD7EF000","o":"279D92E"},{"b":"5565AD7EF000","o":"279DB0C"},{"b":"7F32E7296000","o":"F5F0"},{"b":"5565AD7EF000","o":"249D3D4","s":"_ZN5mongo9Variables8setValueElRKNS_5ValueEb"},{"b":"5565AD7EF000","o":"245A106","s":"_ZNK5mongo16ExpressionReduce8evaluateERKNS_8DocumentEPNS_9VariablesE"},{"b":"5565AD7EF000","o":"237689E","s":"_ZNK5mongo29parsed_aggregation_projection14ProjectionNode16applyExpressionsERKNS_8DocumentEPNS_15MutableDocumentE"},{"b":"5565AD7EF000","o":"2376BF2","s":"_ZNK5mongo29parsed_aggregation_projection14ProjectionNode15applyToDocumentERKNS_8DocumentE"},{"b":"5565AD7EF000","o":"2379D94","s":"_ZNK5mongo29parsed_aggregation_projection25ParsedInclusionProjection15applyProjectionERKNS_8DocumentE"},{"b":"5565AD7EF000","o":"23711A1","s":"_ZN5mongo29parsed_aggregation_projection27ParsedAggregationProjection19applyTransformationERKNS_8DocumentE"},{"b":"5565AD7EF000","o":"20FDB8B","s":"_ZN5mongo42DocumentSourceSingleDocumentTransformation7getNextEv"},{"b":"5565AD7EF000","o":"210CC4D","s":"_ZN5mongo20DocumentSourceUnwind7getNextEv"},{"b":"5565AD7EF000","o":"20B5A76","s":"_ZN5mongo19DocumentSourceGroup10initializeEv"},{"b":"5565AD7EF000","o":"20B683C","s":"_ZN5mongo19DocumentSourceGroup7getNextEv"},{"b":"5565AD7EF000","o":"20FDB38","s":"_ZN5mongo42DocumentSourceSingleDocumentTransformation7getNextEv"},{"b":"5565AD7EF000","o":"210F5DD","s":"_ZN5mongo8Pipeline7getNextEv"},{"b":"5565AD7EF000","o":"16A6FF5","s":"_ZN5mongo18PipelineProxyStage11getNextBsonEv"},{"b":"5565AD7EF000","o":"16A6E46","s":"_ZN5mongo18PipelineProxyStage6doWorkEPm"},{"b":"5565AD7EF000","o":"16A75A8","s":"_ZN5mongo9PlanStage4workEPm"},{"b":"5565AD7EF000","o":"16EF190","s":"_ZN5mongo16PlanExecutorImpl12_getNextImplEPNS_11SnapshottedINS_7BSONObjEEEPNS_8RecordIdE"},{"b":"5565AD7EF000","o":"16EF91D","s":"_ZN5mongo16PlanExecutorImpl7getNextEPNS_7BSONObjEPNS_8RecordIdE"},{"b":"5565AD7EF000","o":"13F66FC"},{"
b":"5565AD7EF000","o":"13FB2AF","s":"_ZN5mongo12runAggregateEPNS_16OperationContextERKNS_15NamespaceStringERKNS_18AggregationRequestERKNS_7BSONObjERKSt6vectorINS_9PrivilegeESaISC_EEPNS_3rpc21ReplyBuilderInterfaceE"},{"b":"5565AD7EF000","o":"13EE6E5"},{"b":"5565AD7EF000","o":"111243A"},{"b":"5565AD7EF000","o":"1114414"},{"b":"5565AD7EF000","o":"111517A","s":"_ZN5mongo23ServiceEntryPointCommon13handleRequestEPNS_16OperationContextERKNS_7MessageERKNS0_5HooksE"},{"b":"5565AD7EF000","o":"110348C","s":"_ZN5mongo23ServiceEntryPointMongod13handleRequestEPNS_16OperationContextERKNS_7MessageE"},{"b":"5565AD7EF000","o":"110F32C","s":"_ZN5mongo19ServiceStateMachine15_processMessageENS0_11ThreadGuardE"},{"b":"5565AD7EF000","o":"110AB4F","s":"_ZN5mongo19ServiceStateMachine15_runNextInGuardENS0_11ThreadGuardE"},{"b":"5565AD7EF000","o":"110DF2C"},{"b":"5565AD7EF000","o":"1EF5F42","s":"_ZN5mongo9transport26ServiceExecutorSynchronous8scheduleESt8functionIFvvEENS0_15ServiceExecutor13ScheduleFlagsENS0_23ServiceExecutorTaskNameE"},{"b":"5565AD7EF000","o":"11084AD","s":"_ZN5mongo19ServiceStateMachine22_scheduleNextWithGuardENS0_11ThreadGuardENS_9transport15ServiceExecutor13ScheduleFlagsENS2_23ServiceExecutorTaskNameENS0_9OwnershipE"},{"b":"5565AD7EF000","o":"110B803","s":"_ZN5mongo19ServiceStateMachine15_sourceCallbackENS_6StatusE"},{"b":"5565AD7EF000","o":"1109B77","s":"_ZN5mongo19ServiceStateMachine14_sourceMessageENS0_11ThreadGuardE"},{"b":"5565AD7EF000","o":"110AAAB","s":"_ZN5mongo19ServiceStateMachine15_runNextInGuardENS0_11ThreadGuardE"},{"b":"5565AD7EF000","o":"110DF2C"},{"b":"5565AD7EF000","o":"1EF63AB"},{"b":"5565AD7EF000","o":"2530B04"},{"b":"7F32E7296000","o":"7E65"},{"b":"7F32E6EC8000","o":"FE88D","s":"clone"}],"processInfo":{ "mongodbVersion" : "4.2.1", "gitVersion" : "edf6d45851c0b9ee15548f0f847df141764a317e", "compiledModules" : [], "uname" : { "sysname" : "Linux", "release" : "3.10.0-1062.1.1.el7.x86_64", "version" : "#1 SMP Fri Sep 13 22:55:44 UTC 2019", "machine" : 
"x86_64" }, "somap" : [ { "b" : "5565AD7EF000", "elfType" : 3, "buildId" : "ADA4E6CE7EF8A4213534845BA659B6F1A4EC41FB" }, { "b" : "7FFD976E0000", "elfType" : 3, "buildId" : "428DF2A55C2C8D97F4794CB3074D9A542CDC5C15" }, { "b" : "7F32E86C4000", "path" : "/lib64/libcurl.so.4", "elfType" : 3, "buildId" : "7C71A471444AD18F73AFAEA3EB42431A6DA96534" }, { "b" : "7F32E84AB000", "path" : "/lib64/libresolv.so.2", "elfType" : 3, "buildId" : "3009B26B33156EAAF99787AA3DA0C6AE99649755" }, { "b" : "7F32E8048000", "path" : "/lib64/libcrypto.so.10", "elfType" : 3, "buildId" : "4CF1939F660008CFA869D8364651F31AACD2C1C4" }, { "b" : "7F32E7DD6000", "path" : "/lib64/libssl.so.10", "elfType" : 3, "buildId" : "3B305C3BA17FE394862E749763F2956C9C890C2E" }, { "b" : "7F32E7BD2000", "path" : "/lib64/libdl.so.2", "elfType" : 3, "buildId" : "18113E6E83D8E981B8E8D808F7F3DBB23F950A1D" }, { "b" : "7F32E79CA000", "path" : "/lib64/librt.so.1", "elfType" : 3, "buildId" : "4749697BF078337576C4629F0D30B296A0939779" }, { "b" : "7F32E76C8000", "path" : "/lib64/libm.so.6", "elfType" : 3, "buildId" : "5681C054FDABCF789F4DDA66E94F1F6ED1747327" }, { "b" : "7F32E74B2000", "path" : "/lib64/libgcc_s.so.1", "elfType" : 3, "buildId" : "DAC0179F4555AEFEC9E97476201802FD20C03EC5" }, { "b" : "7F32E7296000", "path" : "/lib64/libpthread.so.0", "elfType" : 3, "buildId" : "8B33F7F8C86F8D544C63C5541A8E42B3DDFEF8B1" }, { "b" : "7F32E6EC8000", "path" : "/lib64/libc.so.6", "elfType" : 3, "buildId" : "398944D32CF16A67AF51067A326E6C0CC14F90ED" }, { "b" : "7F32E892E000", "path" : "/lib64/ld-linux-x86-64.so.2", "elfType" : 3, "buildId" : "5CC1A53B747A7E4D21198723C2B633E54F3C06D9" }, { "b" : "7F32E6C95000", "path" : "/lib64/libidn.so.11", "elfType" : 3, "buildId" : "2B77BBEFFF65E94F3E0B71A4E89BEB68C4B476C5" }, { "b" : "7F32E6A68000", "path" : "/lib64/libssh2.so.1", "elfType" : 3, "buildId" : "1AF123CADB2F2910E89CBD540A06D3B33692F95E" }, { "b" : "7F32E680F000", "path" : "/lib64/libssl3.so", "elfType" : 3, "buildId" : 
"B6321C434B5C7386B144B925CEE2798D269FDDF5" }, { "b" : "7F32E65E7000", "path" : "/lib64/libsmime3.so", "elfType" : 3, "buildId" : "BDA454441F59F41D2DA36E13CEA1FC4CE95B2BBB" }, { "b" : "7F32E62B8000", "path" : "/lib64/libnss3.so", "elfType" : 3, "buildId" : "D61EB90C9F32CA6E81E7FAC437F2C496438C8D9E" }, { "b" : "7F32E6088000", "path" : "/lib64/libnssutil3.so", "elfType" : 3, "buildId" : "1E366A2153AD7488EE72E989D9AD6BD458BE8EDE" }, { "b" : "7F32E5E84000", "path" : "/lib64/libplds4.so", "elfType" : 3, "buildId" : "325B8CE57A776DE0B24B362A7E0C90E903B1A4B8" }, { "b" : "7F32E5C7F000", "path" : "/lib64/libplc4.so", "elfType" : 3, "buildId" : "0460FF10A3C63749113D380C40E10DFCF066C76E" }, { "b" : "7F32E5A41000", "path" : "/lib64/libnspr4.so", "elfType" : 3, "buildId" : "8840B019EDB66B0CFBD2F77EF196440F7928106E" }, { "b" : "7F32E57F4000", "path" : "/lib64/libgssapi_krb5.so.2", "elfType" : 3, "buildId" : "E2AA8CA3D3164E7DBEC293BFA0B55D2B10DAC05D" }, { "b" : "7F32E550B000", "path" : "/lib64/libkrb5.so.3", "elfType" : 3, "buildId" : "3EE7267AF7BFD3B132E6A222D997DA09C96C90DD" }, { "b" : "7F32E52D8000", "path" : "/lib64/libk5crypto.so.3", "elfType" : 3, "buildId" : "82E28CACB60C27CD6F14A6D2268F0CFF621664D0" }, { "b" : "7F32E50D4000", "path" : "/lib64/libcom_err.so.2", "elfType" : 3, "buildId" : "67E935BFABA2C914C01156B88947DD515EA51170" }, { "b" : "7F32E4EC5000", "path" : "/lib64/liblber-2.4.so.2", "elfType" : 3, "buildId" : "3192C56CD451E18EB9F29CB045432BA9C738DD29" }, { "b" : "7F32E4C70000", "path" : "/lib64/libldap-2.4.so.2", "elfType" : 3, "buildId" : "F1FADDDE0D21D5F4E2DCADEDD3B85B6E7AAC9883" }, { "b" : "7F32E4A5A000", "path" : "/lib64/libz.so.1", "elfType" : 3, "buildId" : "B9D5F73428BD6AD68C96986B57BEA3B7CEDB9745" }, { "b" : "7F32E484A000", "path" : "/lib64/libkrb5support.so.0", "elfType" : 3, "buildId" : "4F5FBB2087BE132892467C4E7A46A3D07E5DA40B" }, { "b" : "7F32E4646000", "path" : "/lib64/libkeyutils.so.1", "elfType" : 3, "buildId" : 
"2E01D5AC08C1280D013AAB96B292AC58BC30A263" }, { "b" : "7F32E4429000", "path" : "/lib64/libsasl2.so.3", "elfType" : 3, "buildId" : "E2F2017F821DD1B9D307DA1A9B8014F2941AEB7B" }, { "b" : "7F32E4202000", "path" : "/lib64/libselinux.so.1", "elfType" : 3, "buildId" : "D2DD4DA3FDE1477D25BFFF80F3A25FDB541A8179" }, { "b" : "7F32E3FCB000", "path" : "/lib64/libcrypt.so.1", "elfType" : 3, "buildId" : "84467C988F41D853C58353BEB247670E15DA8BAD" }, { "b" : "7F32E3D69000", "path" : "/lib64/libpcre.so.1", "elfType" : 3, "buildId" : "9CA3D11F018BEEB719CDB34BE800BF1641350D0A" }, { "b" : "7F32E3B66000", "path" : "/lib64/libfreebl3.so", "elfType" : 3, "buildId" : "197680DAE6538245CB99723E57447C4EF2E98362" } ] }}
 mongod(_ZN5mongo15printStackTraceERSo+0x41) [0x5565aff8d131]
 mongod(+0x279D92E) [0x5565aff8c92e]
 mongod(+0x279DB0C) [0x5565aff8cb0c]
 libpthread.so.0(+0xF5F0) [0x7f32e72a55f0]
 mongod(_ZN5mongo9Variables8setValueElRKNS_5ValueEb+0x1A4) [0x5565afc8c3d4]
 mongod(_ZNK5mongo16ExpressionReduce8evaluateERKNS_8DocumentEPNS_9VariablesE+0x116) [0x5565afc49106]
 mongod(_ZNK5mongo29parsed_aggregation_projection14ProjectionNode16applyExpressionsERKNS_8DocumentEPNS_15MutableDocumentE+0x39E) [0x5565afb6589e]
 mongod(_ZNK5mongo29parsed_aggregation_projection14ProjectionNode15applyToDocumentERKNS_8DocumentE+0x62) [0x5565afb65bf2]
 mongod(_ZNK5mongo29parsed_aggregation_projection25ParsedInclusionProjection15applyProjectionERKNS_8DocumentE+0x24) [0x5565afb68d94]
 mongod(_ZN5mongo29parsed_aggregation_projection27ParsedAggregationProjection19applyTransformationERKNS_8DocumentE+0x21) [0x5565afb601a1]
 mongod(_ZN5mongo42DocumentSourceSingleDocumentTransformation7getNextEv+0x8B) [0x5565af8ecb8b]
 mongod(_ZN5mongo20DocumentSourceUnwind7getNextEv+0xED) [0x5565af8fbc4d]
 mongod(_ZN5mongo19DocumentSourceGroup10initializeEv+0x426) [0x5565af8a4a76]
 mongod(_ZN5mongo19DocumentSourceGroup7getNextEv+0xBC) [0x5565af8a583c]
 mongod(_ZN5mongo42DocumentSourceSingleDocumentTransformation7getNextEv+0x38) [0x5565af8ecb38]
 mongod(_ZN5mongo8Pipeline7getNextEv+0x3D) [0x5565af8fe5dd]
 mongod(_ZN5mongo18PipelineProxyStage11getNextBsonEv+0x35) [0x5565aee95ff5]
 mongod(_ZN5mongo18PipelineProxyStage6doWorkEPm+0x46) [0x5565aee95e46]
 mongod(_ZN5mongo9PlanStage4workEPm+0x68) [0x5565aee965a8]
 mongod(_ZN5mongo16PlanExecutorImpl12_getNextImplEPNS_11SnapshottedINS_7BSONObjEEEPNS_8RecordIdE+0x230) [0x5565aeede190]
 mongod(_ZN5mongo16PlanExecutorImpl7getNextEPNS_7BSONObjEPNS_8RecordIdE+0x4D) [0x5565aeede91d]
 mongod(+0x13F66FC) [0x5565aebe56fc]
 mongod(_ZN5mongo12runAggregateEPNS_16OperationContextERKNS_15NamespaceStringERKNS_18AggregationRequestERKNS_7BSONObjERKSt6vectorINS_9PrivilegeESaISC_EEPNS_3rpc21ReplyBuilderInterfaceE+0x277F) [0x5565aebea2af]
 mongod(+0x13EE6E5) [0x5565aebdd6e5]
 mongod(+0x111243A) [0x5565ae90143a]
 mongod(+0x1114414) [0x5565ae903414]
 mongod(_ZN5mongo23ServiceEntryPointCommon13handleRequestEPNS_16OperationContextERKNS_7MessageERKNS0_5HooksE+0x41A) [0x5565ae90417a]
 mongod(_ZN5mongo23ServiceEntryPointMongod13handleRequestEPNS_16OperationContextERKNS_7MessageE+0x3C) [0x5565ae8f248c]
 mongod(_ZN5mongo19ServiceStateMachine15_processMessageENS0_11ThreadGuardE+0xEC) [0x5565ae8fe32c]
 mongod(_ZN5mongo19ServiceStateMachine15_runNextInGuardENS0_11ThreadGuardE+0x17F) [0x5565ae8f9b4f]
 mongod(+0x110DF2C) [0x5565ae8fcf2c]
 mongod(_ZN5mongo9transport26ServiceExecutorSynchronous8scheduleESt8functionIFvvEENS0_15ServiceExecutor13ScheduleFlagsENS0_23ServiceExecutorTaskNameE+0x182) [0x5565af6e4f42]
 mongod(_ZN5mongo19ServiceStateMachine22_scheduleNextWithGuardENS0_11ThreadGuardENS_9transport15ServiceExecutor13ScheduleFlagsENS2_23ServiceExecutorTaskNameENS0_9OwnershipE+0x10D) [0x5565ae8f74ad]
 mongod(_ZN5mongo19ServiceStateMachine15_sourceCallbackENS_6StatusE+0x843) [0x5565ae8fa803]
 mongod(_ZN5mongo19ServiceStateMachine14_sourceMessageENS0_11ThreadGuardE+0x2E7) [0x5565ae8f8b77]
 mongod(_ZN5mongo19ServiceStateMachine15_runNextInGuardENS0_11ThreadGuardE+0xDB) [0x5565ae8f9aab]
 mongod(+0x110DF2C) [0x5565ae8fcf2c]
 mongod(+0x1EF63AB) [0x5565af6e53ab]
 mongod(+0x2530B04) [0x5565afd1fb04]
 libpthread.so.0(+0x7E65) [0x7f32e729de65]
 libc.so.6(clone+0x6D) [0x7f32e6fc688d]
----- END BACKTRACE -----
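
For readers matching the human-readable frames against the JSON "backtrace" entries: each frame address in brackets appears to be the module's load base ("b") plus the offset ("o"). The following is a minimal sketch of that arithmetic, not MongoDB tooling; the helper names parse_frame and module_offset are hypothetical, and the regular expression only assumes the frame layout shown in this log.

```python
import re

# Matches frame lines as they appear in this log, for example:
#   mongod(_ZN5mongo9Variables8setValueElRKNS_5ValueEb+0x1A4) [0x5565afc8c3d4]
#   mongod(+0x279D92E) [0x5565aff8c92e]
FRAME_RE = re.compile(
    r"^\s*(?P<module>\S+?)\((?P<symbol>[^+)]*)\+0x(?P<rel>[0-9A-Fa-f]+)\)"
    r"\s+\[0x(?P<addr>[0-9A-Fa-f]+)\]\s*$"
)


def parse_frame(line):
    """Split one frame line into module, symbol, relative offset and address."""
    m = FRAME_RE.match(line)
    if m is None:
        return None
    return {
        "module": m.group("module"),
        # Empty symbol means the frame was not symbolized in the log.
        "symbol": m.group("symbol") or None,
        # Offset from the symbol start, or from the module base if no symbol.
        "rel": int(m.group("rel"), 16),
        "addr": int(m.group("addr"), 16),
    }


def module_offset(addr, base):
    """Offset of an absolute address inside a module loaded at `base`."""
    return addr - base


# Load base of mongod, taken from the "b" field of the JSON backtrace.
MONGOD_BASE = 0x5565AD7EF000

frame = parse_frame(
    " mongod(_ZN5mongo9Variables8setValueElRKNS_5ValueEb+0x1A4) [0x5565afc8c3d4]"
)
print(frame["symbol"], hex(module_offset(frame["addr"], MONGOD_BASE)))
```

For the crashing frame, the computed offset equals the "o" value 249D3D4 listed next to _ZN5mongo9Variables8setValueElRKNS_5ValueEb in the JSON block.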
 



 Comments   
Comment by Ian Boros [ 16/Mar/20 ]

I'm sorry for the delay. Unfortunately, there is overhead in sorting through, triaging, and scheduling these types of tickets. If you encounter problems like this again, I encourage you to report them. The more data points we have, the easier it is to figure these problems out.

IB

Comment by Mikhael Grefon [ 15/Mar/20 ]

Can you elaborate on what you mean by "repair didn't run"?

I didn't run "mongod --repair", only "systemctl start mongod.service".

 

Three weeks passed between my opening this ticket and your reply. I have a live project, and it could not wait that long! For several days after opening the ticket I received no advice on how to fix the error, so I began to experiment. After I updated MongoDB to the latest version and set up replication to another server, the problem has not occurred again.

Comment by Ian Boros [ 13/Mar/20 ]

mikhael@grefon.com A few questions:
When you say

"repair" didn't run. MongoDB started, data saved.

Can you elaborate on what you mean by "repair didn't run"? Are you saying mongod started and shutdown cleanly?

I'm also curious exactly what pipelines are running. The stack traces do give a clue, but it'd be good to have the exact query. Can you set the log verbosity of the "command" component to 2? This should just involve changing the mongod conf file to have something like:

systemLog:
   component:
      command:
         verbosity: 2
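
If restarting mongod to pick up the config change is inconvenient, the same verbosity can, as far as I know, also be set at runtime with the setParameter command and its logComponentVerbosity option; a runtime change does not survive a restart. The sketch below only builds the command document. The helper name verbosity_command is hypothetical, and it handles a single top-level component only.

```python
def verbosity_command(component, level):
    """Build a setParameter document that sets one log component's verbosity.

    `component` is a top-level log component name such as "command";
    nested components would need a nested document and are not handled here.
    """
    return {
        # The command name must be the first key; dicts keep insertion order.
        "setParameter": 1,
        "logComponentVerbosity": {component: {"verbosity": level}},
    }


cmd = verbosity_command("command", 2)
print(cmd)
```

The resulting document would be run against the admin database, for example by passing it to client.admin.command(cmd) with a PyMongo MongoClient (assuming PyMongo is installed and the user is allowed to run setParameter).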

The fact that this crash is happening in so many different places makes me a bit suspicious there's an environmental issue. Have you seen this problem in different environments/configurations?

Lastly, I will say that the more data points we have, the easier it will be to figure out what is going on. If you've seen more crashes like this recently, can you post the logs/stack traces?

Comment by Carl Champain (Inactive) [ 24/Feb/20 ]

Hi mikhael@grefon.com,

This issue appears to be a bug. We're passing this ticket along to the appropriate team. Updates will be posted on this ticket as they happen.

Thank you!
Carl

 

Comment by Mikhael Grefon [ 21/Feb/20 ]

2020-02-09T11:21:30 - read checksum error for 49152B block at offset 19829555

After that crash I ran "mongod --repair". MongoDB started, but all data in collection-0-6606871978163036226.wt was lost!

 

2020-02-19T12:00:55 - Invalid access at address: 0 / Got signal: 11 (Segmentation fault).

"repair" didn't run. MongoDB started, data saved.

 

2020-02-19T17:53:33 - Invalid access at address: 0 / Got signal: 11 (Segmentation fault).

"repair" didn't run. MongoDB started, data saved.

 

2020-02-20T13:56:10 - Invalid access at address: 0 / Got signal: 11 (Segmentation fault).

"repair" didn't run. MongoDB started, data saved.

 

I have attached the logs: mongod.log

Why did this happen, and how can I avoid it?

 

Comment by Carl Champain (Inactive) [ 21/Feb/20 ]

Hi mikhael@grefon.com,

Can you please try to run mongod --repair?

In the event that the --repair operation is unsuccessful, please provide:

  • The logs of the repair operation.
  • The logs of any attempt to start mongod after the repair operation completed.

Thank you,
Carl
