[SERVER-44575] mongod crashes during initial replication Created: 12/Nov/19  Updated: 29/Oct/23  Resolved: 18/Nov/19

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: 4.2.0
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Dmytro Bogdanov Assignee: Dmitry Agranat
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
related to SERVER-42484 May not be inside required WriteUnitO... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Steps To Reproduce:

Add a new member to the replica set while connected to the primary, configure the replica set in the new member's mongod.conf, then start mongod on the new member.
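
The steps above can be sketched with PyMongo as follows. This is a minimal illustration, not the reporter's actual procedure. The helper names `with_new_member` and `add_member`, the URI, and the host names are assumptions made for the example; `replSetGetConfig` and `replSetReconfig` are the standard MongoDB admin commands.

```python
from copy import deepcopy


def with_new_member(config, host):
    """Return a copy of a replica set config with `host` appended as a member.

    `config` is the document found under "config" in a replSetGetConfig reply.
    """
    new_config = deepcopy(config)
    # Member _id values must be unique within the set, so use max + 1.
    next_id = max((m["_id"] for m in new_config["members"]), default=-1) + 1
    new_config["members"].append({"_id": next_id, "host": host})
    # The reconfig is submitted with a version higher than the current one.
    new_config["version"] = new_config["version"] + 1
    return new_config


def add_member(primary_uri, host):
    """Connect to the primary and reconfigure the set to include `host`."""
    from pymongo import MongoClient  # imported lazily; requires PyMongo

    client = MongoClient(primary_uri)
    current = client.admin.command("replSetGetConfig")["config"]
    client.admin.command("replSetReconfig", with_new_member(current, host))
```

A call would look like `add_member("mongodb://primary.example.net:27017", "newmember.example.net:27017")`, where both host names are hypothetical. The new member's mongod.conf is assumed to carry the same `replication.replSetName` as the existing members before mongod is started.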

Participants:

 Description   

I added a secondary member to a replica set. The databases are quite big, several TB. mongod started replicating the data and crashed within a few minutes. The behavior is consistent, though the crash occurs after a different amount of data has been replicated each time. The relevant log output is below. A Google search for "Invariant failure _inUnitOfWork() ActiveNotInUnitOfWork" returned nothing, so this is either a new or a very specific issue. Both the existing members and the new one run 4.2.0. There is enough available storage space, and the machine has 256 GB of RAM. The OS is Red Hat 7.7.

 

 
2019-11-12T09:08:16.116+0000 F - [repl-writer-worker-4] Invariant failure _inUnitOfWork() ActiveNotInUnitOfWork src/mongo/db/storage/wiredtiger/wiredtiger_recovery_unit.cpp 318
 2019-11-12T09:08:16.116+0000 F - [repl-writer-worker-4]
 
***aborting after invariant() failure
 
2019-11-12T09:08:16.141+0000 F - [repl-writer-worker-4] Got signal: 6 (Aborted).
 0x563c8215fc81 0x563c8215f47e 0x563c8215f516 0x2b95755b0630 0x2b95757f3377 0x2b95757f4a68 0x563c80699aa3 0x563c80785859 0x563c80775001 0x563c8077eeeb 0x563c80f52e83 0x563c80f57216 0x563c80ef0fda 0x563c81989a01 0x563c8198b1c5 0x563c8051df67 0x563c8097e193 0x563c8097e81f 0x563c80a99428 0x563c80a998dc 0x563c80a97ece 0x563c80a9d1e9 0x563c80a9de03 0x563c80a9e43c 0x563c80fd1807 0x563c80fd23c0 0x563c80fd3e05 0x563c82285bbf 0x2b95755a8ea5 0x2b95758bb8cd
 ----- BEGIN BACKTRACE -----
 
{"backtrace":[{"b":"563C7F9E0000","o":"277FC81","s":"_ZN5mongo15printStackTraceERSo"},{"b":"563C7F9E0000","o":"277F47E"},{"b":"563C7F9E0000","o":"277F516"},{"b":"2B95755A1000","o":"F630"},{"b":"2B95757BD000","o":"36377","s":"gsignal"},{"b":"2B95757BD000","o":"37A68","s":"abort"},{"b":"563C7F9E0000","o":"CB9AA3","s":"_ZN5mongo17invariantOKFailedEPKcRKNS_6StatusES1_j"},{"b":"563C7F9E0000","o":"DA5859","s":"_ZN5mongo22WiredTigerRecoveryUnit14registerChangeEPNS_12RecoveryUnit6ChangeE"},{"b":"563C7F9E0000","o":"D95001","s":"_ZN5mongo21WiredTigerRecordStore17_increaseDataSizeEPNS_16OperationContextEl"},{"b":"563C7F9E0000","o":"D9EEEB","s":"_ZN5mongo21WiredTigerRecordStore12updateRecordEPNS_16OperationContextERKNS_8RecordIdEPKci"},{"b":"563C7F9E0000","o":"1572E83","s":"_ZN5mongo18DurableCatalogImpl11putMetaDataEPNS_16OperationContextERKNS_15NamespaceStringERNS_26BSONCollectionCatalogEntry8MetaDataE"},{"b":"563C7F9E0000","o":"1577216","s":"_ZN5mongo18DurableCatalogImpl18setIndexIsMultikeyEPNS_16OperationContextENS_15NamespaceStringENS_10StringDataERKSt6vectorISt3setImSt4lessImESaImEESaISA_EE"},{"b":"563C7F9E0000","o":"1510FDA","s":"_ZN5mongo21IndexCatalogEntryImpl11setMultikeyEPNS_16OperationContextERKSt6vectorISt3setImSt4lessImESaImEESaIS8_EE"},{"b":"563C7F9E0000","o":"1FA9A01","s":"_ZN5mongo25AbstractIndexAccessMethod10insertKeysEPNS_16OperationContextERKSt6vectorINS_7BSONObjESaIS4_EES8_RKS3_ISt3setImSt4lessImESaImEESaISD_EERKNS_8RecordIdERKNS_19InsertDeleteOptionsEPNS_12InsertResultE"},{"b":"563C7F9E0000","o":"1FAB1C5","s":"_ZN5mongo25AbstractIndexAccessMethod6insertEPNS_16OperationContextERKNS_7BSONObjERKNS_8RecordIdERKNS_19InsertDeleteOptionsEPNS_12InsertResultE"},{"b":"563C7F9E0000","o":"B3DF67"},{"b":"563C7F9E0000","o":"F9E193","s":"_ZN5mongo4repl24CollectionBulkLoaderImpl25_addDocumentToIndexBlocksERKNS_7BSONObjERKNS_8RecordIdE"},{"b":"563C7F9E0000","o":"F9E81F","s":"_ZN5mongo4repl24CollectionBulkLoaderImpl15insertDocumentsEN9__gnu_cxx17__normal_iteratorIPKNS_7BSONObjESt6vectorIS4_SaIS4_EEEESA_"},{"b":"563C7F9E0000","o":"10B9428","s":"_ZN5mongo4repl16CollectionCloner24_insertDocumentsCallbackERKNS_8executor12TaskExecutor12CallbackArgsESt10shared_ptrINS0_23CallbackCompletionGuardINS_6StatusEEEE"},{"b":"563C7F9E0000","o":"10B98DC"},{"b":"563C7F9E0000","o":"10B7ECE"},{"b":"563C7F9E0000","o":"10BD1E9"},{"b":"563C7F9E0000","o":"10BDE03","s":"_ZN5mongo4repl10TaskRunner9_runTasksEv"},{"b":"563C7F9E0000","o":"10BE43C"},{"b":"563C7F9E0000","o":"15F1807","s":"_ZN5mongo10ThreadPool10_doOneTaskEPSt11unique_lockISt5mutexE"},{"b":"563C7F9E0000","o":"15F23C0","s":"_ZN5mongo10ThreadPool13_consumeTasksEv"},{"b":"563C7F9E0000","o":"15F3E05","s":"_ZN5mongo10ThreadPool17_workerThreadBodyEPS0_RKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE"},{"b":"563C7F9E0000","o":"28A5BBF"},{"b":"2B95755A1000","o":"7EA5"},{"b":"2B95757BD000","o":"FE8CD","s":"clone"}],"processInfo":{ "mongodbVersion" : "4.2.0", "gitVersion" : "a4b751dcf51dd249c5865812b390cfd1c0129c30", "compiledModules" : [], "uname" : { "sysname" : "Linux", "release" : "3.10.0-1062.1.2.el7.x86_64", "version" : "#1 SMP Mon Sep 16 14:19:51 EDT 2019", "machine" : "x86_64" }, "somap" : [ { "b" : "563C7F9E0000", "elfType" : 3, "buildId" : "E8D75D13E92279CB6AF8104353A95729FD262FAB" }, { "b" : "7FFCFF652000", "elfType" : 3, "buildId" : "086694BE6E21A0A8272FB65847371DAF85F23422" }, { "b" : "2B9574125000", "path" : "/lib64/libcurl.so.4", "elfType" : 3, "buildId" : "FEA34DECD17FC0AE4A0A1962AB2C8F72BC6D1B02" }, { "b" : "2B957438F000", "path" : "/lib64/libresolv.so.2", "elfType" : 3, "buildId" : "6AE7534DD2B3C41A984BA43D85A2B4FBA378FB98" }, { "b" : "2B95745A8000", "path" : "/lib64/libcrypto.so.10", "elfType" : 3, "buildId" : "4A7C42F51D767226113C14433CF47B7FE2034FC5" }, { "b" : "2B9574A0B000", "path" : "/lib64/libssl.so.10", "elfType" : 3, "buildId" : "F3223C4C9DD8824C897E96D992DB8BA55C5A755C" }, { "b" : "2B9574C7D000", "path" : "/lib64/libdl.so.2", "elfType" : 3, "buildId" : "B16FC9C912150101DEA3E14E5FCFD9E9F71E5A45" }, { "b" : "2B9574E81000", "path" : "/lib64/librt.so.1", "elfType" : 3, "buildId" : "85B4DD66FF1213DC51BF382E9D162E7C2C9B73B6" }, { "b" : "2B9575089000", "path" : "/lib64/libm.so.6", "elfType" : 3, "buildId" : "D1F552CF3C05D7E7394EDDA2F1F377A553D593F8" }, { "b" : "2B957538B000", "path" : "/lib64/libgcc_s.so.1", "elfType" : 3, "buildId" : "DAC0179F4555AEFEC9E97476201802FD20C03EC5" }, { "b" : "2B95755A1000", "path" : "/lib64/libpthread.so.0", "elfType" : 3, "buildId" : "FEBFED597867C1FC05ACE1B5FC7DB5AC93364C2E" }, { "b" : "2B95757BD000", "path" : "/lib64/libc.so.6", "elfType" : 3, "buildId" : "D6B09772D17878A32EDD32EA18751209EE9BE5A7" }, { "b" : "2B9573F01000", "path" : "/lib64/ld-linux-x86-64.so.2", "elfType" : 3, "buildId" : "90493AE08BD5E200887971C5DB8D18E97B68A878" }, { "b" : "2B9575B8B000", "path" : "/lib64/libidn.so.11", "elfType" : 3, "buildId" : "F4123103FB2318594448C44E47091DD68D1C78C0" }, { "b" : "2B9575DBE000", "path" : "/lib64/libssh2.so.1", "elfType" : 3, "buildId" : "B7DB4DA04CC1A8BD87EEB7449B9670FFBAD0946D" }, { "b" : "2B9575FEB000", "path" : "/lib64/libssl3.so", "elfType" : 3, "buildId" : "B6321C434B5C7386B144B925CEE2798D269FDDF5" }, { "b" : "2B9576244000", "path" : "/lib64/libsmime3.so", "elfType" : 3, "buildId" : "BDA454441F59F41D2DA36E13CEA1FC4CE95B2BBB" }, { "b" : "2B957646C000", "path" : "/lib64/libnss3.so", "elfType" : 3, "buildId" : "D61EB90C9F32CA6E81E7FAC437F2C496438C8D9E" }, { "b" : "2B957679B000", "path" : "/lib64/libnssutil3.so", "elfType" : 3, "buildId" : "1E366A2153AD7488EE72E989D9AD6BD458BE8EDE" }, { "b" : "2B95769CB000", "path" : "/lib64/libplds4.so", "elfType" : 3, "buildId" : "5ACAE2CD55FAC5DBD4929039B549079EBF3F1CA1" }, { "b" : "2B9576BCF000", "path" : "/lib64/libplc4.so", "elfType" : 3, "buildId" : "6B934070E9C265F392CE65B12594E362172D1E09" }, { "b" : "2B9576DD4000", "path" : "/lib64/libnspr4.so", "elfType" : 3, "buildId" : "1DF824DEEBA1F1597FB3166D8A38A390801E0357" }, { "b" : "2B9577012000", "path" : "/lib64/libgssapi_krb5.so.2", "elfType" : 3, "buildId" : "BCC30853830CD911E58700591830DF51ABCBD7BA" }, { "b" : "2B957725F000", "path" : "/lib64/libkrb5.so.3", "elfType" : 3, "buildId" : "B64919F53B93FF41BFCCF022042E454012B2CD20" }, { "b" : "2B9577548000", "path" : "/lib64/libk5crypto.so.3", "elfType" : 3, "buildId" : "A9B3906192687CC45D483AE3C58C8AF745A6726A" }, { "b" : "2B957777B000", "path" : "/lib64/libcom_err.so.2", "elfType" : 3, "buildId" : "E4C7298B74FEEADC4DDE40CDD8C4D6B85FE09ADE" }, { "b" : "2B957797F000", "path" : "/lib64/liblber-2.4.so.2", "elfType" : 3, "buildId" : "8832509D0687D79342E29FC6FEC587EA85C04CF4" }, { "b" : "2B9577B8E000", "path" : "/lib64/libldap-2.4.so.2", "elfType" : 3, "buildId" : "FC68D1DA42FB89A81E025368BCA66E5CD1AF82B6" }, { "b" : "2B9577DE3000", "path" : "/lib64/libz.so.1", "elfType" : 3, "buildId" : "B9D5F73428BD6AD68C96986B57BEA3B7CEDB9745" }, { "b" : "2B9577FF9000", "path" : "/lib64/libkrb5support.so.0", "elfType" : 3, "buildId" : "94B3BCB669126166B77CDCE6092679A6AA2004C8" }, { "b" : "2B9578209000", "path" : "/lib64/libkeyutils.so.1", "elfType" : 3, "buildId" : "8CA73C16CFEB9A8B5660015B9223B09F87041CAD" }, { "b" : "2B957840D000", "path" : "/lib64/libsasl2.so.3", "elfType" : 3, "buildId" : "9AF2AD92DADE046C6260DCCF02846BF78ABC658C" }, { "b" : "2B957862A000", "path" : "/lib64/libselinux.so.1", "elfType" : 3, "buildId" : "D2DD4DA3FDE1477D25BFFF80F3A25FDB541A8179" }, { "b" : "2B9578851000", "path" : "/lib64/libcrypt.so.1", "elfType" : 3, "buildId" : "601BAA18FAB15899F0E634E2C782D28A99023A54" }, { "b" : "2B9578A88000", "path" : "/lib64/libpcre.so.1", "elfType" : 3, "buildId" : "F5B144F9F5D9BE451C80211B34DB2CE348E039B6" }, { "b" : "2B9578CEA000", "path" : "/lib64/libfreebl3.so", "elfType" : 3, "buildId" : "197680DAE6538245CB99723E57447C4EF2E98362" } ] }}
 mongod(_ZN5mongo15printStackTraceERSo+0x41) [0x563c8215fc81]
 mongod(+0x277F47E) [0x563c8215f47e]
 mongod(+0x277F516) [0x563c8215f516]
 libpthread.so.0(+0xF630) [0x2b95755b0630]
 libc.so.6(gsignal+0x37) [0x2b95757f3377]
 libc.so.6(abort+0x148) [0x2b95757f4a68]
 mongod(_ZN5mongo17invariantOKFailedEPKcRKNS_6StatusES1_j+0x0) [0x563c80699aa3]
 mongod(_ZN5mongo22WiredTigerRecoveryUnit14registerChangeEPNS_12RecoveryUnit6ChangeE+0xB9) [0x563c80785859]
 mongod(_ZN5mongo21WiredTigerRecordStore17_increaseDataSizeEPNS_16OperationContextEl+0x81) [0x563c80775001]
 mongod(_ZN5mongo21WiredTigerRecordStore12updateRecordEPNS_16OperationContextERKNS_8RecordIdEPKci+0x18B) [0x563c8077eeeb]
 mongod(_ZN5mongo18DurableCatalogImpl11putMetaDataEPNS_16OperationContextERKNS_15NamespaceStringERNS_26BSONCollectionCatalogEntry8MetaDataE+0x4E3) [0x563c80f52e83]
 mongod(_ZN5mongo18DurableCatalogImpl18setIndexIsMultikeyEPNS_16OperationContextENS_15NamespaceStringENS_10StringDataERKSt6vectorISt3setImSt4lessImESaImEESaISA_EE+0x276) [0x563c80f57216]
 mongod(_ZN5mongo21IndexCatalogEntryImpl11setMultikeyEPNS_16OperationContextERKSt6vectorISt3setImSt4lessImESaImEESaIS8_EE+0x2EA) [0x563c80ef0fda]
 mongod(_ZN5mongo25AbstractIndexAccessMethod10insertKeysEPNS_16OperationContextERKSt6vectorINS_7BSONObjESaIS4_EES8_RKS3_ISt3setImSt4lessImESaImEESaISD_EERKNS_8RecordIdERKNS_19InsertDeleteOptionsEPNS_12InsertResultE+0x3A1) [0x563c81989a01]
 mongod(_ZN5mongo25AbstractIndexAccessMethod6insertEPNS_16OperationContextERKNS_7BSONObjERKNS_8RecordIdERKNS_19InsertDeleteOptionsEPNS_12InsertResultE+0x265) [0x563c8198b1c5]
 mongod(+0xB3DF67) [0x563c8051df67]
 mongod(_ZN5mongo4repl24CollectionBulkLoaderImpl25_addDocumentToIndexBlocksERKNS_7BSONObjERKNS_8RecordIdE+0x93) [0x563c8097e193]
 mongod(_ZN5mongo4repl24CollectionBulkLoaderImpl15insertDocumentsEN9__gnu_cxx17__normal_iteratorIPKNS_7BSONObjESt6vectorIS4_SaIS4_EEEESA_+0x55F) [0x563c8097e81f]
 mongod(_ZN5mongo4repl16CollectionCloner24_insertDocumentsCallbackERKNS_8executor12TaskExecutor12CallbackArgsESt10shared_ptrINS0_23CallbackCompletionGuardINS_6StatusEEEE+0x188) [0x563c80a99428]
 mongod(+0x10B98DC) [0x563c80a998dc]
 mongod(+0x10B7ECE) [0x563c80a97ece]
 mongod(+0x10BD1E9) [0x563c80a9d1e9]
 mongod(_ZN5mongo4repl10TaskRunner9_runTasksEv+0xB3) [0x563c80a9de03]
 mongod(+0x10BE43C) [0x563c80a9e43c]
 mongod(_ZN5mongo10ThreadPool10_doOneTaskEPSt11unique_lockISt5mutexE+0xF7) [0x563c80fd1807]
 mongod(_ZN5mongo10ThreadPool13_consumeTasksEv+0xA0) [0x563c80fd23c0]
 mongod(_ZN5mongo10ThreadPool17_workerThreadBodyEPS0_RKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE+0x95) [0x563c80fd3e05]
 mongod(+0x28A5BBF) [0x563c82285bbf]
 libpthread.so.0(+0x7EA5) [0x2b95755a8ea5]
 libc.so.6(clone+0x6D) [0x2b95758bb8cd]
 ----- END BACKTRACE -----
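
For readers cross-checking the trace: each entry in the JSON "backtrace" array pairs a module load base ("b") with an offset ("o"), and their sum is the raw address printed in brackets in the symbolized frame list. The Python sketch below illustrates that arithmetic and splits one symbolized frame line into its parts. The helper names `frame_address` and `parse_frame` and the regular expression are illustrative assumptions, not part of any MongoDB tooling.

```python
import re

# Matches symbolized frames such as
#   mongod(_ZN5mongo15printStackTraceERSo+0x41) [0x563c8215fc81]
# and unsymbolized ones such as
#   mongod(+0x277F47E) [0x563c8215f47e]
FRAME_RE = re.compile(
    r"^\s*([^\s(]+)\((\w*)\+0x([0-9A-Fa-f]+)\)\s+\[0x([0-9A-Fa-f]+)\]"
)


def frame_address(base_hex, offset_hex):
    """Absolute address of a JSON frame: load base "b" plus offset "o"."""
    return int(base_hex, 16) + int(offset_hex, 16)


def parse_frame(line):
    """Split one frame line into (module, symbol, offset, address).

    `symbol` is None when the frame has no symbol name. Returns None for
    lines that are not frames, such as the BEGIN/END markers.
    """
    match = FRAME_RE.match(line)
    if match is None:
        return None
    module, symbol, offset, address = match.groups()
    return module, symbol or None, int(offset, 16), int(address, 16)
```

For the first frame, `frame_address("563C7F9E0000", "277FC81")` equals `0x563c8215fc81`, which is the first address listed after the "Got signal: 6 (Aborted)" line. In frames with a symbol name, the offset in parentheses is relative to the start of that symbol; in frames without one, it is relative to the module's load base.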

 



 Comments   
Comment by Dmitry Agranat [ 18/Nov/19 ]

dimaa6@gmail.com, glad to hear this issue was resolved after upgrading to MongoDB 4.2.1.

Comment by Dmytro Bogdanov [ 12/Nov/19 ]

Dima, thank you! I have upgraded MongoDB on the new replica set member to 4.2.1, and it has stayed alive and kept replicating data for much longer than before. I hope it will keep working until the replication completes.

Thanks,

Dima

Comment by Dmitry Agranat [ 12/Nov/19 ]

Hi dimaa6@gmail.com,

Thank you for the report. This issue appears to be related to SERVER-42484, which is fixed in 4.2.1. Could you try MongoDB 4.2.1 and report back with the results?

Thank you,
Dima
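
As a small aid for the suggestion above, the following Python sketch checks whether a version string, such as the "mongodbVersion" value in the backtrace's processInfo, is at or above 4.2.1. The function names `parse_version` and `has_fix` are hypothetical, pre-release suffixes are simply ignored, and the comparison is only meaningful within the 4.2 series, since this ticket says nothing about other release branches.

```python
def parse_version(version):
    """Turn a dotted version string such as "4.2.1" into a tuple of ints."""
    # Drop any suffix such as "-rc0" before splitting on dots.
    core = version.split("-", 1)[0]
    return tuple(int(part) for part in core.split("."))


def has_fix(version, fixed_in="4.2.1"):
    """True if `version` is at or above the release named in `fixed_in`."""
    return parse_version(version) >= parse_version(fixed_in)
```

With the versions from this ticket, `has_fix("4.2.0")` is false for the crashing build and `has_fix("4.2.1")` is true for the upgraded member.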

Generated at Thu Feb 08 05:06:21 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.