[SERVER-37280] Foreground index rebuild crashes database with Out of memory Created: 24/Sep/18  Updated: 06/Dec/22  Resolved: 25/Oct/18

Status: Closed
Project: Core Server
Component/s: Index Maintenance
Affects Version/s: 3.4.16
Fix Version/s: None

Type: Question Priority: Critical - P2
Reporter: Tanveer Madan Marate Assignee: Backlog - Triage Team
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Assigned Teams:
Server Triage
Participants:

 Description   

Hi All,

We are running MongoDB Community 3.4.16 and trying to build an index in the foreground on a collection of 2.8 billion documents. We are facing the following issues:

 

  1. We are unable to run multiple foreground index builds on the same collection. The documentation states that simultaneous foreground index builds can run on a collection, but we see that only one index build runs and it blocks the others. Can you please advise whether it is possible to run simultaneous foreground index builds?
  2. The documentation says that a foreground index build freezes the database on which the index is being built, but we see that commands like "show dbs" also hang.
  3. We are using the parameter maxIndexBuildMemoryUsageMegabytes to speed up the foreground index build. We have set the WiredTiger cache size to 425GB using cacheSizeGB on a 512GB machine. After setting maxIndexBuildMemoryUsageMegabytes to 307200 (300GB) and running only one foreground index build (as noted above, only one foreground index build runs at a time), the database crashes with an out of memory error when the index build reaches "Index Build Index Build 99%" (msg from db.currentOp for the opid of the index build). Can you please advise on the correct usage of this parameter to meet the requirements of the index build?

          

Mon Sep 24 01:09:46.456 F -        [conn121] out of memory.

 0x564d0c7b3061 0x564d0c7b2694 0x564d0c975e9b 0x564d0d2b39e2 0x564d0bea3955 0x564d0bea39f2 0x564d0beaac2a 0x564d0beab2bd 0x564d0be9a5ca 0x564d0bc4b532 0x564d0bc4bae5 0x564d0bc98830 0x564d0bc9d7e5 0x564d0bc9ead1 0x564d0c2bce10 0x564d0bebf2a8 0x564d0babe32d 0x564d0babec5d 0x564d0c718741 0x7f28949f5de5 0x7f289471fbad
----- BEGIN BACKTRACE -----

{"backtrace":[{"b":"564D0B231000","o":"1582061","s":"_ZN5mongo15printStackTraceERSo"},{"b":"564D0B231000","o":"1581694","s":"_ZN5mongo29reportOutOfMemoryErrorAndExitEv"},{"b":"564D0B231000","o":"1744E9B"},{"b":"564D0B231000","o":"20829E2","s":"tc_newarray_nothrow"},{"b":"564D0B231000","o":"C72955","s":"_ZSt20get_temporary_bufferISt4pairIN5mongo7BSONObjENS1_8RecordIdEEES0_IPT_lEl"},{"b":"564D0B231000","o":"C729F2","s":"_ZNSt17_Temporary_bufferISt15_Deque_iteratorISt4pairIN5mongo7BSONObjENS2_8RecordIdEERS5_PS5_ES5_EC1ES8_S8_"},{"b":"564D0B231000","o":"C79C2A","s":"_ZN5mongo6sorter13NoLimitSorterINS_7BSONObjENS_8RecordIdENS_27BtreeExternalSortComparisonEE4sortEv"},{"b":"564D0B231000","o":"C7A2BD","s":"_ZN5mongo6sorter13NoLimitSorterINS_7BSONObjENS_8RecordIdENS_27BtreeExternalSortComparisonEE4doneEv"},{"b":"564D0B231000","o":"C695CA","s":"_ZN5mongo17IndexAccessMethod10commitBulkEPNS_16OperationContextESt10unique_ptrINS0_11BulkBuilderESt14default_deleteIS4_EEbbPSt3setINS_8RecordIdESt4lessIS9_ESaIS9_EE"},{"b":"564D0B231000","o":"A1A532","s":"_ZN5mongo15MultiIndexBlock13doneInsertingEPSt3setINS_8RecordIdESt4lessIS2_ESaIS2_EE"},{"b":"564D0B231000","o":"A1AAE5","s":"_ZN5mongo15MultiIndexBlock30insertAllDocumentsInCollectionEPSt3setINS_8RecordIdESt4lessIS2_ESaIS2_EE"},{"b":"564D0B231000","o":"A67830","s":"_ZN5mongo14CmdCreateIndex3runEPNS_16OperationContextERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEERNS_7BSONObjEiRS8_RNS_14BSONObjBuilderE"},{"b":"564D0B231000","o":"A6C7E5","s":"_ZN5mongo7Command3runEPNS_16OperationContextERKNS_3rpc16RequestInterfaceEPNS3_21ReplyBuilderInterfaceE"},{"b":"564D0B231000","o":"A6DAD1","s":"_ZN5mongo7Command11execCommandEPNS_16OperationContextEPS0_RKNS_3rpc16RequestInterfaceEPNS4_21ReplyBuilderInterfaceE"},{"b":"564D0B231000","o":"108BE10","s":"_ZN5mongo11runCommandsEPNS_16OperationContextERKNS_3rpc16RequestInterfaceEPNS2_21ReplyBuilderInterfaceE"},{"b":"564D0B231000","o":"C8E2A8","s":"_ZN5mongo16assembleResponseEPNS_16OperationContextERNS_7MessageERNS_10DbResponseERKNS_11HostAndPortE"},{"b":"564D0B231000","o":"88D32D","s":"_ZN5mongo23ServiceEntryPointMongod12_sessionLoopERKSt10shared_ptrINS_9transport7SessionEE"},{"b":"564D0B231000","o":"88DC5D"},{"b":"564D0B231000","o":"14E7741"},{"b":"7F28949EE000","o":"7DE5"},{"b":"7F2894621000","o":"FEBAD","s":"clone"}],"processInfo":{ "mongodbVersion" : "3.4.16", "gitVersion" : "0d6a9242c11b99ddadcfb6e86a850b6ba487530a", "compiledModules" : [], "uname" :

{ "sysname" : "Linux", "release" : "4.14.67-66.56.amzn1.x86_64", "version" : "#1 SMP Tue Sep 4 22:03:21 UTC 2018", "machine" : "x86_64" }

, "somap" : [ { "b" : "564D0B231000", "elfType" : 3, "buildId" : "695F63264AD30445D5D0132213800E674C5AD3A6" }, { "b" : "7FFF43755000", "elfType" : 3, "buildId" : "5AD47F0C2A5C2B6F33C2712F9E30AE3B725C7C6F" }, { "b" : "7F289598C000", "path" : "/lib64/libssl.so.10", "elfType" : 3, "buildId" : "9C4EB34A346260F2A77746F4E5ED837619137DB7" }, { "b" : "7F289552E000", "path" : "/lib64/libcrypto.so.10", "elfType" : 3, "buildId" : "0B7F7487280FE68AF9302A282FAE37776A99BC80" }, { "b" : "7F2895326000", "path" : "/lib64/librt.so.1", "elfType" : 3, "buildId" : "F2701E2A24459D5B55DF5549D585F091E7BCF07A" }, { "b" : "7F2895122000", "path" : "/lib64/libdl.so.2", "elfType" : 3, "buildId" : "0E5CD5BAA5EE8BF3648A5031B088F9A78C89364F" }, { "b" : "7F2894E20000", "path" : "/lib64/libm.so.6", "elfType" : 3, "buildId" : "07FB92AFEF1756F093371CE60C3AE85DD3A06325" }, { "b" : "7F2894C0A000", "path" : "/lib64/libgcc_s.so.1", "elfType" : 3, "buildId" : "A03C9A80E995ED5F43077AB754A258FA0E34C3CD" }, { "b" : "7F28949EE000", "path" : "/lib64/libpthread.so.0", "elfType" : 3, "buildId" : "D973C39D1900DC61D8519C653C3BC405692DE563" }, { "b" : "7F2894621000", "path" : "/lib64/libc.so.6", "elfType" : 3, "buildId" : "AF310F56618FC1EF9158973484F60942F11CC0FB" }, { "b" : "7F2895BFD000", "path" : "/lib64/ld-linux-x86-64.so.2", "elfType" : 3, "buildId" : "8402047FD4A85B3CD1142346EA06BCD6E15A82A3" }, { "b" : "7F28943D4000", "path" : "/usr/lib64/libgssapi_krb5.so.2", "elfType" : 3, "buildId" : "B4C91D3D76D4819DDD1F7D2360F600AC13280628" }, { "b" : "7F28940EC000", "path" : "/usr/lib64/libkrb5.so.3", "elfType" : 3, "buildId" : "295B8EDE0A3D565218B9086DECC047FB28F235E5" }, { "b" : "7F2893EE9000", "path" : "/usr/lib64/libcom_err.so.2", "elfType" : 3, "buildId" : "5C01209C5AE1B1714F19B07EB58F2A1274B69DC8" }, { "b" : "7F2893CB6000", "path" : "/usr/lib64/libk5crypto.so.3", "elfType" : 3, "buildId" : "68572D6C73CAEE5B0F36E50873B092275DE5697C" }, { "b" : "7F2893AA0000", "path" : "/lib64/libz.so.1", "elfType" : 3, "buildId" : 
"89C6AF118B6B4FB6A73AE1813E2C8BDD722956D1" }, { "b" : "7F2893892000", "path" : "/usr/lib64/libkrb5support.so.0", "elfType" : 3, "buildId" : "2B9C3C8ED0AC9CE675617BDB4174D5EACF1FB0FF" }, { "b" : "7F289368F000", "path" : "/lib64/libkeyutils.so.1", "elfType" : 3, "buildId" : "37A58210FA50C91E09387765408A92909468D25B" }, { "b" : "7F2893476000", "path" : "/lib64/libresolv.so.2", "elfType" : 3, "buildId" : "9E5E0BF5F22DE7555BC4B9853240817147489258" }, { "b" : "7F2893255000", "path" : "/usr/lib64/libselinux.so.1", "elfType" : 3, "buildId" : "F5054DC94443326819FBF3065CFDF5E4726F57EE" } ] }}
 mongod(_ZN5mongo15printStackTraceERSo+0x41) [0x564d0c7b3061]
 mongod(_ZN5mongo29reportOutOfMemoryErrorAndExitEv+0x84) [0x564d0c7b2694]
 mongod(+0x1744E9B) [0x564d0c975e9b]
 mongod(tc_newarray_nothrow+0x222) [0x564d0d2b39e2]
 mongod(_ZSt20get_temporary_bufferISt4pairIN5mongo7BSONObjENS1_8RecordIdEEES0_IPT_lEl+0x45) [0x564d0bea3955]
 mongod(_ZNSt17_Temporary_bufferISt15_Deque_iteratorISt4pairIN5mongo7BSONObjENS2_8RecordIdEERS5_PS5_ES5_EC1ES8_S8_+0x72) [0x564d0bea39f2]
 mongod(_ZN5mongo6sorter13NoLimitSorterINS_7BSONObjENS_8RecordIdENS_27BtreeExternalSortComparisonEE4sortEv+0xBA) [0x564d0beaac2a]
 mongod(_ZN5mongo6sorter13NoLimitSorterINS_7BSONObjENS_8RecordIdENS_27BtreeExternalSortComparisonEE4doneEv+0x4D) [0x564d0beab2bd]
 mongod(_ZN5mongo17IndexAccessMethod10commitBulkEPNS_16OperationContextESt10unique_ptrINS0_11BulkBuilderESt14default_deleteIS4_EEbbPSt3setINS_8RecordIdESt4lessIS9_ESaIS9_EE+0x7A) [0x564d0be9a5ca]
 mongod(_ZN5mongo15MultiIndexBlock13doneInsertingEPSt3setINS_8RecordIdESt4lessIS2_ESaIS2_EE+0x1C2) [0x564d0bc4b532]
 mongod(_ZN5mongo15MultiIndexBlock30insertAllDocumentsInCollectionEPSt3setINS_8RecordIdESt4lessIS2_ESaIS2_EE+0x465) [0x564d0bc4bae5]
 mongod(_ZN5mongo14CmdCreateIndex3runEPNS_16OperationContextERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEERNS_7BSONObjEiRS8_RNS_14BSONObjBuilderE+0x1100) [0x564d0bc98830]
 mongod(_ZN5mongo7Command3runEPNS_16OperationContextERKNS_3rpc16RequestInterfaceEPNS3_21ReplyBuilderInterfaceE+0x935) [0x564d0bc9d7e5]
 mongod(_ZN5mongo7Command11execCommandEPNS_16OperationContextEPS0_RKNS_3rpc16RequestInterfaceEPNS4_21ReplyBuilderInterfaceE+0xF81) [0x564d0bc9ead1]
 mongod(_ZN5mongo11runCommandsEPNS_16OperationContextERKNS_3rpc16RequestInterfaceEPNS2_21ReplyBuilderInterfaceE+0x240) [0x564d0c2bce10]
 mongod(_ZN5mongo16assembleResponseEPNS_16OperationContextERNS_7MessageERNS_10DbResponseERKNS_11HostAndPortE+0xD38) [0x564d0bebf2a8]
 mongod(_ZN5mongo23ServiceEntryPointMongod12_sessionLoopERKSt10shared_ptrINS_9transport7SessionEE+0x1FD) [0x564d0babe32d]
 mongod(+0x88DC5D) [0x564d0babec5d]
 mongod(+0x14E7741) [0x564d0c718741]
 libpthread.so.0(+0x7DE5) [0x7f28949f5de5]
 libc.so.6(clone+0x6D) [0x7f289471fbad]
-----  END BACKTRACE  -----

 

 

         

         



 Comments   
Comment by Nick Brewer [ 24/Sep/18 ]

tanveermadan@gmail.com Your settings for both the WiredTiger cacheSizeGB and the maxIndexBuildMemoryUsageMegabytes server parameter appear to be much too high. You're potentially using most of your memory for the WiredTiger cache, yet index builds are being instructed to use up to 300GB of memory, and that much is most likely not available.
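
As a rough illustration of why these two settings overcommit the machine, the sketch below adds up the figures reported in this ticket. It assumes, for illustration only, that the WiredTiger cache and a single index build could each grow to their configured limits at the same time, and it ignores every other consumer of memory (connections, the operating system, allocator overhead). The function name is hypothetical and not part of any MongoDB API.

```python
# Figures taken from the ticket description.
TOTAL_RAM_GB = 512             # physical memory on the machine
WT_CACHE_GB = 425              # WiredTiger cacheSizeGB
INDEX_BUILD_LIMIT_MB = 307200  # maxIndexBuildMemoryUsageMegabytes


def worst_case_demand_gb(cache_gb, index_build_mb):
    """Sum of the two configured limits, in GB.

    Assumption: both limits are reached simultaneously by one index build;
    nothing else on the host is counted.
    """
    return cache_gb + index_build_mb / 1024


demand_gb = worst_case_demand_gb(WT_CACHE_GB, INDEX_BUILD_LIMIT_MB)
shortfall_gb = demand_gb - TOTAL_RAM_GB

print(f"configured limits: {demand_gb:.0f} GB, physical RAM: {TOTAL_RAM_GB} GB")
print(f"shortfall if both limits are reached: {shortfall_gb:.0f} GB")
```

Under these assumptions the configured limits add up to 725 GB on a 512 GB machine, so the two settings together can ask for about 213 GB more than physically exists.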

To get a better sense of how resources are being utilized on this machine, we'll need to see:

  • An archive (tar or zip) of the diagnostic.data directory for the mongod
  • The mongod.log file from the time you began the index build, up until the out of memory error

Note that per the documentation, commands such as listDatabases that require a read or write lock on all databases must wait for a foreground index build to complete.

Thanks,
-Nick

Generated at Thu Feb 08 04:45:31 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.