Type: Bug
Resolution: Incomplete
Priority: Major - P3
Affects Version/s: 3.6.13
Component/s: Index Maintenance, Replication, Sharding, Stability
Operating System: ALL
We created an index in the foreground on our cluster. The build blocked not only the database on which the index was being built, but also all other databases, for the entire duration of the build.
Here are the related log entries from the primary during the index build:
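For reference, the index was created roughly as follows (a sketch in shell syntax; the actual call came through the Java driver, per the log below, so the shell form is an assumption). On 3.6, a foreground build takes an exclusive database lock for its whole duration, whereas `background: true` avoids that at the cost of a slower build:

```javascript
// Foreground build (what we ran): holds an exclusive lock on the
// price_alert database for the entire 412-second build.
db.priceAlert.createIndex({ priceAlertId: 1 }, { name: "priceAlertId" });

// Background build: on 3.6 this avoids holding the exclusive
// database lock for the whole build.
db.priceAlert.createIndex(
    { priceAlertId: 1 },
    { name: "priceAlertId", background: true }
);
```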
2020-01-30T17:09:25.975+0100 I INDEX [conn26758520] build index on: price_alert.priceAlert properties: { v: 2, key: { priceAlertId: 1 }, name: "priceAlertId", ns: "price_alert.priceAlert" }
2020-01-30T17:09:25.975+0100 I INDEX [conn26758520] building index using bulk method; build may temporarily use up to 500 megabytes of RAM
2020-01-30T17:09:28.004+0100 I - [conn26758520] Index Build: 74000/22140565 0%
...
2020-01-30T17:16:06.002+0100 I - [conn26758520] Index Build: 21622800/22140565 97%
2020-01-30T17:16:18.046+0100 I INDEX [conn26758520] build index done. scanned 22140565 total records. 412 secs
2020-01-30T17:16:18.048+0100 I COMMAND [conn26758520] command price_alert.$cmd command: createIndexes { createIndexes: "priceAlert", indexes: [ { key: { priceAlertId: 1 }, name: "priceAlertId", ns: "price_alert.priceAlert" } ], shardVersion: [ Timestamp(0, 0), ObjectId('000000000000000000000000') ], lsid: { id: UUID("1e225ae9-0a74-4a71-99ca-dcca84b62fb5"), uid: BinData(0, CA28DAE5564B626F92E2D21DD43742531AE0FED5604FFFEB8AB31BB5F101B7EC) }, $clusterTime: { clusterTime: Timestamp(1580400565, 19066), signature: { hash: BinData(0, 738B9B6A8BEB897ADDAB8B5D9E04DF77626A574E), keyId: 6729445335938105350 } }, $client: { driver: { name: "mongo-java-driver", version: "3.8.2" }, os: { type: "Linux", name: "Linux", architecture: "amd64", version: "3.16.0-10-amd64" }, platform: "Java/Oracle Corporation/1.8.0_171-8u171-b11-1~bpo8+1-b11", mongos: { host: "mongo-hotel-01:27017", client: "10.142.145.194:44050", version: "3.6.13" } }, $configServerState: { opTime: { ts: Timestamp(1580400565, 18247), t: 14 } }, $db: "price_alert" } numYields:0 reslen:364 locks:{ Global: { acquireCount: { r: 3, w: 3 } }, Database: { acquireCount: { w: 3, W: 1 } }, Collection: { acquireCount: { w: 2 } }, oplog: { acquireCount: { w: 1 } } } protocol:op_msg 412115ms
As the log shows, the index build ran from 17:09:25 to 17:16:18 (412 seconds).
Our log-file analysis with mtools (see attached) shows that everything was blocked during this time period. Many requests were queued and executed only after the index build had finished.
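For context, the attached plot was produced with mtools; a command along these lines reproduces it (the log file name is an assumption):

```shell
# Scatter plot of operation durations from the primary's log.
# The queued requests show up as a cluster of long-running
# operations that all complete right after the index build ends.
mplotqueries mongod-primary.log --type scatter
```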
It might be a coincidence, but one of the six mongos routers even crashed during the index build:
2020-01-30T17:14:21.324+0100 I - [listener] pthread_create failed: Resource temporarily unavailable 2020-01-30T17:14:21.324+0100 W EXECUTOR [conn332411] Terminating session due to error: InternalError: failed to create service entry worker thread 2020-01-30T17:14:21.329+0100 F - [LogicalSessionCacheRefresh] std::exception::what(): Resource temporarily unavailable Actual exception type: std::system_error 0x5581643011b1 0x558164300b95 0x5581643f5276 0x5581643f52c1 0x558163ac8c6f 0x558163ac94c8 0x558163abd042 0x558163ab1f8a 0x558163ab6d22 0x558163ab7562 0x5581637a8994 0x5581637a8ca3 0x558163bbed54 0x558163bc0238 0x55816378e02c 0x5581644104d0 0x7fe04e1c14a4 0x7fe04df03d0f ----- BEGIN BACKTRACE ----- {"backtrace":[{"b":"55816320E000","o":"10F31B1","s":"_ZN5mongo15printStackTraceERSo"},{"b":"55816320E000","o":"10F2B95"},{"b":"55816320E000","o":"11E7276","s":"_ZN10__cxxabiv111__terminateEPFvvE"},{"b":"55816320E000","o":"11E72C1"},{"b":"55816320E000","o":"8BAC6F","s":"_ZN5mongo10ThreadPool25_startWorkerThread_inlockEv"},{"b":"55816320E000","o":"8BB4C8","s":"_ZN5mongo10ThreadPool8scheduleESt8functionIFvvEE"},{"b":"55816320E000","o":"8AF042","s":"_ZN5mongo30ConfigServerCatalogCacheLoader14getChunksSinceERKNS_15NamespaceStringENS_12ChunkVersionESt8functionIFvPNS_16OperationContextENS_10StatusWithINS_18CatalogCacheLoader26CollectionAndChangedChunksEEEEE"},{"b":"55816320E000","o":"8A3F8A","s":"_ZN5mongo12CatalogCache26_scheduleCollectionRefreshENS_8WithLockESt10shared_ptrINS0_17DatabaseInfoEntryEES2_INS_12ChunkManagerEERKNS_15NamespaceStringEi"},{"b":"55816320E000","o":"8A8D22","s":"_ZN5mongo12CatalogCache24getCollectionRoutingInfoEPNS_16OperationContextERKNS_15NamespaceStringE"},{"b":"55816320E000","o":"8A9562","s":"_ZN5mongo12CatalogCache42getShardedCollectionRoutingInfoWithRefreshEPNS_16OperationContextERKNS_15NamespaceStringE"},{"b":"55816320E000","o":"59A994","s":"_ZN5mongo25SessionsCollectionSharded32_checkCacheForSessionsCollectionEPNS_16OperationContextE"},{"b":"55816320
E000","o":"59ACA3","s":"_ZN5mongo25SessionsCollectionSharded23setupSessionsCollectionEPNS_16OperationContextE"},{"b":"55816320E000","o":"9B0D54","s":"_ZN5mongo23LogicalSessionCacheImpl8_refreshEPNS_6ClientE"},{"b":"55816320E000","o":"9B2238","s":"_ZN5mongo23LogicalSessionCacheImpl16_periodicRefreshEPNS_6ClientE"},{"b":"55816320E000","o":"58002C"},{"b":"55816320E000","o":"12024D0"},{"b":"7FE04E1BA000","o":"74A4"},{"b":"7FE04DE1B000","o":"E8D0F","s":"clone"}],"processInfo":{ "mongodbVersion" : "3.6.13", "gitVersion" : "db3c76679b7a3d9b443a0e1b3e45ed02b88c539f", "compiledModules" : [], "uname" : { "sysname" : "Linux", "release" : "4.9.0-11-amd64", "version" : "#1 SMP Debian 4.9.189-3+deb9u1 (2019-09-20)", "machine" : "x86_64" }, "somap" : [ { "b" : "55816320E000", "elfType" : 3, "buildId" : "072ABC16ACA2D355506A39C14C0655EBD3C10559" }, { "b" : "7FFEDF5D4000", "path" : "linux-vdso.so.1", "elfType" : 3, "buildId" : "E4710F184535B7465A293C8D84196F236A12D2CF" }, { "b" : "7FE04F404000", "path" : "/lib/x86_64-linux-gnu/libresolv.so.2", "elfType" : 3, "buildId" : "EAD5FD817712E63C1212D1EE7D7EE1B9C29F93A7" }, { "b" : "7FE04EF6A000", "path" : "/usr/lib/x86_64-linux-gnu/libcrypto.so.1.1", "elfType" : 3, "buildId" : "BC8265C9A7EB28F1A9677B7437E4561F383C44D9" }, { "b" : "7FE04ECFE000", "path" : "/usr/lib/x86_64-linux-gnu/libssl.so.1.1", "elfType" : 3, "buildId" : "4EAE4DD6DA20F6CECBEA6688C8ED8E9987CD800F" }, { "b" : "7FE04EAFA000", "path" : "/lib/x86_64-linux-gnu/libdl.so.2", "elfType" : 3, "buildId" : "DB2CAEEEC37482A98AB1416D0A9AFE2944930DE9" }, { "b" : "7FE04E8F2000", "path" : "/lib/x86_64-linux-gnu/librt.so.1", "elfType" : 3, "buildId" : "86B35D63FACD97D22973E99EE9863F7714C4F53A" }, { "b" : "7FE04E5EE000", "path" : "/lib/x86_64-linux-gnu/libm.so.6", "elfType" : 3, "buildId" : "4E49714C557CE0472C798F39365CA10F9C0E1933" }, { "b" : "7FE04E3D7000", "path" : "/lib/x86_64-linux-gnu/libgcc_s.so.1", "elfType" : 3, "buildId" : "51AD5FD294CD6C813BED40717347A53434B80B7A" }, { "b" : 
"7FE04E1BA000", "path" : "/lib/x86_64-linux-gnu/libpthread.so.0", "elfType" : 3, "buildId" : "16D609487BCC4ACBAC29A4EAA2DDA0D2F56211EC" }, { "b" : "7FE04DE1B000", "path" : "/lib/x86_64-linux-gnu/libc.so.6", "elfType" : 3, "buildId" : "775143E680FF0CD4CD51CCE1CE8CA216E635A1D6" }, { "b" : "7FE04F61B000", "path" : "/lib64/ld-linux-x86-64.so.2", "elfType" : 3, "buildId" : "606DF9C355103E82140D513BC7A25A635591C153" }, { "b" : "7FE04DC09000", "path" : "/lib/x86_64-linux-gnu/libnss_files.so.2", "elfType" : 3, "buildId" : "8ADB8C8CDDE0E4C3C3BABB8874AE69DCA15A666A" }, { "b" : "7FE04DA03000", "path" : "/lib/x86_64-linux-gnu/libnss_dns.so.2", "elfType" : 3, "buildId" : "B65AA56F4CECC441FCE7F85F0B4AA0F2815E4E34" } ] }} mongos(_ZN5mongo15printStackTraceERSo+0x41) [0x5581643011b1] mongos(+0x10F2B95) [0x558164300b95] mongos(_ZN10__cxxabiv111__terminateEPFvvE+0x6) [0x5581643f5276] mongos(+0x11E72C1) [0x5581643f52c1] mongos(_ZN5mongo10ThreadPool25_startWorkerThread_inlockEv+0x99F) [0x558163ac8c6f] mongos(_ZN5mongo10ThreadPool8scheduleESt8functionIFvvEE+0x398) [0x558163ac94c8] mongos(_ZN5mongo30ConfigServerCatalogCacheLoader14getChunksSinceERKNS_15NamespaceStringENS_12ChunkVersionESt8functionIFvPNS_16OperationContextENS_10StatusWithINS_18CatalogCacheLoader26CollectionAndChangedChunksEEEEE+0x222) [0x558163abd042] mongos(_ZN5mongo12CatalogCache26_scheduleCollectionRefreshENS_8WithLockESt10shared_ptrINS0_17DatabaseInfoEntryEES2_INS_12ChunkManagerEERKNS_15NamespaceStringEi+0xB2A) [0x558163ab1f8a] mongos(_ZN5mongo12CatalogCache24getCollectionRoutingInfoEPNS_16OperationContextERKNS_15NamespaceStringE+0x952) [0x558163ab6d22] mongos(_ZN5mongo12CatalogCache42getShardedCollectionRoutingInfoWithRefreshEPNS_16OperationContextERKNS_15NamespaceStringE+0x52) [0x558163ab7562] mongos(_ZN5mongo25SessionsCollectionSharded32_checkCacheForSessionsCollectionEPNS_16OperationContextE+0x64) [0x5581637a8994] mongos(_ZN5mongo25SessionsCollectionSharded23setupSessionsCollectionEPNS_16OperationContextE+0x33) 
[0x5581637a8ca3] mongos(_ZN5mongo23LogicalSessionCacheImpl8_refreshEPNS_6ClientE+0x164) [0x558163bbed54] mongos(_ZN5mongo23LogicalSessionCacheImpl16_periodicRefreshEPNS_6ClientE+0x28) [0x558163bc0238] mongos(+0x58002C) [0x55816378e02c] mongos(+0x12024D0) [0x5581644104d0] libpthread.so.0(+0x74A4) [0x7fe04e1c14a4] libc.so.6(clone+0x3F) [0x7fe04df03d0f] ----- END BACKTRACE -----
The other five routers survived.
I can provide the mongod and mongos logs if needed.
Related to: SERVER-47032 Long running exclusive locks can end up blocking all operations if all tickets are exhausted (Closed)