[SERVER-25131] CollectionBulkLoaderImpl should release locks on task errors Created: 18/Jul/16  Updated: 21/Dec/16  Resolved: 19/Sep/16

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: None
Fix Version/s: 3.3.14

Type: Bug Priority: Major - P3
Reporter: Judah Schvimer Assignee: Scott Hernandez (Inactive)
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: bypassdocvalidation.tar.gz, setupInitSync.sh, test.js
Issue Links:
Depends
is depended on by SERVER-26179 Do not join the TaskRunner within a r... Closed
Related
related to SERVER-27488 Unblacklist bypass_doc_validation.js ... Closed
is related to SERVER-25969 When featureCompatibilityVersion is 3... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Sprint: Repl 18 (08/05/16), Repl 2016-09-19, Repl 2016-10-10
Participants:

 Description   

When running initial sync after bypass_doc_validation.js, the server fasserts with the following failure:

[ReplicaSetFixture:job9:initsync] 2016-07-18T18:44:51.855+0000 I -        [InitialSyncInserters-0] Invariant failure _requests.empty() src/mongo/db/concurrency/lock_state.cpp 197
[ReplicaSetFixture:job9:initsync] 2016-07-18T18:44:51.855+0000 I -        [InitialSyncInserters-0]
[ReplicaSetFixture:job9:initsync] 
[ReplicaSetFixture:job9:initsync] ***aborting after invariant() failure
[ReplicaSetFixture:job9:initsync] 
[ReplicaSetFixture:job9:initsync] 
[ReplicaSetFixture:job9:initsync] 2016-07-18T18:44:51.863+0000 F -        [InitialSyncInserters-0] Got signal: 6 (Aborted).
[ReplicaSetFixture:job9:initsync] 
[ReplicaSetFixture:job9:initsync]  0x7f746a214711 0x7f746a2134a9 0x7f746a21398d 0x7f74665227e0 0x7f74661b15e5 0x7f74661b2dc5 0x7f746951a4d0 0x7f746980f2c8 0x7f7469a368d4 0x7f7469a36a01 0x7f7469d78965 0x7f746a193095 0x7f746a193ce0 0x7f746a194889 0x7f746ac751a0 0x7f746651aaa1 0x7f7466267aad
[ReplicaSetFixture:job9:initsync] ----- BEGIN BACKTRACE -----
[ReplicaSetFixture:job9:initsync] {"backtrace":[{"b":"7F7468D1C000","o":"14F8711","s":"_ZN5mongo15printStackTraceERSo"},{"b":"7F7468D1C000","o":"14F74A9"},{"b":"7F7468D1C000","o":"14F798D"},{"b":"7F7466513000","o":"F7E0"},{"b":"7F746617F000","o":"325E5","s":"gsignal"},{"b":"7F746617F000","o":"33DC5","s":"abort"},{"b":"7F7468D1C000","o":"7FE4D0","s":"_ZN5mongo17invariantOKFailedEPKcRKNS_6StatusES1_j"},{"b":"7F7468D1C000","o":"AF32C8","s":"_ZN5mongo10LockerImplILb0EE19assertEmptyAndResetEv"},{"b":"7F7468D1C000","o":"D1A8D4","s":"_ZN5mongo20OperationContextImplD1Ev"},{"b":"7F7468D1C000","o":"D1AA01","s":"_ZN5mongo20OperationContextImplD0Ev"},{"b":"7F7468D1C000","o":"105C965","s":"_ZN5mongo4repl10TaskRunner9_runTasksEv"},{"b":"7F7468D1C000","o":"1477095","s":"_ZN5mongo10ThreadPool10_doOneTaskEPSt11unique_lockISt5mutexE"},{"b":"7F7468D1C000","o":"1477CE0","s":"_ZN5mongo10ThreadPool13_consumeTasksEv"},{"b":"7F7468D1C000","o":"1478889","s":"_ZN5mongo10ThreadPool17_workerThreadBodyEPS0_RKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE"},{"b":"7F7468D1C000","o":"1F591A0","s":"execute_native_thread_routine"},{"b":"7F7466513000","o":"7AA1"},{"b":"7F746617F000","o":"E8AAD","s":"clone"}],"processInfo":{ "mongodbVersion" : "3.3.9-288-g4d826ac-patch-578d0d2a3ff1223bf000088d", "gitVersion" : "4d826acb5648a78d0af0fefac5abe6fbbe7c854a", "compiledModules" : [ "enterprise" ], "uname" : { "sysname" : "Linux", "release" : "2.6.32-220.el6.x86_64", "version" : "#1 SMP Wed Nov 9 08:03:13 EST 2011", "machine" : "x86_64" }, "somap" : [ { "b" : "7F7468D1C000", "elfType" : 3, "buildId" : "68150E1FD71C21DB59390408E26CE3CFD6982AC4" }, { "b" : "7FFF75EFF000", "elfType" : 3, "buildId" : "08F634A1D22DEFF00461D50A7699DACDC97657BF" }, { "b" : "7F38098DF000", "path" : "/usr/lib64/libsasl2.so.2", "elfType" : 3, "buildId" : "E0AEE889D5BF1373F2F9EE0D448DBF3F5B5113F0" }, { "b" : "7F3809E9B000", "path" : "/lib64/libgssapi_krb5.so.2", "elfType" : 3, "buildId" : 
"9B852585C66329AA02EFB28497E652A40F538E78" }, { "b" : "7F746844F000", "path" : "/usr/lib64/libnetsnmpagent.so.20", "elfType" : 3, "buildId" : "1270BB069D761BD79C79F8986BB3ED5DCAA7D06D" }, { "b" : "7F7468229000", "path" : "/usr/lib64/libnetsnmphelpers.so.20", "elfType" : 3, "buildId" : "3FA4F246A7DF00EC1355C5226C9308DC7B4AB5CD" }, { "b" : "7F7467D61000", "path" : "/usr/lib64/libnetsnmpmibs.so.20", "elfType" : 3, "buildId" : "5CDE827E0A3BF5B1BBCAB619BD0A5A4DE86AA511" }, { "b" : "7F7467A86000", "path" : "/usr/lib64/libnetsnmp.so.20", "elfType" : 3, "buildId" : "241B66A75577DF2181DEDA9C3A1AC6C43079E3DA" }, { "b" : "7F3807C35000", "path" : "/lib64/libldap-2.4.so.2", "elfType" : 3, "buildId" : "1FA3BC4E18EEEB915FDD4E9BE33D0542C3FB2804" }, { "b" : "7F3806E26000", "path" : "/lib64/liblber-2.4.so.2", "elfType" : 3, "buildId" : "244D2593BDE4FE657BC88572DB5DA88FA274B7F3" }, { "b" : "7F74673BA000", "path" : "/usr/lib64/libssl.so.10", "elfType" : 3, "buildId" : "7C5A504A21B221F299B1C45B9ED9C2340AEC6AEB" }, { "b" : "7F7466FD6000", "path" : "/usr/lib64/libcrypto.so.10", "elfType" : 3, "buildId" : "2CF03CE94B9388A10544E4EF073450851A4D6AEB" }, { "b" : "7F380D9CE000", "path" : "/lib64/librt.so.1", "elfType" : 3, "buildId" : "86FE5BC1F46B8F8AA9A7A479FF991900DB93F720" }, { "b" : "7F380E7CA000", "path" : "/lib64/libdl.so.2", "elfType" : 3, "buildId" : "045D39F19533291EBD72D0EE0247F9D49BE2521E" }, { "b" : "7F380D146000", "path" : "/lib64/libm.so.6", "elfType" : 3, "buildId" : "B60EF3FCE5D4D2D8BAD2585D5CAAA1167B35DBFD" }, { "b" : "7F380C330000", "path" : "/lib64/libgcc_s.so.1", "elfType" : 3, "buildId" : "9A6E4BDFA184364D81F7DFD789474C3FB8F98A00" }, { "b" : "7F380D913000", "path" : "/lib64/libpthread.so.0", "elfType" : 3, "buildId" : "B3BD4C596D72FCBE4607C86FEEC14F47B46D0DCC" }, { "b" : "7F380D97F000", "path" : "/lib64/libc.so.6", "elfType" : 3, "buildId" : "24D3AB3DB0F38C7515FEADF82191651DA4117A18" }, { "b" : "7F3810AF9000", "path" : "/lib64/ld-linux-x86-64.so.2", "elfType" : 3, 
"buildId" : "F3EEBD18E66EB139EA4D76CDFA86D643ABCF0070" }, { "b" : "7F380B765000", "path" : "/lib64/libresolv.so.2", "elfType" : 3, "buildId" : "DDF6449707FD4C49DDE32A293EEE9AC218BFC460" }, { "b" : "7F380AD2E000", "path" : "/lib64/libcrypt.so.1", "elfType" : 3, "buildId" : "B21E32412356755F1851BAE219A0C8EFDAEEC686" }, { "b" : "7F3807A47000", "path" : "/lib64/libkrb5.so.3", "elfType" : 3, "buildId" : "EF3AACAFD6BF71BB861F194C1559153FB0B020E2" }, { "b" : "7F380841B000", "path" : "/lib64/libk5crypto.so.3", "elfType" : 3, "buildId" : "DDE6774979156442185836150FC0785170F8001F" }, { "b" : "7F380B217000", "path" : "/lib64/libcom_err.so.2", "elfType" : 3, "buildId" : "13FFCD68952B7715DDF34C9321D82E3041EA9006" }, { "b" : "7F380700C000", "path" : "/lib64/libkrb5support.so.0", "elfType" : 3, "buildId" : "A23DAFBCE170763BF1E836A8B26113F9CD20E0DA" }, { "b" : "7F3806609000", "path" : "/lib64/libkeyutils.so.1", "elfType" : 3, "buildId" : "3BCCABE75DC61BBA81AAE45D164E26EF4F9F55DB" }, { "b" : "7F7464FFE000", "path" : "/lib64/libwrap.so.0", "elfType" : 3, "buildId" : "083332F88CF3C61AB0184D8F397FC8BFF4548D8E" }, { "b" : "7F380BC93000", "path" : "/usr/lib64/perl5/CORE/libperl.so", "elfType" : 3, "buildId" : "545478030DF991A635CC5E3258A3F5D8A7E94561" }, { "b" : "7F380967A000", "path" : "/lib64/libnsl.so.1", "elfType" : 3, "buildId" : "BC86E56751E93653BD1C92975968937148A407CD" }, { "b" : "7F380B477000", "path" : "/lib64/libutil.so.1", "elfType" : 3, "buildId" : "82DEB5906312B8D8F888D206DE11BC6B6FDF57D8" }, { "b" : "7F3805A0C000", "path" : "/usr/lib64/librpm.so.1", "elfType" : 3, "buildId" : "C65174824A80EDE5374CFF6143C808807160CA63" }, { "b" : "7F38063DD000", "path" : "/usr/lib64/librpmio.so.1", "elfType" : 3, "buildId" : "F858A331FA080C7E82549BE3191EB4BADE02A5C0" }, { "b" : "7F38099D4000", "path" : "/lib64/libpopt.so.0", "elfType" : 3, "buildId" : "E7B49911F1136073DD7DC58E8118CD9A4FBE2A19" }, { "b" : "7F380AFBE000", "path" : "/lib64/libz.so.1", "elfType" : 3, "buildId" : 
"D053BB4FF0C2FC983842F81598813B9B931AD0D1" }, { "b" : "7F7463DAE000", "path" : "/usr/lib64/libsensors.so.4", "elfType" : 3, "buildId" : "6855E5BF5B3634C15F01B1043BD892D727EE3C08" }, { "b" : "7F3806F6B000", "path" : "/usr/lib64/libssl3.so", "elfType" : 3, "buildId" : "9080D18543F337F6F6B5C5265B1A3D2073A0FFBF" }, { "b" : "7F380693E000", "path" : "/usr/lib64/libsmime3.so", "elfType" : 3, "buildId" : "DE75A3731E7ABC427888BA8D38E96606264FBEBB" }, { "b" : "7F38079FE000", "path" : "/usr/lib64/libnss3.so", "elfType" : 3, "buildId" : "0375F2A6DA6EDCF870C52584B71798AC9003CFF2" }, { "b" : "7F3806BD2000", "path" : "/usr/lib64/libnssutil3.so", "elfType" : 3, "buildId" : "F3A25CFCCA8191255ECFCFCD62248E393AFF3D01" }, { "b" : "7F38085CE000", "path" : "/lib64/libplds4.so", "elfType" : 3, "buildId" : "1D3CD12F36DFB9E232953D3B73C34F8C0EF1004D" }, { "b" : "7F3807BC9000", "path" : "/lib64/libplc4.so", "elfType" : 3, "buildId" : "535FB904872A936ECC2E926C612B1B2BFD0FB722" }, { "b" : "7F3806D8B000", "path" : "/lib64/libnspr4.so", "elfType" : 3, "buildId" : "29B15E2260EA9A50E0993DEEF7ABD8334F37E6B9" }, { "b" : "7F3806788000", "path" : "/lib64/libfreebl3.so", "elfType" : 3, "buildId" : "58BAC04A1DB3964A8F594EFFBE4838AD01214EDC" }, { "b" : "7F3808D69000", "path" : "/lib64/libselinux.so.1", "elfType" : 3, "buildId" : "B4576BE308DDCF7BC31F7304E4734C3D846D0236" }, { "b" : "7F3806358000", "path" : "/lib64/libbz2.so.1", "elfType" : 3, "buildId" : "1250B1D041DD7552F0C870BB188DC3A34DF2651D" }, { "b" : "7F3807542000", "path" : "/usr/lib64/libelf.so.1", "elfType" : 3, "buildId" : "50517407A07B8D6C9A55A392E99246B52E8BFEEA" }, { "b" : "7F3805321000", "path" : "/usr/lib64/liblzma.so.0", "elfType" : 3, "buildId" : "6FF9BAEEEE9DDEEF2DFA5CBD36147A75891C0AD4" }, { "b" : "7F3803CF4000", "path" : "/usr/lib64/liblua-5.1.so", "elfType" : 3, "buildId" : "6BDB4E1990D6EBA12A5C8D39A7650DB8798BF568" }, { "b" : "7F38042F0000", "path" : "/lib64/libcap.so.2", "elfType" : 3, "buildId" : 
"A436538388F1F25113FDA834CA2EED524EFA17D6" }, { "b" : "7F38048E8000", "path" : "/lib64/libacl.so.1", "elfType" : 3, "buildId" : "26CC708AC7C0FC1797A2340C024F0ADD0CE054D8" }, { "b" : "7F3804D74000", "path" : "/lib64/libdb-4.7.so", "elfType" : 3, "buildId" : "54DB4E3C4EC743FE95DD31C9D312E2898724577E" }, { "b" : "7F3805F6F000", "path" : "/lib64/libattr.so.1", "elfType" : 3, "buildId" : "8EF0683858704EF173AB11B1E27076F37F82B7B6" } ] }}
[ReplicaSetFixture:job9:initsync]  mongod(_ZN5mongo15printStackTraceERSo+0x41) [0x7f746a214711]
[ReplicaSetFixture:job9:initsync]  mongod(+0x14F74A9) [0x7f746a2134a9]
[ReplicaSetFixture:job9:initsync]  mongod(+0x14F798D) [0x7f746a21398d]
[ReplicaSetFixture:job9:initsync]  libpthread.so.0(+0xF7E0) [0x7f74665227e0]
[ReplicaSetFixture:job9:initsync]  libc.so.6(gsignal+0x35) [0x7f74661b15e5]
[ReplicaSetFixture:job9:initsync]  libc.so.6(abort+0x175) [0x7f74661b2dc5]
[ReplicaSetFixture:job9:initsync]  mongod(_ZN5mongo17invariantOKFailedEPKcRKNS_6StatusES1_j+0x0) [0x7f746951a4d0]
[ReplicaSetFixture:job9:initsync]  mongod(_ZN5mongo10LockerImplILb0EE19assertEmptyAndResetEv+0xC8) [0x7f746980f2c8]
[ReplicaSetFixture:job9:initsync]  mongod(_ZN5mongo20OperationContextImplD1Ev+0x34) [0x7f7469a368d4]
[ReplicaSetFixture:job9:initsync]  mongod(_ZN5mongo20OperationContextImplD0Ev+0x11) [0x7f7469a36a01]
[ReplicaSetFixture:job9:initsync]  mongod(_ZN5mongo4repl10TaskRunner9_runTasksEv+0x115) [0x7f7469d78965]
[ReplicaSetFixture:job9:initsync]  mongod(_ZN5mongo10ThreadPool10_doOneTaskEPSt11unique_lockISt5mutexE+0x135) [0x7f746a193095]
[ReplicaSetFixture:job9:initsync]  mongod(_ZN5mongo10ThreadPool13_consumeTasksEv+0xC0) [0x7f746a193ce0]
[ReplicaSetFixture:job9:initsync]  mongod(_ZN5mongo10ThreadPool17_workerThreadBodyEPS0_RKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE+0x149) [0x7f746a194889]
[ReplicaSetFixture:job9:initsync]  mongod(execute_native_thread_routine+0x20) [0x7f746ac751a0]
[ReplicaSetFixture:job9:initsync]  libpthread.so.0(+0x7AA1) [0x7f746651aaa1]
[ReplicaSetFixture:job9:initsync]  libc.so.6(clone+0x6D) [0x7f7466267aad]
[ReplicaSetFixture:job9:initsync] -----  END BACKTRACE  -----



 Comments   
Comment by Githook User [ 19/Sep/16 ]

Author: Scott Hernandez (scotthernandez, scotthernandez@gmail.com)

Message: SERVER-25131: release resource in destructor
Branch: master
https://github.com/mongodb/mongo/commit/5c3d5f81ad21f072f4da71d87af1d149686126ef

Comment by Scott Hernandez (Inactive) [ 15/Sep/16 ]

I see the problem. We failed to release resources in the destructor: the network request tripped the validation failure while the collection clone was still active.

Comment by David Storch [ 15/Sep/16 ]

The failed insert in the test that Judah attached is due to SERVER-25969, for which a fix is in progress. Once that fix is in, the attached repro will no longer trip the lock_state.cpp invariant.

Comment by Judah Schvimer [ 15/Sep/16 ]

It looks like it is still possible to hit this problem. Attached is a repro with decimal support in test.js.

Comment by Githook User [ 06/Sep/16 ]

Author: Scott Hernandez (scotthernandez, scotthernandez@gmail.com)

Message: SERVER-25252 SERVER-25131: enable blacklisted test.
Branch: master
https://github.com/mongodb/mongo/commit/27551f97e11e611f894fa1dfedf4d93c83962dd4

Comment by Githook User [ 06/Sep/16 ]

Author: Scott Hernandez (scotthernandez, scotthernandez@gmail.com)

Message: SERVER-25131: release collection/db locks on collection clone failure.
Branch: master
https://github.com/mongodb/mongo/commit/815e16eace8c40db7eed5ad3a6902027f1212e2a

Comment by Judah Schvimer [ 31/Aug/16 ]

The bug is that when the task scheduled on the TaskRunner for insertDocuments fails (here due to a document validation failure), the TaskRunner resets the OperationContext but the locks held by the task are not released. We want to release the OperationContext on task failures, so we must release the locks first.
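
For illustration only, here is a minimal sketch of the general pattern at issue: tying lock release to a destructor so that a failing task cannot leave locks behind. It is not the actual patch and does not use MongoDB's real classes. ToyLocker, ScopedLock, and runTaskWithLocks are hypothetical names invented for this example, and the failure is modeled with a C++ exception as an assumption, which may not match how the server reports the error.

```cpp
#include <cassert>
#include <functional>
#include <set>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical stand-in for per-operation lock state (NOT MongoDB's Locker).
class ToyLocker {
public:
    void lock(const std::string& resource) { _held.insert(resource); }
    void unlock(const std::string& resource) { _held.erase(resource); }
    // Loosely analogous to the "_requests.empty()" check in the invariant.
    bool empty() const { return _held.empty(); }

private:
    std::set<std::string> _held;
};

// RAII guard: acquires in the constructor and releases in the destructor,
// so the release runs on every exit path, including error paths.
class ScopedLock {
public:
    ScopedLock(ToyLocker& locker, std::string resource)
        : _locker(locker), _resource(std::move(resource)) {
        _locker.lock(_resource);
    }
    ~ScopedLock() { _locker.unlock(_resource); }

    ScopedLock(const ScopedLock&) = delete;
    ScopedLock& operator=(const ScopedLock&) = delete;

private:
    ToyLocker& _locker;
    std::string _resource;
};

// Runs a task while holding a database lock and a collection lock.
// Returns false if the task fails (modeled here as a thrown exception).
bool runTaskWithLocks(ToyLocker& locker, const std::function<void()>& task) {
    try {
        ScopedLock dbLock(locker, "db");
        ScopedLock collLock(locker, "collection");
        task();  // may throw, e.g. on a document validation failure
        return true;
    } catch (const std::exception&) {
        // Stack unwinding has already destroyed collLock, then dbLock,
        // so no locks are held by the time the failure is handled.
        return false;
    }
}
```

With this shape, a caller that tears down its per-operation state after a failed task finds the locker empty, which is the condition the invariant in the description checks.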

Comment by Judah Schvimer [ 18/Jul/16 ]

To reproduce, use mongorestore with the --bypassDocumentValidation flag to load the attached bypassdocvalidation.tar.gz dump directory into one node, then run initial sync on another node.

The setupInitSync.sh script will do this for you with the untarred bypassdocvalidation.tar.gz dump directory.
