[SERVER-21644] Invariant failure iter != _lockMap.end() in legacy_dist_lock_manager.cpp Created: 23/Nov/15  Updated: 25/Jan/17  Resolved: 08/Dec/15

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: None
Fix Version/s: 3.2.1, 3.3.0

Type: Bug Priority: Major - P3
Reporter: Kaloian Manassiev Assignee: Randolph Tan
Resolution: Done Votes: 0
Labels: code-only
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: File fsm_all_sharded_replication_legacy_config_servers_with_balancer.log    
Issue Links:
Depends
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Completed:
Sprint: Sharding D (12/11/15)
Participants:
Linked BF Score: 0

 Description   

https://evergreen.mongodb.com/task/mongodb_mongo_master_enterprise_rhel_62_64_bit_concurrency_sharded_WT_e3cd63fcae3deb1140941a51c85564f098062a23_15_11_23_18_12_36

This was uncovered by the fsm_all_sharded_replication_legacy_config_servers_with_balancer.js test. The mongod crashed with the following invariant failure:

2015-11-23T18:45:12.809+0000 d20013| 2015-11-23T18:45:12.740+0000 I -        [conn86] Invariant failure iter != _lockMap.end() src/mongo/s/catalog/legacy/legacy_dist_lock_manager.cpp 182
 2015-11-23T18:45:12.809+0000 d20013| 2015-11-23T18:45:12.740+0000 I -        [conn86]
 ...
 2015-11-23T18:45:12.811+0000 d20013|  0x137f942 0x137e889 0x137f092 0x3aa100f790 0x3aa0c32625 0x3aa0c33e05 0x130774b 0x11c1590 0x118fd18 0xfe6d8b 0x126b3f7 0xba4eea 0xba5b86 0xaff3d0 0xcb921d 0x98a86c 0x132b9ed 0x3aa1007a51 0x3aa0ce893d
 2015-11-23T18:45:12.811+0000 d20013| ----- BEGIN BACKTRACE -----
 2015-11-23T18:45:12.822+0000 d20013| {"backtrace":[{"b":"400000","o":"F7F942"},{"b":"400000","o":"F7E889"},{"b":"400000","o":"F7F092"},{"b":"3AA1000000","o":"F790"},{"b":"3AA0C00000","o":"32625"},{"b":"3AA0C00000","o":"33E05"},{"b":"400000","o":"F0774B"},{"b":"400000","o":"DC1590"},{"b":"400000","o":"D8FD18"},{"b":"400000","o":"BE6D8B"},{"b":"400000","o":"E6B3F7"},{"b":"400000","o":"7A4EEA"},{"b":"400000","o":"7A5B86"},{"b":"400000","o":"6FF3D0"},{"b":"400000","o":"8B921D"},{"b":"400000","o":"58A86C"},{"b":"400000","o":"F2B9ED"},{"b":"3AA1000000","o":"7A51"},{"b":"3AA0C00000","o":"E893D"}],"processInfo":{ "mongodbVersion" : "3.2.0-rc3-103-ge3cd63f", "gitVersion" : "e3cd63fcae3deb1140941a51c85564f098062a23", "compiledModules" : [ "enterprise" ], "uname" : { "sysname" : "Linux", "release" : "2.6.32-220.el6.x86_64", "version" : "#1 SMP Wed Nov 9 08:03:13 EST 2011", "machine" : "x86_64" }, "somap" : [ { "elfType" : 2, "b" : "400000", "buildId" : "CE2EBCEACDDF77CDBBC11D3208BF858DEBC7137B" }, { "b" : "7FFF87AFF000", "elfType" : 3, "buildId" : "08F634A1D22DEFF00461D50A7699DACDC97657BF" }, { "b" : "7FCAEB3F2000", "path" : "/usr/lib64/libnetsnmpagent.so.20", "elfType" : 3, "buildId" : "E4E49DE2554F02ACF2728D1748874101B0709B3A" }, { "b" : "7FCAEB1CB000", "path" : "/usr/lib64/libnetsnmphelpers.so.20", "elfType" : 3, "buildId" : "17A35AEE324676929C7A5C8B4CE54443ED10AC07" }, { "b" : "7FCAEAD03000", "path" : "/usr/lib64/libnetsnmpmibs.so.20", "elfType" : 3, "buildId" : "78A49421FA60389F8C774BE68F5EF17DF2BD9CE3" }, { "b" : "7FCAEAA29000", "path" : "/usr/lib64/libnetsnmp.so.20", "elfType" : 3, "buildId" : "4CB6272BCAC2270393F559F67E8ED321690F79D5" }, { "path" : "/usr/lib64/libsasl2.so.2", "elfType" : 3, "buildId" : "E0AEE889D5BF1373F2F9EE0D448DBF3F5B5113F0" }, { "path" : "/lib64/libgssapi_krb5.so.2", "elfType" : 3, "buildId" : "0C72521270790A1BD52C8F6B989EEA5A575085BF" }, { "b" : "7FCAEA7BC000", "path" : "/usr/lib64/libssl.so.10", "elfType" : 3, "buildId" : 
"93610457BCF424BEBBF1F3FB44E51B51B50F2B55" }, { "b" : "7FCAEA3D8000", "path" : "/usr/lib64/libcrypto.so.10", "elfType" : 3, "buildId" : "06DDBB192AF74F99DB58F2150BFB83F42F5EBAD3" }, { "path" : "/lib64/librt.so.1", "elfType" : 3, "buildId" : "58C5A5FF5C82D7BE3113BE36DD87C7004E3C4DB1" }, { "path" : "/lib64/libdl.so.2", "elfType" : 3, "buildId" : "B5AE05CEDC0CE917F50A3A468CFA2ACD8592E8F6" }, { "path" : "/usr/lib64/libstdc++.so.6", "elfType" : 3, "buildId" : "28AF9321EBEA9D172CA43E11A60E02D0F7014870" }, { "path" : "/lib64/libm.so.6", "elfType" : 3, "buildId" : "989FE3A42CA8CEBDCC185A743896F23A0CF537ED" }, { "path" : "/lib64/libgcc_s.so.1", "elfType" : 3, "buildId" : "2AC15B051D1B8B53937E3341EA931D0E96F745D9" }, { "path" : "/lib64/libpthread.so.0", "elfType" : 3, "buildId" : "D467973C46E563CDCF64B5F12B2D6A50C7A25BA1" }, { "path" : "/lib64/libc.so.6", "elfType" : 3, "buildId" : "A6D15926E61580E250ED91F84FF7517F3970CD83" }, { "path" : "/lib64/ld-linux-x86-64.so.2", "elfType" : 3, "buildId" : "04202A4A8BE624D2193E812A25589E2DD02D5B5C" }, { "b" : "7FCAEA1CB000", "path" : "/lib64/libwrap.so.0", "elfType" : 3, "buildId" : "8C0C7CAB7F028E4592A8581EB2122FBECAB26B97" }, { "b" : "7F904765F000", "path" : "/usr/lib64/perl5/CORE/libperl.so", "elfType" : 3, "buildId" : "0A8E7D74369C1AF1F7C33B8DF8387DE5013898A4" }, { "path" : "/lib64/libresolv.so.2", "elfType" : 3, "buildId" : "F704FA7D21D05EF31E90FB4890FCA7F3D91DA138" }, { "path" : "/lib64/libnsl.so.1", "elfType" : 3, "buildId" : "CAD1498B2AA3531958C579F5CB39D8D6BFB5675B" }, { "path" : "/lib64/libcrypt.so.1", "elfType" : 3, "buildId" : "128802B73016BE233837EA9F2DCBC2153ACC2D6A" }, { "b" : "7F904845B000", "path" : "/lib64/libutil.so.1", "elfType" : 3, "buildId" : "565D9CDC6BD59EFE0156BAFE21033BE070F014DA" }, { "b" : "7F9042DEF000", "path" : "/usr/lib64/librpm.so.1", "elfType" : 3, "buildId" : "0B73153AA2E650B19153B7E8A57F9C7A965072CD" }, { "path" : "/usr/lib64/librpmio.so.1", "elfType" : 3, "buildId" : 
"7D821C87BEF03F9D7BBFE7FEE591EC5929D1C22C" }, { "b" : "7F9046BE6000", "path" : "/lib64/libpopt.so.0", "elfType" : 3, "buildId" : "E7B49911F1136073DD7DC58E8118CD9A4FBE2A19" }, { "b" : "7F90481CF000", "path" : "/lib64/libz.so.1", "elfType" : 3, "buildId" : "D053BB4FF0C2FC983842F81598813B9B931AD0D1" }, { "b" : "7FCAE93BF000", "path" : "/usr/lib64/libsensors.so.4", "elfType" : 3, "buildId" : "6855E5BF5B3634C15F01B1043BD892D727EE3C08" }, { "path" : "/lib64/libkrb5.so.3", "elfType" : 3, "buildId" : "DC11D5D89BDC77FF242481122D51E5A08DB60DA8" }, { "path" : "/lib64/libk5crypto.so.3", "elfType" : 3, "buildId" : "15782495E3AF093E67DDAE9A86436FFC6B3CC4D3" }, { "b" : "7F90469BA000", "path" : "/lib64/libcom_err.so.2", "elfType" : 3, "buildId" : "13FFCD68952B7715DDF34C9321D82E3041EA9006" }, { "b" : "7F90427AE000", "path" : "/lib64/libkrb5support.so.0", "elfType" : 3, "buildId" : "44A3A1C1891B4C8170C3DB80E7117A022E5EECD0" }, { "path" : "/lib64/libkeyutils.so.1", "elfType" : 3, "buildId" : "3BCCABE75DC61BBA81AAE45D164E26EF4F9F55DB" }, { "path" : "/lib64/libfreebl3.so", "elfType" : 3, "buildId" : "58BAC04A1DB3964A8F594EFFBE4838AD01214EDC" }, { "path" : "/usr/lib64/libnss3.so", "elfType" : 3, "buildId" : "A719876DB720919EA694995B0CB4E703E78F561F" }, { "b" : "7F904599C000", "path" : "/lib64/libbz2.so.1", "elfType" : 3, "buildId" : "1250B1D041DD7552F0C870BB188DC3A34DF2651D" }, { "b" : "7F9045385000", "path" : "/usr/lib64/libelf.so.1", "elfType" : 3, "buildId" : "1C2B39A5003E9DA8FD9C55972C06245E731E6546" }, { "path" : "/usr/lib64/liblzma.so.0", "elfType" : 3, "buildId" : "6FF9BAEEEE9DDEEF2DFA5CBD36147A75891C0AD4" }, { "b" : "7F9042558000", "path" : "/usr/lib64/liblua-5.1.so", "elfType" : 3, "buildId" : "6BDB4E1990D6EBA12A5C8D39A7650DB8798BF568" }, { "b" : "7F9046738000", "path" : "/lib64/libselinux.so.1", "elfType" : 3, "buildId" : "2D0F26E648D9661ABD83ED8B4BBE8F2CFA50393B" }, { "path" : "/lib64/libcap.so.2", "elfType" : 3, "buildId" : "A436538388F1F25113FDA834CA2EED524EFA17D6" }, { "b" 
: "7F9042930000", "path" : "/lib64/libacl.so.1", "elfType" : 3, "buildId" : "26CC708AC7C0FC1797A2340C024F0ADD0CE054D8" }, { "path" : "/lib64/libdb-4.7.so", "elfType" : 3, "buildId" : "437CA0AB593A7383FF1A1700D14AF4998FE93CF3" }, { "path" : "/usr/lib64/libnssutil3.so", "elfType" : 3, "buildId" : "3D18834CC92D576DCB1CD0F44BA62D3BFFFD52B7" }, { "path" : "/lib64/libplc4.so", "elfType" : 3, "buildId" : "C53F8B39797A277F40F582D8D11D3C2FFF7E5D1E" }, { "path" : "/lib64/libplds4.so", "elfType" : 3, "buildId" : "97F07716D324E086D43CC4D05873E1A16E020468" }, { "path" : "/lib64/libnspr4.so", "elfType" : 3, "buildId" : "7CD7DD1B6C294C61F494519CE3E0D7E114DFB36D" }, { "b" : "7F9044728000", "path" : "/lib64/libattr.so.1", "elfType" : 3, "buildId" : "8EF0683858704EF173AB11B1E27076F37F82B7B6" }, { "b" : "7FCAE811A000", "path" : "/usr/lib64/sasl2/libsasldb.so", "elfType" : 3, "buildId" : "4514552B5354286A143770420B38F2D5985D7FA1" }, { "b" : "7FCAE7F15000", "path" : "/usr/lib64/sasl2/libanonymous.so", "elfType" : 3, "buildId" : "EEAA33A75735D35F4BF25C3C2830B8C90ABDD8B5" }, { "b" : "7FCAE7D08000", "path" : "/usr/lib64/sasl2/libdigestmd5.so", "elfType" : 3, "buildId" : "34D8E3E2565DEF4A685D6976831B0372AD456993" }, { "b" : "7FCAE7B02000", "path" : "/usr/lib64/sasl2/libcrammd5.so", "elfType" : 3, "buildId" : "4CC7E695963F5C8B772EDFF456DB67F89E58FBD6" }, { "b" : "7FCAE78FD000", "path" : "/usr/lib64/sasl2/libplain.so", "elfType" : 3, "buildId" : "F8DDC7A3CA1CE5B75719AE0DC821647B609D17B6" }, { "b" : "7FCAE76F8000", "path" : "/usr/lib64/sasl2/liblogin.so", "elfType" : 3, "buildId" : "9D19F93E342AA4EE2D646E64642625F365056E5C" }, { "b" : "7FCAE74F0000", "path" : "/usr/lib64/sasl2/libgssapiv2.so", "elfType" : 3, "buildId" : "F7BCE9C6BFF4EAF0CB3142B299CF22D094CE4F04" } ] }}
 2015-11-23T18:45:12.822+0000 d20013|  mongod(mongo::printStackTrace(std::ostream&)+0x32) [0x137f942]
 2015-11-23T18:45:12.823+0000 d20013|  mongod(+0xF7E889) [0x137e889]
 2015-11-23T18:45:12.823+0000 d20013|  mongod(+0xF7F092) [0x137f092]
 2015-11-23T18:45:12.823+0000 d20013|  libpthread.so.0(+0xF790) [0x3aa100f790]
 2015-11-23T18:45:12.823+0000 d20013|  libc.so.6(gsignal+0x35) [0x3aa0c32625]
 2015-11-23T18:45:12.823+0000 d20013|  libc.so.6(abort+0x175) [0x3aa0c33e05]
 2015-11-23T18:45:12.823+0000 d20013|  mongod(mongo::invariantFailed(char const*, char const*, unsigned int)+0xCB) [0x130774b]
 2015-11-23T18:45:12.824+0000 d20013|  mongod(mongo::LegacyDistLockManager::unlock(mongo::OperationContext*, mongo::OID const&)+0x320) [0x11c1590]
 2015-11-23T18:45:12.824+0000 d20013|  mongod(mongo::ForwardingCatalogManager::ScopedDistLock::~ScopedDistLock()+0x48) [0x118fd18]
 2015-11-23T18:45:12.824+0000 d20013|  mongod(mongo::StatusWith<mongo::ForwardingCatalogManager::ScopedDistLock>::~StatusWith()+0x1B) [0xfe6d8b]
 2015-11-23T18:45:12.824+0000 d20013|  mongod(mongo::SplitChunkCommand::run(mongo::OperationContext*, std::string const&, mongo::BSONObj&, int, std::string&, mongo::BSONObjBuilder&)+0x2CC7) [0x126b3f7]
 2015-11-23T18:45:12.825+0000 d20013|  mongod(mongo::Command::run(mongo::OperationContext*, mongo::rpc::RequestInterface const&, mongo::rpc::ReplyBuilderInterface*)+0x40A) [0xba4eea]
 2015-11-23T18:45:12.825+0000 d20013|  mongod(mongo::Command::execCommand(mongo::OperationContext*, mongo::Command*, mongo::rpc::RequestInterface const&, mongo::rpc::ReplyBuilderInterface*)+0x3E6) [0xba5b86]
 2015-11-23T18:45:12.825+0000 d20013|  mongod(mongo::runCommands(mongo::OperationContext*, mongo::rpc::RequestInterface const&, mongo::rpc::ReplyBuilderInterface*)+0x1F0) [0xaff3d0]
 2015-11-23T18:45:12.826+0000 d20013|  mongod(mongo::assembleResponse(mongo::OperationContext*, mongo::Message&, mongo::DbResponse&, mongo::HostAndPort const&)+0xC2D) [0xcb921d]
 2015-11-23T18:45:12.826+0000 d20013|  mongod(mongo::MyMessageHandler::process(mongo::Message&, mongo::AbstractMessagingPort*)+0xEC) [0x98a86c]
 2015-11-23T18:45:12.826+0000 d20013|  mongod(mongo::PortMessageServer::handleIncomingMsg(void*)+0x26D) [0x132b9ed]
 2015-11-23T18:45:12.826+0000 d20013|  libpthread.so.0(+0x7A51) [0x3aa1007a51]
 2015-11-23T18:45:12.826+0000 d20013|  libc.so.6(clone+0x6D) [0x3aa0ce893d]
 2015-11-23T18:45:12.826+0000 d20013| -----  END BACKTRACE  -----



 Comments   
Comment by Githook User [ 10/Dec/15 ]

Author: Randolph Tan (renctan) <randolph@10gen.com>

Message: SERVER-21644 Invariant failure iter != _lockMap.end() in legacy_dist_lock_manager.cpp

(cherry picked from commit 4fd3712ce592f895cea5aff2209e4d26f5cf3a93)
Branch: v3.2
https://github.com/mongodb/mongo/commit/205797cbd42ec4ec29954ee4442f86f7def58107

Comment by Randolph Tan [ 08/Dec/15 ]

ramon.fernandez: we should backport this to v3.2, since it makes the distributed lock for the legacy 3 config servers setup less prone to race conditions.

Comment by Githook User [ 08/Dec/15 ]

Author: Randolph Tan (renctan) <randolph@10gen.com>

Message: SERVER-21644 Invariant failure iter != _lockMap.end() in legacy_dist_lock_manager.cpp
Branch: master
https://github.com/mongodb/mongo/commit/4fd3712ce592f895cea5aff2209e4d26f5cf3a93

Comment by Randolph Tan [ 03/Dec/15 ]

Correction on last comment:

The generated OIDs are probably different. The logs show the same OID only because the logged value is derived from the OID of the current lock document. That is fine under the assumption that everything works perfectly and no two entities can own the lock at the same time. That invariant was somehow violated, and the identical OID in the logs is just a side effect of the same lock being taken by two different participants.

Comment by Kaloian Manassiev [ 23/Nov/15 ]

Looks like two threads managed to take the same distributed lock at the same time, because the OID generated was the same:

[js_test:fsm_all_sharded_replication_legacy_config_servers_with_balancer] 2015-11-23T18:45:12.806+0000 d20013| 2015-11-23T18:45:12.739+0000 I SHARDING [conn86] distributed lock 'db10.coll10/ip-10-99-163-247:20013:1448304277:68810772' acquired for 'splitting chunk [{ tid: MinKey }, { tid: MaxKey }) in db10.coll10', ts : 56535eb86202d0bae254633a
...
[js_test:fsm_all_sharded_replication_legacy_config_servers_with_balancer] 2015-11-23T18:45:12.806+0000 d20013| 2015-11-23T18:45:12.739+0000 I SHARDING [conn41] distributed lock 'db10.coll10/ip-10-99-163-247:20013:1448304277:68810772' acquired for 'splitting chunk [{ tid: MinKey }, { tid: MaxKey }) in db10.coll10', ts : 56535eb86202d0bae254633a
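
To make the failure mode concrete, here is a minimal sketch of how two holders sharing one handle can trip a check like `iter != _lockMap.end()` on the second unlock. It is a simplified model, not the actual LegacyDistLockManager code: `ToyLockTable`, `recordAcquired` and `secondUnlockFindsEntry` are hypothetical names, and it assumes (as the identical `ts` values in the two log lines suggest, but this ticket does not establish) that both holders end up using the same value as the map key. Where the real code aborts on the invariant, the sketch returns false.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical stand-in for a per-process table of held distributed locks.
class ToyLockTable {
public:
    // Records a held lock under `handle`. Returns false if the handle was
    // already present (std::map::insert does not overwrite existing keys).
    bool recordAcquired(const std::string& handle, const std::string& why) {
        return _lockMap.insert({handle, why}).second;
    }

    // Releases the lock recorded under `handle`. Returns false in the case
    // where an invariant on `iter != _lockMap.end()` would fail.
    bool unlock(const std::string& handle) {
        auto iter = _lockMap.find(handle);
        if (iter == _lockMap.end()) {
            return false;
        }
        _lockMap.erase(iter);
        return true;
    }

    std::size_t size() const { return _lockMap.size(); }

private:
    std::map<std::string, std::string> _lockMap;
};

// Replays the sequence suggested by the log lines: conn86 and conn41 both
// report the same ts, then each one releases "its" lock.
bool secondUnlockFindsEntry() {
    ToyLockTable table;
    const std::string ts = "56535eb86202d0bae254633a";
    table.recordAcquired(ts, "conn86 splitting chunk");
    table.recordAcquired(ts, "conn41 splitting chunk");  // same key, no new entry
    assert(table.size() == 1);  // only one entry exists for two holders
    table.unlock(ts);           // first holder releases; the entry is erased
    return table.unlock(ts);    // second holder looks up a key that is gone
}
```

In this model the table holds a single entry for two logical holders, so whichever holder unlocks second finds nothing to erase. That would be consistent with the crash occurring in `unlock` called from the `ScopedDistLock` destructor, though the sketch says nothing about how both threads came to hold the lock in the first place.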
