[SERVER-20625] mapreduce.shardedfinish command can trigger invariant if the ShardRegistry is out of date
Created: 24/Sep/15  Updated: 25/Jan/17  Resolved: 25/Sep/15

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: None
Fix Version/s: 3.1.9

Type: Bug
Priority: Major - P3
Reporter: J Rassi
Assignee: Spencer Brody (Inactive)
Resolution: Done
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
related to SERVER-20644 Invariant failure in sharding_initial... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Sprint: Sharding A (10/09/15)
Participants:

 Description   

An invariant failure in sharding_connection_hook.cpp can be triggered by running mapReduce against an unsharded collection through mongos with the {out: {replace: ...}} option. This affects the development branch only.

Reproduce with:

python buildscripts/resmoke.py --executor=sharding_jscore_passthrough --repeat=5 jstests/core/mr_replaceIntoDB.js


Excerpt:

[ShardedClusterFixture:job0:shard0] 2015-09-24T15:55:24.465-0400 I NETWORK  [initandlisten] connection accepted from 127.0.0.1:60073 #14 (9 connections now open)
[ShardedClusterFixture:job0:shard0] 2015-09-24T15:55:24.466-0400 I -        [conn9] Invariant failure shard src/mongo/s/client/sharding_connection_hook.cpp 100
[ShardedClusterFixture:job0:shard0] 2015-09-24T15:55:24.466-0400 I -        [conn9]
[ShardedClusterFixture:job0:shard0] 
[ShardedClusterFixture:job0:shard0] ***aborting after invariant() failure
[ShardedClusterFixture:job0:shard0] 
[ShardedClusterFixture:job0:shard0] 
[ShardedClusterFixture:job0:shard0] 2015-09-24T15:55:24.484-0400 F -        [conn9] Got signal: 6 (Aborted).
[ShardedClusterFixture:job0:shard0] 
[ShardedClusterFixture:job0:shard0]  0x1264862 0x1263779 0x1263f82 0x7f2f01256b10 0x7f2f00f21265 0x7f2f00f22d10 0x11fde2b 0x11434eb 0x964bb9 0x95d147 0x95e278 0x114221b 0x954d1d 0x956208 0x95677e 0x113af57 0x9863c9 0xab92e4 0xb16ca7 0xb17b7d 0xa6ccbd 0xc2fe96 0x8ffe7d 0x121ef55 0x7f2f0124e73d 0x7f2f00fc4d1d
[ShardedClusterFixture:job0:shard0] ----- BEGIN BACKTRACE -----
[ShardedClusterFixture:job0:shard0] {"backtrace":[{"b":"400000","o":"E64862"},{"b":"400000","o":"E63779"},{"b":"400000","o":"E63F82"},{"b":"7F2F01248000","o":"EB10"},{"b":"7F2F00EF1000","o":"30265"},{"b":"7F2F00EF1000","o":"31D10"},{"b":"400000","o":"DFDE2B"},{"b":"400000","o":"D434EB"},{"b":"400000","o":"564BB9"},{"b":"400000","o":"55D147"},{"b":"400000","o":"55E278"},{"b":"400000","o":"D4221B"},{"b":"400000","o":"554D1D"},{"b":"400000","o":"556208"},{"b":"400000","o":"55677E"},{"b":"400000","o":"D3AF57"},{"b":"400000","o":"5863C9"},{"b":"400000","o":"6B92E4"},{"b":"400000","o":"716CA7"},{"b":"400000","o":"717B7D"},{"b":"400000","o":"66CCBD"},{"b":"400000","o":"82FE96"},{"b":"400000","o":"4FFE7D"},{"b":"400000","o":"E1EF55"},{"b":"7F2F01248000","o":"673D"},{"b":"7F2F00EF1000","o":"D3D1D"}],"processInfo":{ "mongodbVersion" : "3.1.9-pre-", "gitVersion" : "38f6c23578438a2d67c0846f9623740513829c33", "compiledModules" : [], "uname" : { "sysname" : "Linux", "release" : "2.6.32-431.3.1.el6.x86_64", "version" : "#1 SMP Fri Jan 3 21:39:27 UTC 2014", "machine" : "x86_64" }, "somap" : [ { "elfType" : 2, "b" : "400000" }, { "b" : "7FFFB2A7E000", "elfType" : 3 }, { "b" : "7F2F01DF8000", "path" : "/lib64/librt.so.1", "elfType" : 3 }, { "b" : "7F2F01BF4000", "path" : "/lib64/libdl.so.2", "elfType" : 3 }, { "b" : "7F2F018F4000", "path" : "/usr/lib64/libstdc++.so.6", "elfType" : 3 }, { "b" : "7F2F01671000", "path" : "/lib64/libm.so.6", "elfType" : 3 }, { "b" : "7F2F01463000", "path" : "/lib64/libgcc_s.so.1", "elfType" : 3 }, { "b" : "7F2F01248000", "path" : "/lib64/libpthread.so.0", "elfType" : 3 }, { "b" : "7F2F00EF1000", "path" : "/lib64/libc.so.6", "elfType" : 3 }, { "b" : "7F2F02001000", "path" : "/lib64/ld-linux-x86-64.so.2", "elfType" : 3 } ] }}
[ShardedClusterFixture:job0:shard0]  mongod(_ZN5mongo15printStackTraceERSo+0x32) [0x1264862]
[ShardedClusterFixture:job0:shard0]  mongod(+0xE63779) [0x1263779]
[ShardedClusterFixture:job0:shard0]  mongod(+0xE63F82) [0x1263f82]
[ShardedClusterFixture:job0:shard0]  libpthread.so.0(+0xEB10) [0x7f2f01256b10]
[ShardedClusterFixture:job0:shard0]  libc.so.6(gsignal+0x35) [0x7f2f00f21265]
[ShardedClusterFixture:job0:shard0]  libc.so.6(abort+0x110) [0x7f2f00f22d10]
[ShardedClusterFixture:job0:shard0]  mongod(_ZN5mongo15invariantFailedEPKcS1_j+0xCB) [0x11fde2b]
[ShardedClusterFixture:job0:shard0]  mongod(+0xD434EB) [0x11434eb]
[ShardedClusterFixture:job0:shard0]  mongod(_ZN5mongo20DBClientWithCommands22runCommandWithMetadataENS_10StringDataES1_RKNS_7BSONObjES4_+0xF9) [0x964bb9]
[ShardedClusterFixture:job0:shard0]  mongod(_ZN5mongo20DBClientWithCommands10runCommandERKSsRKNS_7BSONObjERS3_i+0x1A7) [0x95d147]
[ShardedClusterFixture:job0:shard0]  mongod(_ZN5mongo18DBClientConnection10runCommandERKSsRKNS_7BSONObjERS3_i+0x18) [0x95e278]
[ShardedClusterFixture:job0:shard0]  mongod(_ZN5mongo22ShardingConnectionHook8onCreateEPNS_12DBClientBaseE+0x5BB) [0x114221b]
[ShardedClusterFixture:job0:shard0]  mongod(_ZN5mongo16DBConnectionPool8onCreateEPNS_12DBClientBaseE+0x2D) [0x954d1d]
[ShardedClusterFixture:job0:shard0]  mongod(_ZN5mongo16DBConnectionPool13_finishCreateERKSsdPNS_12DBClientBaseE+0x128) [0x956208]
[ShardedClusterFixture:job0:shard0]  mongod(_ZN5mongo16DBConnectionPool3getERKSsd+0x16E) [0x95677e]
[ShardedClusterFixture:job0:shard0]  mongod(_ZN5mongo15ShardConnectionC2ERKNS_16ConnectionStringERKSsSt10shared_ptrINS_12ChunkManagerEE+0x297) [0x113af57]
[ShardedClusterFixture:job0:shard0]  mongod(_ZN5mongo27ParallelSortClusteredCursor8_oldInitEv+0x599) [0x9863c9]
[ShardedClusterFixture:job0:shard0]  mongod(_ZN5mongo2mr22MapReduceFinishCommand3runEPNS_16OperationContextERKSsRNS_7BSONObjEiRSsRNS_14BSONObjBuilderE+0xAF4) [0xab92e4]
[ShardedClusterFixture:job0:shard0]  mongod(_ZN5mongo7Command3runEPNS_16OperationContextERKNS_3rpc16RequestInterfaceEPNS3_21ReplyBuilderInterfaceE+0x2B7) [0xb16ca7]
[ShardedClusterFixture:job0:shard0]  mongod(_ZN5mongo7Command11execCommandEPNS_16OperationContextEPS0_RKNS_3rpc16RequestInterfaceEPNS4_21ReplyBuilderInterfaceE+0x4ED) [0xb17b7d]
[ShardedClusterFixture:job0:shard0]  mongod(_ZN5mongo11runCommandsEPNS_16OperationContextERKNS_3rpc16RequestInterfaceEPNS2_21ReplyBuilderInterfaceE+0x1AD) [0xa6ccbd]
[ShardedClusterFixture:job0:shard0]  mongod(_ZN5mongo16assembleResponseEPNS_16OperationContextERNS_7MessageERNS_10DbResponseERKNS_11HostAndPortE+0xC26) [0xc2fe96]
[ShardedClusterFixture:job0:shard0]  mongod(_ZN5mongo16MyMessageHandler7processERNS_7MessageEPNS_21AbstractMessagingPortE+0xDD) [0x8ffe7d]
[ShardedClusterFixture:job0:shard0]  mongod(_ZN5mongo17PortMessageServer17handleIncomingMsgEPv+0x265) [0x121ef55]
[ShardedClusterFixture:job0:shard0]  libpthread.so.0(+0x673D) [0x7f2f0124e73d]
[ShardedClusterFixture:job0:shard0]  libc.so.6(clone+0x6D) [0x7f2f00fc4d1d]
[ShardedClusterFixture:job0:shard0] -----  END BACKTRACE  -----

git bisect points to 75f185e8 as the first bad commit.



 Comments   
Comment by Githook User [ 25/Sep/15 ]

Author: Spencer T Brody (stbrody) <spencer@mongodb.com>

Message: SERVER-20625 SERVER-20644 Don't crash if trying to talking to a server that the ShardRegistry doesn't know about
Branch: master
https://github.com/mongodb/mongo/commit/2f2886647d6340ca4b364599e5a455442e8d5656

Comment by Githook User [ 24/Sep/15 ]

Author: Spencer T Brody (stbrody) <spencer@mongodb.com>

Message: SERVER-20625 Need to refresh ShardRegistry in mapreduce.shardedfinish command if the mongod isn't aware of any of the shards it needs to talk to
Branch: master
https://github.com/mongodb/mongo/commit/b332a0f555e0d332d3b3d77878187597a23e140b
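
The two commit messages above describe the fix only in words: refresh the ShardRegistry when the mongod does not know a shard it needs to talk to, and do not crash if the server is still unknown afterwards. The following is a minimal sketch of that refresh-on-miss pattern, not the actual patch. Every name in it (ToyShardRegistry, ShardInfo, lookupWithRefresh, the loader callback, the host strings) is hypothetical and chosen for illustration; the real ShardRegistry interface is not shown in this ticket.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for a shard entry; not the real mongo::Shard.
struct ShardInfo {
    std::string id;
    std::string host;
};

// Hypothetical, simplified registry: a cache of host -> shard that can be
// reloaded from an authoritative source (assumed here to be a callback; in a
// real cluster this role is played by the config servers).
class ToyShardRegistry {
public:
    using Loader = std::function<std::map<std::string, ShardInfo>()>;

    explicit ToyShardRegistry(Loader loader) : _loader(std::move(loader)) {}

    // Replace the cached view with a fresh copy from the loader.
    void reload() {
        _byHost = _loader();
        ++_reloads;
    }

    // Look up by host. On a miss, reload once and retry. Returns nullptr
    // instead of aborting if the host is still unknown after the reload.
    std::shared_ptr<const ShardInfo> lookupWithRefresh(const std::string& host) {
        if (auto shard = lookupCached(host)) {
            return shard;
        }
        reload();
        return lookupCached(host);
    }

    // Number of reloads performed so far (useful for checking behavior).
    int reloads() const { return _reloads; }

private:
    std::shared_ptr<const ShardInfo> lookupCached(const std::string& host) const {
        auto it = _byHost.find(host);
        if (it == _byHost.end()) {
            return nullptr;
        }
        return std::make_shared<const ShardInfo>(it->second);
    }

    Loader _loader;
    std::map<std::string, ShardInfo> _byHost;  // starts empty, i.e. stale
    int _reloads = 0;
};

// Usage sketch: the cache starts out stale while the loader knows one shard.
inline bool demoRefreshOnMiss() {
    ToyShardRegistry registry([] {
        return std::map<std::string, ShardInfo>{
            {"localhost:20000", ShardInfo{"shard0000", "localhost:20000"}}};
    });
    auto known = registry.lookupWithRefresh("localhost:20000");    // miss, reload, hit
    auto unknown = registry.lookupWithRefresh("localhost:29999");  // unknown even after reload
    return known && known->id == "shard0000" && !unknown;
}
```

In this sketch the caller decides how to handle a null result (for example by returning an error to the client), which mirrors the intent of "don't crash" in the second commit message without claiming to reproduce how the server implements it.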
