[SERVER-34347] Batch write with ranged query predicate mixed with single target triggers invariant on mongos
Created: 05/Apr/18  Updated: 29/Oct/23  Resolved: 27/Apr/18

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: 3.6.2, 3.7.3
Fix Version/s: 3.6.5, 4.0.0-rc0

Type: Bug
Priority: Major - P3
Reporter: Randolph Tan
Assignee: Janna Golden
Resolution: Fixed
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: File test.js    
Issue Links:
Backports
Depends
Duplicate
is duplicated by SERVER-36061 Mongos crashes due to invariant failure Closed
Backwards Compatibility: Fully Compatible
Backport Requested:
v3.6
Sprint: Sharding 2018-04-23, Sharding 2018-05-07
Participants:

 Description   

This happens because whenever a targeted write needs to be sent to multiple shards, the endpoint's shard version is overridden with the ignored version. If a batch mixes multi-target and single-target writes, it ends up with some endpoints that carry the original versions and others that carry the ignored version, even though they refer to the same shards. Since the batchMap is keyed by the ShardEndPoint (which is the (shard, version) pair), the existing check misses this case and the map ends up with multiple entries for the same shard. This, in turn, triggers the invariant here.
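
The following is a minimal Python sketch of the keying problem described above, not the mongos code itself. `Endpoint`, `IGNORED`, `target_writes`, and `shards_with_duplicate_entries` are hypothetical names invented for illustration; the only behavior taken from the description is that multi-shard writes get the ignored version and that the map is keyed by the (shard, version) pair.

```python
from collections import namedtuple

# Hypothetical stand-in for the (shard, version) endpoint pair.
Endpoint = namedtuple("Endpoint", ["shard", "version"])

# Placeholder for the "ignored" shard version (assumption, not the real value).
IGNORED = "IGNORED"


def target_writes(writes):
    """Group writes into a map keyed by the full (shard, version) endpoint."""
    batch_map = {}
    for name, endpoints in writes:
        is_multi_shard = len(endpoints) > 1
        for ep in endpoints:
            # Writes sent to several shards have their version overridden.
            key = Endpoint(ep.shard, IGNORED) if is_multi_shard else ep
            batch_map.setdefault(key, []).append(name)
    return batch_map


def shards_with_duplicate_entries(batch_map):
    """Return the shards that appear under more than one endpoint key."""
    seen = set()
    duplicates = set()
    for ep in batch_map:
        if ep.shard in seen:
            duplicates.add(ep.shard)
        seen.add(ep.shard)
    return duplicates


# One single-target write and one ranged write that both touch shard0.
writes = [
    ("single-target update", [Endpoint("shard0", "v1")]),
    ("ranged update", [Endpoint("shard0", "v1"), Endpoint("shard1", "v1")]),
]
batch_map = target_writes(writes)
duplicates = shards_with_duplicate_entries(batch_map)
```

In this sketch `batch_map` holds separate keys for `("shard0", "v1")` and `("shard0", "IGNORED")`, so `duplicates` contains `shard0`. That is the condition under which a later step expecting one batch per shard would fail.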

0x7ff7f36919d1 0x7ff7f3690be9 0x7ff7f36910cd 0x7ff7f108c710 0x7ff7f0d1b925 0x7ff7f0d1d105 0x7ff7f2ac7e3a 0x7ff7f2c07596 0x7ff7f2c0a6bf 0x7ff7f2c176e0 0x7ff7f2bd892a 0x7ff7f2fc8c0f 0x7ff7f2bf70f8 0x7ff7f2bf7a43 0x7ff7f2bf8129 0x7ff7f2b19851 0x7ff7f2b35eca 0x7ff7f2b31a67 0x7ff7f2b34cb1 0x7ff7f2f89492 0x7ff7f2b308d0 0x7ff7f2b32e12 0x7ff7f2b3370b 0x7ff7f2b31aed 0x7ff7f2b34cb1 0x7ff7f2f899f5 0x7ff7f35424f4 0x7ff7f10849d1 0x7ff7f0dd1b5d
----- BEGIN BACKTRACE -----
{"backtrace":[{"b":"7FF7F25C8000","o":"10C99D1","s":"_ZN5mongo15printStackTraceERSo"},{"b":"7FF7F25C8000","o":"10C8BE9"},{"b":"7FF7F25C8000","o":"10C90CD"},{"b":"7FF7F107D000","o":"F710"},{"b":"7FF7F0CE9000","o":"32925","s":"gsignal"},{"b":"7FF7F0CE9000","o":"34105","s":"abort"},{"b":"7FF7F25C8000","o":"4FFE3A","s":"_ZN5mongo17invariantOKFailedEPKcRKNS_6StatusES1_j"},{"b":"7FF7F25C8000","o":"63F596","s":"_ZN5mongo12BatchWriteOp11targetBatchERKNS_10NSTargeterEbPSt3mapINS_7ShardIdEPNS_18TargetedWriteBatchESt4lessIS5_ESaISt4pairIKS5_S7_EEE"},{"b":"7FF7F25C8000","o":"6426BF","s":"_ZN5mongo14BatchWriteExec12executeBatchEPNS_16OperationContextERNS_10NSTargeterERKNS_21BatchedCommandRequestEPNS_22BatchedCommandResponseEPNS_19BatchWriteExecStatsE"},{"b":"7FF7F25C8000","o":"64F6E0","s":"_ZN5mongo13ClusterWriter5writeEPNS_16OperationContextERKNS_21BatchedCommandRequestEPNS_19BatchWriteExecStatsEPNS_22BatchedCommandResponseE"},{"b":"7FF7F25C8000","o":"61092A"},{"b":"7FF7F25C8000","o":"A00C0F","s":"_ZN5mongo7Command9publicRunEPNS_16OperationContextERKNS_12OpMsgRequestERNS_14BSONObjBuilderE"},{"b":"7FF7F25C8000","o":"62F0F8"},{"b":"7FF7F25C8000","o":"62FA43"},{"b":"7FF7F25C8000","o":"630129","s":"_ZN5mongo8Strategy13clientCommandEPNS_16OperationContextERKNS_7MessageE"},{"b":"7FF7F25C8000","o":"551851","s":"_ZN5mongo23ServiceEntryPointMongos13handleRequestEPNS_16OperationContextERKNS_7MessageE"},{"b":"7FF7F25C8000","o":"56DECA","s":"_ZN5mongo19ServiceStateMachine15_processMessageENS0_11ThreadGuardE"},{"b":"7FF7F25C8000","o":"569A67","s":"_ZN5mongo19ServiceStateMachine15_runNextInGuardENS0_11ThreadGuardE"},{"b":"7FF7F25C8000","o":"56CCB1"},{"b":"7FF7F25C8000","o":"9C1492","s":"_ZN5mongo9transport26ServiceExecutorSynchronous8scheduleESt8functionIFvvEENS0_15ServiceExecutor13ScheduleFlagsE"},{"b":"7FF7F25C8000","o":"5688D0","s":"_ZN5mongo19ServiceStateMachine22_scheduleNextWithGuardENS0_11ThreadGuardENS_9transport15ServiceExecutor13ScheduleFlagsENS0_9OwnershipE"},{"b":"7FF7F25C8000","
o":"56AE12","s":"_ZN5mongo19ServiceStateMachine15_sourceCallbackENS_6StatusE"},{"b":"7FF7F25C8000","o":"56B70B","s":"_ZN5mongo19ServiceStateMachine14_sourceMessageENS0_11ThreadGuardE"},{"b":"7FF7F25C8000","o":"569AED","s":"_ZN5mongo19ServiceStateMachine15_runNextInGuardENS0_11ThreadGuardE"},{"b":"7FF7F25C8000","o":"56CCB1"},{"b":"7FF7F25C8000","o":"9C19F5"},{"b":"7FF7F25C8000","o":"F7A4F4"},{"b":"7FF7F107D000","o":"79D1"},{"b":"7FF7F0CE9000","o":"E8B5D","s":"clone"}],"processInfo":{ "mongodbVersion" : "3.6.3", "gitVersion" : "9586e557d54ef70f9ca4b43c26892cd55257e1a5", "compiledModules" : [], "uname" : { "sysname" : "Linux", "release" : "2.6.32-431.11.2.el6.x86_64", "version" : "#1 SMP Tue Mar 25 19:59:55 UTC 2014", "machine" : "x86_64" }, "somap" : [ { "b" : "7FF7F25C8000", "elfType" : 3, "buildId" : "AC99B16D54D319C90ACB8665F80EDBBC6A4D5CFE" }, { "b" : "7FFF1EE95000", "elfType" : 3, "buildId" : "EE208590E0E20612DDABB0A20ACAFC7F9BD06E8E" }, { "b" : "7FF7F218C000", "path" : "/lib64/libresolv.so.2", "elfType" : 3, "buildId" : "A91A53E16DEABDFE05F28F7D04DAB5FFAA013767" }, { "b" : "7FF7F1F20000", "path" : "/usr/lib64/libssl.so.10", "elfType" : 3, "buildId" : "B85DA7CEDA43F9AAAEE7D61BE9799AAC9E845358" }, { "b" : "7FF7F1B40000", "path" : "/usr/lib64/libcrypto.so.10", "elfType" : 3, "buildId" : "683AB37AB40BB722B283F23F79EB9CE8B1DA7A93" }, { "b" : "7FF7F1938000", "path" : "/lib64/librt.so.1", "elfType" : 3, "buildId" : "B26528BF6C0636AC1CAE5AC50BDBC07E60851DF4" }, { "b" : "7FF7F1734000", "path" : "/lib64/libdl.so.2", "elfType" : 3, "buildId" : "AFC7448F2F2F6ED4E5BC82B1BD8A7320B84A9D48" }, { "b" : "7FF7F14B0000", "path" : "/lib64/libm.so.6", "elfType" : 3, "buildId" : "98B028A725D6E93253F25DF00B794DFAA66A3145" }, { "b" : "7FF7F129A000", "path" : "/lib64/libgcc_s.so.1", "elfType" : 3, "buildId" : "A44499D29B114A5366CD72DD4883958495AC1C1D" }, { "b" : "7FF7F107D000", "path" : "/lib64/libpthread.so.0", "elfType" : 3, "buildId" : "211321F78CA244BE2B2B1B8584B460F9933BA76B" }, { 
"b" : "7FF7F0CE9000", "path" : "/lib64/libc.so.6", "elfType" : 3, "buildId" : "91D435CBF18DF05134203B0A7F7919718DBD1D75" }, { "b" : "7FF7F23A6000", "path" : "/lib64/ld-linux-x86-64.so.2", "elfType" : 3, "buildId" : "57BF668F99B7F5917B8D55FBB645173C9A644575" }, { "b" : "7FF7F0AA5000", "path" : "/lib64/libgssapi_krb5.so.2", "elfType" : 3, "buildId" : "50487A3480233636C29DBCAD5DE65421808948AB" }, { "b" : "7FF7F07BF000", "path" : "/lib64/libkrb5.so.3", "elfType" : 3, "buildId" : "D9A44621797C990C639FF2D5AA452AB559C277DE" }, { "b" : "7FF7F05BB000", "path" : "/lib64/libcom_err.so.2", "elfType" : 3, "buildId" : "6A22EDFF4D4F04A57573E3D1536B6B4963159CD5" }, { "b" : "7FF7F038F000", "path" : "/lib64/libk5crypto.so.3", "elfType" : 3, "buildId" : "D180B6297A9A302693053BD753A85D04A88DE811" }, { "b" : "7FF7F0179000", "path" : "/lib64/libz.so.1", "elfType" : 3, "buildId" : "5FA8E5038EC04A774AF72A9BB62DC86E1049C4D6" }, { "b" : "7FF7EFF6E000", "path" : "/lib64/libkrb5support.so.0", "elfType" : 3, "buildId" : "FF9705F60A59F28CA0FC50720A4F18FA9A889BD6" }, { "b" : "7FF7EFD6B000", "path" : "/lib64/libkeyutils.so.1", "elfType" : 3, "buildId" : "8A8734DC37305D8CC2EF8F8C3E5EA03171DB07EC" }, { "b" : "7FF7EFB4C000", "path" : "/lib64/libselinux.so.1", "elfType" : 3, "buildId" : "BAD5C71361DADF259B6E306A49E6F47F24AEA3DC" }, { "b" : "7FF7EF93E000", "path" : "/lib64/libnss_files.so.2", "elfType" : 3, "buildId" : "C630B0C85ACDA367F3830B835AD6DFA0284E1E2B" }, { "b" : "7FF7EF738000", "path" : "/lib64/libnss_dns.so.2", "elfType" : 3, "buildId" : "225B32AACBE939FC4F115BA6058712B67E3B7285" } ] }}
 mongos(_ZN5mongo15printStackTraceERSo+0x41) [0x7ff7f36919d1]
 mongos(+0x10C8BE9) [0x7ff7f3690be9]
 mongos(+0x10C90CD) [0x7ff7f36910cd]
 libpthread.so.0(+0xF710) [0x7ff7f108c710]
 libc.so.6(gsignal+0x35) [0x7ff7f0d1b925]
 libc.so.6(abort+0x175) [0x7ff7f0d1d105]
 mongos(_ZN5mongo17invariantOKFailedEPKcRKNS_6StatusES1_j+0x0) [0x7ff7f2ac7e3a]
 mongos(_ZN5mongo12BatchWriteOp11targetBatchERKNS_10NSTargeterEbPSt3mapINS_7ShardIdEPNS_18TargetedWriteBatchESt4lessIS5_ESaISt4pairIKS5_S7_EEE+0xA06) [0x7ff7f2c07596]
 mongos(_ZN5mongo14BatchWriteExec12executeBatchEPNS_16OperationContextERNS_10NSTargeterERKNS_21BatchedCommandRequestEPNS_22BatchedCommandResponseEPNS_19BatchWriteExecStatsE+0x23F) [0x7ff7f2c0a6bf]
 mongos(_ZN5mongo13ClusterWriter5writeEPNS_16OperationContextERKNS_21BatchedCommandRequestEPNS_19BatchWriteExecStatsEPNS_22BatchedCommandResponseE+0x550) [0x7ff7f2c176e0]
 mongos(+0x61092A) [0x7ff7f2bd892a]
 mongos(_ZN5mongo7Command9publicRunEPNS_16OperationContextERKNS_12OpMsgRequestERNS_14BSONObjBuilderE+0x1F) [0x7ff7f2fc8c0f]
 mongos(+0x62F0F8) [0x7ff7f2bf70f8]
 mongos(+0x62FA43) [0x7ff7f2bf7a43]
 mongos(_ZN5mongo8Strategy13clientCommandEPNS_16OperationContextERKNS_7MessageE+0x59) [0x7ff7f2bf8129]
 mongos(_ZN5mongo23ServiceEntryPointMongos13handleRequestEPNS_16OperationContextERKNS_7MessageE+0x5A1) [0x7ff7f2b19851]
 mongos(_ZN5mongo19ServiceStateMachine15_processMessageENS0_11ThreadGuardE+0xBA) [0x7ff7f2b35eca]
 mongos(_ZN5mongo19ServiceStateMachine15_runNextInGuardENS0_11ThreadGuardE+0x97) [0x7ff7f2b31a67]
 mongos(+0x56CCB1) [0x7ff7f2b34cb1]
 mongos(_ZN5mongo9transport26ServiceExecutorSynchronous8scheduleESt8functionIFvvEENS0_15ServiceExecutor13ScheduleFlagsE+0x1A2) [0x7ff7f2f89492]
 mongos(_ZN5mongo19ServiceStateMachine22_scheduleNextWithGuardENS0_11ThreadGuardENS_9transport15ServiceExecutor13ScheduleFlagsENS0_9OwnershipE+0x150) [0x7ff7f2b308d0]
 mongos(_ZN5mongo19ServiceStateMachine15_sourceCallbackENS_6StatusE+0xAF2) [0x7ff7f2b32e12]
 mongos(_ZN5mongo19ServiceStateMachine14_sourceMessageENS0_11ThreadGuardE+0x23B) [0x7ff7f2b3370b]
 mongos(_ZN5mongo19ServiceStateMachine15_runNextInGuardENS0_11ThreadGuardE+0x11D) [0x7ff7f2b31aed]
 mongos(+0x56CCB1) [0x7ff7f2b34cb1]
 mongos(+0x9C19F5) [0x7ff7f2f899f5]
 mongos(+0xF7A4F4) [0x7ff7f35424f4]
 libpthread.so.0(+0x79D1) [0x7ff7f10849d1]
 libc.so.6(clone+0x6D) [0x7ff7f0dd1b5d]
-----  END BACKTRACE  -----



 Comments   
Comment by Githook User [ 01/May/18 ]

Author:

{'email': 'golden.janna@gmail.com', 'name': 'jannaerin', 'username': 'jannaerin'}

Message: SERVER-34347 Create new batch when targeted writes batch includes same target with different shardVersion
Branch: v3.6
https://github.com/mongodb/mongo/commit/e3f09fb03596d1bcc9bd07e8e44b12a4c233205d

Comment by Githook User [ 30/Apr/18 ]

Author:

{'email': 'golden.janna@gmail.com', 'username': 'jannaerin', 'name': 'jannaerin'}

Message: SERVER-34347 Create new batch when targeted writes batch includes same target with different shardVersion
Branch: v3.6
https://github.com/mongodb/mongo/commit/e3f09fb03596d1bcc9bd07e8e44b12a4c233205d

Comment by Githook User [ 27/Apr/18 ]

Author:

{'email': 'golden.janna@gmail.com', 'username': 'jannaerin', 'name': 'jannaerin'}

Message: SERVER-34347 Create new batch when targeted writes batch includes same target with different shardVersion
Branch: master
https://github.com/mongodb/mongo/commit/04d40938faf18bc9158e783b28f6766881fd15f5
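
The commit message above says a new batch is created when a targeted-writes batch includes the same target with a different shardVersion. The Python sketch below illustrates that idea only, under stated assumptions: `Endpoint`, `split_into_batches`, and the `"IGNORED"` version string are hypothetical, and the actual change in the linked commit may be structured differently.

```python
from collections import namedtuple

# Hypothetical stand-in for the (shard, version) endpoint pair.
Endpoint = namedtuple("Endpoint", ["shard", "version"])


def split_into_batches(targeted_writes):
    """Split already-targeted writes into batches so that, within a batch,
    each shard appears under a single version only."""
    batches = []
    current = {}    # endpoint -> names of the writes sent to it
    versions = {}   # shard -> version used for that shard in the current batch
    for name, endpoints in targeted_writes:
        # A conflict means the shard is already in this batch with another version.
        conflict = any(
            versions.get(ep.shard, ep.version) != ep.version for ep in endpoints
        )
        if conflict:
            # Close the current batch and start a new one for this write.
            batches.append(current)
            current, versions = {}, {}
        for ep in endpoints:
            versions[ep.shard] = ep.version
            current.setdefault(ep, []).append(name)
    if current:
        batches.append(current)
    return batches


# shard0 is targeted once with its real version and once with the ignored one.
targeted_writes = [
    ("single-target update", [Endpoint("shard0", "v1")]),
    ("ranged update", [Endpoint("shard0", "IGNORED"), Endpoint("shard1", "IGNORED")]),
]
batches = split_into_batches(targeted_writes)
```

Here the two writes land in separate batches, so neither batch contains two entries for `shard0`.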

Comment by Randolph Tan [ 05/Apr/18 ]

This appears to affect only v3.6 and newer, since the invariant did not exist in older versions.

Comment by Randolph Tan [ 05/Apr/18 ]

Attached test.js, which demonstrates this issue.

Generated at Thu Feb 08 04:36:22 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.