[SERVER-26102] set shardVersion to IGNORED for system.indexes, but only if a shardVersion was sent in the command Created: 13/Sep/16  Updated: 19/Nov/16  Resolved: 04/Oct/16

Status: Closed
Project: Core Server
Component/s: Index Maintenance
Affects Version/s: 3.3.12
Fix Version/s: 3.4.0-rc1

Type: Bug Priority: Major - P3
Reporter: Robert Guo (Inactive) Assignee: Esha Maharishi (Inactive)
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Backwards Compatibility: Fully Compatible
Operating System: ALL
Sprint: Sharding 2016-10-10
Participants:
Linked BF Score: 0

 Description   

Did not reproduce on 3.3.11, and the failure was first seen on August 17, so the regression was probably introduced in mid-August.

d20000| 2016-09-13T18:01:39.641-0400 I ASIO     [NetworkInterfaceASIO-ShardRegistry-0] Successfully connected to my-macbookpro.local:20003
d20000| 2016-09-13T18:01:39.642-0400 I SHARDING [conn3] Marking collection as sharded with collection version: 1|0||57d87743d0faf3d986b61413, shard version: 1|0||57d87743d0faf3d986b61413
d20000| 2016-09-13T18:01:39.675-0400 I INDEX    [conn5] build index on: test.test properties: { v: 2, key: { x: 1.0 }, name: "x_1", ns: "test.test", background: true }
d20000| 2016-09-13T18:01:39.675-0400 I INDEX    [conn5] build index done.  scanned 0 total records. 0 secs
d20000| 2016-09-13T18:01:39.676-0400 I -        [conn5] Fatal Assertion 28819 at src/mongo/db/catalog/cursor_manager.cpp 331
d20000| 2016-09-13T18:01:39.676-0400 I -        [conn5] 
d20000| 
d20000| ***aborting after fassert() failure
d20000|  mongod(_ZN5mongo25fassertFailedWithLocationEiPKcj+0x281) [0x10c56e121]
d20000|  mongod(_ZN5mongo13CursorManager13invalidateAllEbRKNSt3__112basic_stringIcNS1_11char_traitsIcEENS1_9allocatorIcEEEE+0x602) [0x10bc33762]
d20000|  mongod(_ZN5mongo12IndexCatalog10_dropIndexEPNS_16OperationContextEPNS_17IndexCatalogEntryE+0x244) [0x10bc4aed4]
d20000|  mongod(_ZN5mongo12IndexCatalog15IndexBuildBlock4failEv+0x4E) [0x10bc4ac0e]
d20000|  mongod(_ZN5mongo15MultiIndexBlockD2Ev+0x2A9) [0x10bc516f9]
d20000|  mongod(_ZN5mongo14CmdCreateIndex3runEPNS_16OperationContextERKNSt3__112basic_stringIcNS3_11char_traitsIcEENS3_9allocatorIcEEEERNS_7BSONObjEiRS9_RNS_14BSONObjBuilderE+0x264E) [0x10bc80a3e]
d20000|  mongod(_ZN5mongo7Command3runEPNS_16OperationContextERKNS_3rpc16RequestInterfaceEPNS3_21ReplyBuilderInterfaceE+0x3F6) [0x10bc84656]
d20000|  mongod(_ZN5mongo14performInsertsEPNS_16OperationContextERKNS_8InsertOpE+0x1CF3) [0x10bf18e23]
d20000|  mongod(_ZN5mongo9CmdInsert7runImplEPNS_16OperationContextERKNSt3__112basic_stringIcNS3_11char_traitsIcEENS3_9allocatorIcEEEERKNS_7BSONObjERNS_14BSONObjBuilderE+0x4D) [0x10bd1194d]
d20000|  mongod(_ZN5mongo12_GLOBAL__N_112WriteCommand3runEPNS_16OperationContextERKNSt3__112basic_stringIcNS4_11char_traitsIcEENS4_9allocatorIcEEEERNS_7BSONObjEiRSA_RNS_14BSONObjBuilderE+0x20) [0x10bd10850]
d20000|  mongod(_ZN5mongo7Command3runEPNS_16OperationContextERKNS_3rpc16RequestInterfaceEPNS3_21ReplyBuilderInterfaceE+0x3F6) [0x10bc84656]
d20000|  mongod(_ZN5mongo7Command11execCommandEPNS_16OperationContextEPS0_RKNS_3rpc16RequestInterfaceEPNS4_21ReplyBuilderInterfaceE+0xB14) [0x10bc83814]
d20000|  mongod(_ZN5mongo11runCommandsEPNS_16OperationContextERKNS_3rpc16RequestInterfaceEPNS2_21ReplyBuilderInterfaceE+0x23C) [0x10c21929c]
d20000|  mongod(_ZN5mongo16assembleResponseEPNS_16OperationContextERNS_7MessageERNS_10DbResponseERKNS_11HostAndPortE+0x1496) [0x10be22bb6]
d20000|  mongod(_ZN5mongo23ServiceEntryPointMongod12_sessionLoopEPNS_9transport7SessionE+0x17A) [0x10bafa56a]
d20000|  mongod(_ZNSt3__110__function6__funcIZN5mongo23ServiceEntryPointMongod12startSessionEONS2_9transport7SessionEE3$_0NS_9allocatorIS7_EEFvPS5_EEclEOSA_+0x1B) [0x10bafadfb]
d20000|  mongod(_ZN5mongo12_GLOBAL__N_17runFuncEPv+0x298) [0x10c567ea8]
d20000|  mongod(_ZNSt3__114__thread_proxyINS_5tupleIJNS_6__bindIRFPvS3_EJPN5mongo12_GLOBAL__N_17ContextEEEEEEEEES3_S3_+0x61) [0x10c5684f1]
d20000|  libsystem_pthread.dylib(_pthread_body+0xB4) [0x7fffb5badabb]
d20000|  libsystem_pthread.dylib(_pthread_body+0x0) [0x7fffb5bada07]
d20000|  libsystem_pthread.dylib(thread_start+0xD) [0x7fffb5bad231]



 Comments   
Comment by Githook User [ 04/Oct/16 ]

Author: Esha Maharishi <esha.maharishi@mongodb.com>

Message: SERVER-26102 set shardVersion to IGNORED for system.indexes, but only if a shardVersion was sent in the command
Branch: master
https://github.com/mongodb/mongo/commit/e8345ff602fc528dbc33944b21bd344cbea5e3e5

Comment by Tess Avitabile (Inactive) [ 04/Oct/16 ]

In case it helps, I was able to see the shard version mismatch error produced here by surrounding the call to the op observer in a try-catch and printing the error message when I ran Robert's repro.
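
A minimal sketch of the debugging approach described in this comment, written as a toy Python model rather than the server's C++. Every name here (ShardVersionMismatch, op_observer_on_create_index, record_index_creation) is hypothetical and only illustrates wrapping the op observer call in a try/except so the otherwise hidden error message gets printed.

```python
class ShardVersionMismatch(Exception):
    """Hypothetical error standing in for the server's version-mismatch exception."""


def op_observer_on_create_index(received_major: int, expected_major: int = 0) -> None:
    """Toy op observer hook: raises when the received version is not the expected one."""
    if received_major != expected_major:
        raise ShardVersionMismatch(
            f"shard version mismatch: received {received_major}, expected {expected_major}"
        )


def record_index_creation(received_major: int) -> list:
    """Call the hook inside try/except and print any error it raises."""
    seen = []
    try:
        op_observer_on_create_index(received_major)
    except ShardVersionMismatch as exc:
        # Without this handler the exception would propagate and the
        # message would never be printed.
        print(f"op observer threw: {exc}")
        seen.append(str(exc))
    return seen
```

Calling record_index_creation(1) in this model prints the mismatch message, while record_index_creation(0) prints nothing.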

Comment by Tess Avitabile (Inactive) [ 03/Oct/16 ]

This failure was introduced by this block in this commit. In legacy write mode, mongos attaches a shard version when forwarding the createIndex command. When we then check the shard version while recording the index creation in the op observer, the check fails because we expected a shard version of zero. That in turn causes us to throw before we can finish cleaning up the MultiIndexBlock.

esha.maharishi, do you have thoughts on how to address this? Should the mongos not be attaching a shard version when forwarding the createIndex command in legacy write mode?
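
The failure mode above, and the approach named in the ticket title (set shardVersion to IGNORED for system.indexes, but only if a shardVersion was sent), can be sketched with a toy Python model. This is not the server's code: ShardVersion, IGNORED, UNSHARDED, effective_version, and version_check_passes are hypothetical names, and the rule that an unversioned request skips the check is an assumption of the model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ShardVersion:
    """Toy stand-in for a shard version; not the server's real type."""
    major: int = 0
    minor: int = 0
    ignored: bool = False


IGNORED = ShardVersion(ignored=True)   # hypothetical "skip the check" marker
UNSHARDED = ShardVersion(0, 0)         # the "shard version of zero" the shard expected


def effective_version(ns: str, sent: Optional[ShardVersion]) -> Optional[ShardVersion]:
    """Rewrite the version to IGNORED for system.indexes, but only if one was sent."""
    if sent is not None and ns.endswith(".system.indexes"):
        return IGNORED
    return sent


def version_check_passes(received: Optional[ShardVersion], expected: ShardVersion) -> bool:
    """Toy check: unversioned (assumed) and IGNORED requests pass; others must match."""
    if received is None or received.ignored:
        return True
    return received == expected
```

In this model, a request carrying version 1|0 fails the check against an expected version of zero, while the same request passes once effective_version has rewritten it for a system.indexes namespace. A request that carried no version stays unversioned.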
