Core Server / SERVER-9858

After a chunk migration, requests on secondaries return multiple objects with same _id

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Major - P3
    • None
    • Affects Version/s: 2.4.3
    • Component/s: Replication, Sharding
    • Environment:
      Ubuntu 12.10, 3 replica set shards, each replica set contains 3 machines
    • Linux

      We can reproduce this bug every time we try to retrieve a duplicated object, but we don't know how to reproduce the whole situation.

      After a chunk migration, some objects that have been moved to another shard are still present on the old one.
      When we query these objects from a mongo shell, reading from the primaries, everything seems fine, but when we read from the secondaries, these objects appear duplicated.

      Shell behavior:

      mongos> db.ig_medias.find({ mid: "421587931535525217_30569368" }, { mid: 1, "cnt.lke": 1 })
      { "_id" : ObjectId("51939a0041002e238ff315d2"), "mid" : "421587931535525217_30569368", "cnt" : { "lke" : 12 } }
      
      mongos> db.ig_medias.find({ mid: "421587931535525217_30569368" }, { mid: 1, "cnt.lke": 1 }).readPref('secondary')
      { "_id" : ObjectId("51939a0041002e238ff315d2"), "mid" : "421587931535525217_30569368", "cnt" : { "lke" : 12 } }
      { "_id" : ObjectId("51939a0041002e238ff315d2"), "mid" : "421587931535525217_30569368", "cnt" : { "lke" : 12 } }
      

      As you can see, there are two objects that are exactly the same when reading from secondaries.
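
A duplicate like this can also be detected programmatically by grouping a result set by _id. The following is only a minimal sketch: the helper name duplicatedIds is hypothetical, and ObjectIds are written as plain hex strings so that the snippet runs outside the mongo shell.

```javascript
// Return the _id values that occur more than once in a result set.
// Assumption: _id values are comparable after String() conversion.
function duplicatedIds(docs) {
  var counts = {};
  docs.forEach(function (doc) {
    var key = String(doc._id);
    counts[key] = (counts[key] || 0) + 1;
  });
  return Object.keys(counts).filter(function (key) {
    return counts[key] > 1;
  });
}

// The document returned by the two queries above.
var doc = {
  _id: "51939a0041002e238ff315d2",
  mid: "421587931535525217_30569368",
  cnt: { lke: 12 }
};

var fromPrimaries = [doc];        // one result when reading from primaries
var fromSecondaries = [doc, doc]; // two results when reading from secondaries

var dupPrimary = duplicatedIds(fromPrimaries);     // []
var dupSecondary = duplicatedIds(fromSecondaries); // ["51939a0041002e238ff315d2"]
```
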
      Here are the same queries with explain():

      mongos> db.ig_medias.find({ mid: "421587931535525217_30569368" }, { mid: 1, "cnt.lke": 1 }).explain()
      {
      	"clusteredType" : "ParallelSort",
      	"shards" : {
      		"str1/str1-1.nitrogr.am:27017,str1-2.nitrogr.am:27017,str1-3.nitrogr.am:27017" : [
      			{
      				"cursor" : "BtreeCursor mid_1",
      				"isMultiKey" : false,
      				"n" : 0,
      				"nscannedObjects" : 0,
      				"nscanned" : 0,
      				"nscannedObjectsAllPlans" : 0,
      				"nscannedAllPlans" : 0,
      				"scanAndOrder" : false,
      				"indexOnly" : false,
      				"nYields" : 0,
      				"nChunkSkips" : 0,
      				"millis" : 0,
      				"indexBounds" : {
      					"mid" : [
      						[
      							"421587931535525217_30569368",
      							"421587931535525217_30569368"
      						]
      					]
      				},
      				"server" : "str1-1.nitrogr.am:27017"
      			}
      		],
      		"str2/str2-1.nitrogr.am:27017,str2-2.nitrogr.am:27017,str2-3.nitrogr.am:27017" : [
      			{
      				"cursor" : "BtreeCursor mid_1",
      				"isMultiKey" : false,
      				"n" : 0,
      				"nscannedObjects" : 1,
      				"nscanned" : 1,
      				"nscannedObjectsAllPlans" : 1,
      				"nscannedAllPlans" : 1,
      				"scanAndOrder" : false,
      				"indexOnly" : false,
      				"nYields" : 0,
      				"nChunkSkips" : 1,
      				"millis" : 0,
      				"indexBounds" : {
      					"mid" : [
      						[
      							"421587931535525217_30569368",
      							"421587931535525217_30569368"
      						]
      					]
      				},
      				"server" : "str2-3.nitrogr.am:27017"
      			}
      		],
      		"str3/str3-1.nitrogr.am:27017,str3-2.nitrogr.am:27017,str3-3.nitrogr.am:27017" : [
      			{
      				"cursor" : "BtreeCursor mid_1",
      				"isMultiKey" : false,
      				"n" : 1,
      				"nscannedObjects" : 1,
      				"nscanned" : 1,
      				"nscannedObjectsAllPlans" : 1,
      				"nscannedAllPlans" : 1,
      				"scanAndOrder" : false,
      				"indexOnly" : false,
      				"nYields" : 0,
      				"nChunkSkips" : 0,
      				"millis" : 0,
      				"indexBounds" : {
      					"mid" : [
      						[
      							"421587931535525217_30569368",
      							"421587931535525217_30569368"
      						]
      					]
      				},
      				"server" : "str3-2.nitrogr.am:27017"
      			}
      		]
      	},
      	"cursor" : "BtreeCursor mid_1",
      	"n" : 1,
      	"nChunkSkips" : 1,
      	"nYields" : 0,
      	"nscanned" : 2,
      	"nscannedAllPlans" : 2,
      	"nscannedObjects" : 2,
      	"nscannedObjectsAllPlans" : 2,
      	"millisShardTotal" : 0,
      	"millisShardAvg" : 0,
      	"numQueries" : 3,
      	"numShards" : 3,
      	"millis" : 3
      }
      
      mongos> db.ig_medias.find({ mid: "421587931535525217_30569368" }, { mid: 1, "cnt.lke": 1 }).readPref('secondary').explain()
      {
      	"clusteredType" : "ParallelSort",
      	"shards" : {
      		"str1/str1-1.nitrogr.am:27017,str1-2.nitrogr.am:27017,str1-3.nitrogr.am:27017" : [
      			{
      				"cursor" : "BtreeCursor mid_1",
      				"isMultiKey" : false,
      				"n" : 0,
      				"nscannedObjects" : 0,
      				"nscanned" : 0,
      				"nscannedObjectsAllPlans" : 0,
      				"nscannedAllPlans" : 0,
      				"scanAndOrder" : false,
      				"indexOnly" : false,
      				"nYields" : 0,
      				"nChunkSkips" : 0,
      				"millis" : 0,
      				"indexBounds" : {
      					"mid" : [
      						[
      							"421587931535525217_30569368",
      							"421587931535525217_30569368"
      						]
      					]
      				},
      				"server" : "str1-3.nitrogr.am:27017"
      			}
      		],
      		"str2/str2-1.nitrogr.am:27017,str2-2.nitrogr.am:27017,str2-3.nitrogr.am:27017" : [
      			{
      				"cursor" : "BtreeCursor mid_1",
      				"isMultiKey" : false,
      				"n" : 1,
      				"nscannedObjects" : 1,
      				"nscanned" : 1,
      				"nscannedObjectsAllPlans" : 1,
      				"nscannedAllPlans" : 1,
      				"scanAndOrder" : false,
      				"indexOnly" : false,
      				"nYields" : 0,
      				"nChunkSkips" : 0,
      				"millis" : 0,
      				"indexBounds" : {
      					"mid" : [
      						[
      							"421587931535525217_30569368",
      							"421587931535525217_30569368"
      						]
      					]
      				},
      				"server" : "str2-1.nitrogr.am:27017"
      			}
      		],
      		"str3/str3-1.nitrogr.am:27017,str3-2.nitrogr.am:27017,str3-3.nitrogr.am:27017" : [
      			{
      				"cursor" : "BtreeCursor mid_1",
      				"isMultiKey" : false,
      				"n" : 1,
      				"nscannedObjects" : 1,
      				"nscanned" : 1,
      				"nscannedObjectsAllPlans" : 1,
      				"nscannedAllPlans" : 1,
      				"scanAndOrder" : false,
      				"indexOnly" : false,
      				"nYields" : 0,
      				"nChunkSkips" : 0,
      				"millis" : 0,
      				"indexBounds" : {
      					"mid" : [
      						[
      							"421587931535525217_30569368",
      							"421587931535525217_30569368"
      						]
      					]
      				},
      				"server" : "str3-1b.nitrogr.am:27017"
      			}
      		]
      	},
      	"cursor" : "BtreeCursor mid_1",
      	"n" : 2,
      	"nChunkSkips" : 0,
      	"nYields" : 0,
      	"nscanned" : 2,
      	"nscannedAllPlans" : 2,
      	"nscannedObjects" : 2,
      	"nscannedObjectsAllPlans" : 2,
      	"millisShardTotal" : 0,
      	"millisShardAvg" : 0,
      	"numQueries" : 3,
      	"numShards" : 3,
      	"millis" : 12
      }
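
The difference between the two explain() outputs is easier to see when they are reduced to one row per shard. The sketch below does that for trimmed copies of the outputs above (shard keys shortened to the first host, only the fields used here). The function names summarizeShards and totalReturned are hypothetical helpers, not part of the mongo shell.

```javascript
// Reduce a sharded explain() document (the "shards" layout shown above)
// to one row per shard with the fields relevant to this report.
function summarizeShards(explain) {
  var rows = [];
  Object.keys(explain.shards).forEach(function (shardKey) {
    explain.shards[shardKey].forEach(function (plan) {
      rows.push({
        shard: shardKey.split("/")[0], // "str2/host:port" -> "str2"
        server: plan.server,
        n: plan.n,
        nscanned: plan.nscanned,
        nChunkSkips: plan.nChunkSkips
      });
    });
  });
  return rows;
}

// Sum the number of documents each shard returned.
function totalReturned(rows) {
  return rows.reduce(function (sum, row) { return sum + row.n; }, 0);
}

// Trimmed copy of the explain() output when reading from primaries.
var primaryExplain = { shards: {
  "str1/str1-1.nitrogr.am:27017": [{ server: "str1-1.nitrogr.am:27017", n: 0, nscanned: 0, nChunkSkips: 0 }],
  "str2/str2-1.nitrogr.am:27017": [{ server: "str2-3.nitrogr.am:27017", n: 0, nscanned: 1, nChunkSkips: 1 }],
  "str3/str3-1.nitrogr.am:27017": [{ server: "str3-2.nitrogr.am:27017", n: 1, nscanned: 1, nChunkSkips: 0 }]
} };

// Trimmed copy of the explain() output when reading from secondaries.
var secondaryExplain = { shards: {
  "str1/str1-1.nitrogr.am:27017": [{ server: "str1-3.nitrogr.am:27017", n: 0, nscanned: 0, nChunkSkips: 0 }],
  "str2/str2-1.nitrogr.am:27017": [{ server: "str2-1.nitrogr.am:27017", n: 1, nscanned: 1, nChunkSkips: 0 }],
  "str3/str3-1.nitrogr.am:27017": [{ server: "str3-1b.nitrogr.am:27017", n: 1, nscanned: 1, nChunkSkips: 0 }]
} };

var primaryRows = summarizeShards(primaryExplain);
var secondaryRows = summarizeShards(secondaryExplain);
var primaryTotal = totalReturned(primaryRows);     // 1
var secondaryTotal = totalReturned(secondaryRows); // 2
```

In both reads str2 scans one document. The primary skips it (nChunkSkips: 1, n: 0), while the secondary returns it (nChunkSkips: 0, n: 1). One possible reading, which the report itself does not state, is that the secondary does not filter out documents belonging to chunks the shard no longer owns.
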
      

      But there is another interesting thing: according to the previous queries, the object was on str2 before the chunk migration and has been moved to str3. In the first explain() output, the str2 primary scans one document but returns none (n: 0, nChunkSkips: 1), so the object appears to be gone from str2:PRIMARY. But here are the results of the same query run directly on each member of the str2 replica set:

      // str2-1 SECONDARY
      str2:SECONDARY> rs.slaveOk()
      str2:SECONDARY> db.ig_medias.find({ mid: "421587931535525217_30569368" }, { mid: 1, "cnt.lke": 1 })
      { "_id" : ObjectId("51939a0041002e238ff315d2"), "mid" : "421587931535525217_30569368", "cnt" : { "lke" : 12 } }
      
      // str2-2 SECONDARY
      str2:SECONDARY> rs.slaveOk()
      str2:SECONDARY> db.ig_medias.find({ mid: "421587931535525217_30569368" }, { mid: 1, "cnt.lke": 1 })
      { "_id" : ObjectId("51939a0041002e238ff315d2"), "mid" : "421587931535525217_30569368", "cnt" : { "lke" : 12 } }
      
      // str2-3 PRIMARY
      str2:PRIMARY> db.ig_medias.find({ mid: "421587931535525217_30569368" }, { mid: 1, "cnt.lke": 1 })
      { "_id" : ObjectId("51939a0041002e238ff315d2"), "mid" : "421587931535525217_30569368", "cnt" : { "lke" : 12 } }
      

      The slt (shard key) value of the previous object is 393022. According to sh.status(true), it falls in this chunk:

      { "slt" : 392653 } -->> { "slt" : 393077 } on : str3 { "t" : 530, "i" : 0 } 
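
As a quick check of the arithmetic: a chunk covers shard key values from its lower bound (inclusive) up to its upper bound (exclusive), so 393022 belongs to the chunk listed above. A minimal sketch, in which the helper chunkContains and the object layout are illustrative assumptions rather than a MongoDB API:

```javascript
// True if the document's shard key value lies in [chunk.min, chunk.max).
// Assumption: a single-field, numeric shard key, as with "slt" in this report.
function chunkContains(chunk, doc, keyField) {
  var value = doc[keyField];
  return value >= chunk.min[keyField] && value < chunk.max[keyField];
}

// The chunk reported by sh.status(true), owned by str3.
var chunk = { min: { slt: 392653 }, max: { slt: 393077 }, shard: "str3" };

var inChunk = chunkContains(chunk, { slt: 393022 }, "slt");    // true
var atUpperBound = chunkContains(chunk, { slt: 393077 }, "slt"); // false (upper bound is exclusive)
```
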
      

      When we try to retrieve all the objects from this chunk:

      mongos> db.ig_medias.find({ slt: {$gte: 392653, $lt: 393077}}, { mid: 1, "cnt.lke": 1 }).readPref('primary').explain()
      {
      	"clusteredType" : "ParallelSort",
      	"shards" : {
      		"str3/str3-1.nitrogr.am:27017,str3-2.nitrogr.am:27017,str3-3.nitrogr.am:27017" : [
      			{
      				"cursor" : "BtreeCursor slt_1",
      				"isMultiKey" : false,
      				"n" : 43791,
      				"nscannedObjects" : 43791,
      				"nscanned" : 43791,
      				"nscannedObjectsAllPlans" : 43791,
      				"nscannedAllPlans" : 43791,
      				"scanAndOrder" : false,
      				"indexOnly" : false,
      				"nYields" : 59,
      				"nChunkSkips" : 0,
      				"millis" : 235,
      				"indexBounds" : {
      					"slt" : [
      						[
      							392653,
      							393077
      						]
      					]
      				},
      				"server" : "str3-2.nitrogr.am:27017"
      			}
      		]
      	},
      	"cursor" : "BtreeCursor slt_1",
      	"n" : 43791,
      	"nChunkSkips" : 0,
      	"nYields" : 59,
      	"nscanned" : 43791,
      	"nscannedAllPlans" : 43791,
      	"nscannedObjects" : 43791,
      	"nscannedObjectsAllPlans" : 43791,
      	"millisShardTotal" : 235,
      	"millisShardAvg" : 235,
      	"numQueries" : 1,
      	"numShards" : 1,
      	"indexBounds" : {
      		"slt" : [
      			[
      				392653,
      				393077
      			]
      		]
      	},
      	"millis" : 254
      }
      mongos> db.ig_medias.find({ slt: {$gte: 392653, $lt: 393077}}, { mid: 1, "cnt.lke": 1 }).readPref('secondary').explain()
      {
      	"clusteredType" : "ParallelSort",
      	"shards" : {
      		"str3/str3-1.nitrogr.am:27017,str3-2.nitrogr.am:27017,str3-3.nitrogr.am:27017" : [
      			{
      				"cursor" : "BtreeCursor slt_1",
      				"isMultiKey" : false,
      				"n" : 43791,
      				"nscannedObjects" : 43791,
      				"nscanned" : 43791,
      				"nscannedObjectsAllPlans" : 43791,
      				"nscannedAllPlans" : 43791,
      				"scanAndOrder" : false,
      				"indexOnly" : false,
      				"nYields" : 0,
      				"nChunkSkips" : 0,
      				"millis" : 65,
      				"indexBounds" : {
      					"slt" : [
      						[
      							392653,
      							393077
      						]
      					]
      				},
      				"server" : "str3-3.nitrogr.am:27017"
      			}
      		]
      	},
      	"cursor" : "BtreeCursor slt_1",
      	"n" : 43791,
      	"nChunkSkips" : 0,
      	"nYields" : 0,
      	"nscanned" : 43791,
      	"nscannedAllPlans" : 43791,
      	"nscannedObjects" : 43791,
      	"nscannedObjectsAllPlans" : 43791,
      	"millisShardTotal" : 65,
      	"millisShardAvg" : 65,
      	"numQueries" : 1,
      	"numShards" : 1,
      	"indexBounds" : {
      		"slt" : [
      			[
      				392653,
      				393077
      			]
      		]
      	},
      	"millis" : 102
      }
      

      Everything looks fine, but here are the results of the same query run directly on the str2 machines:

      // str2-1 SECONDARY
      str2:SECONDARY> db.ig_medias.find({ slt: {$gte: 392653, $lt: 393077}}, { mid: 1, "cnt.lke": 1 })
      { "_id" : ObjectId("5121d413710ea50b420debbe"), "mid" : "347964548473999677_9543635", "cnt" : { "lke" : 29 } }
      { "_id" : ObjectId("512393f541002e238ff4a859"), "mid" : "390217151503894124_28517528", "cnt" : { "lke" : 38 } }
      { "_id" : ObjectId("5131378341002e238f32144a"), "mid" : "383281284459672776_39604492", "cnt" : { "lke" : 24 } }
      { "_id" : ObjectId("5131fecc41002e238f3b943b"), "mid" : "238527467705375711_36829133", "cnt" : { "lke" : 38 } }
      { "_id" : ObjectId("512d19bc41002e238f07f548"), "mid" : "395567356503299509_279531956", "cnt" : { "lke" : 38 } }
      { "_id" : ObjectId("512c87de41002e238f0030d6"), "mid" : "369210183091623223_897180", "cnt" : { "lke" : 23 } }
      { "_id" : ObjectId("513a581d41002e238f94b99c"), "mid" : "407332724338286182_270796238", "cnt" : { "lke" : 10 } }
      { "_id" : ObjectId("512c396241002e238ffa969a"), "mid" : "398216305239938103_312516556", "cnt" : { "lke" : 18 } }
      { "_id" : ObjectId("515591ec41002e238f634c66"), "mid" : "422301314713088397_12787523", "cnt" : { "lke" : 13 } }
      { "_id" : ObjectId("512c614e41002e238ffd154a"), "mid" : "397802494119832758_313004538", "cnt" : { "lke" : 27 } }
      { "_id" : ObjectId("512e638541002e238f1974ad"), "mid" : "392182161891056449_306179277", "cnt" : { "lke" : 45 } }
      { "_id" : ObjectId("512f128741002e238f1dc17b"), "mid" : "401136119972715099_182925853", "cnt" : { "lke" : 22 } }
      { "_id" : ObjectId("5171d1b641002e238f630665"), "mid" : "190548751164070962_42200470", "cnt" : { "lke" : 6 } }
      { "_id" : ObjectId("513222e341002e238f421310"), "mid" : "364344438491857520_265458215", "cnt" : { "lke" : 12 } }
      { "_id" : ObjectId("5132b87241002e238f503297"), "mid" : "253952124281476138_47594148", "cnt" : { "lke" : 5 } }
      { "_id" : ObjectId("5132952041002e238f4c9046"), "mid" : "293163186055939480_174301408", "cnt" : { "lke" : 19 } }
      { "_id" : ObjectId("513336aa41002e238f55910a"), "mid" : "375606112677368687_199701635", "cnt" : { "lke" : 105 } }
      { "_id" : ObjectId("5136f27a41002e238f707851"), "mid" : "362900259412374239_4157811", "cnt" : { "lke" : 17 } }
      { "_id" : ObjectId("513c5cf241002e238fa0cb92"), "mid" : "351195940879395820_222487873", "cnt" : { "lke" : 46 } }
      { "_id" : ObjectId("513c97e641002e238fa22b02"), "mid" : "350248142066526553_245941897", "cnt" : { "lke" : 42 } }
      Type "it" for more
      str2:SECONDARY> db.ig_medias.find({ slt: {$gte: 392653, $lt: 393077}}, { mid: 1, "cnt.lke": 1 }).count()
      40347
      
      // str2-2 SECONDARY
      str2:SECONDARY> db.ig_medias.find({ slt: {$gte: 392653, $lt: 393077}}, { mid: 1, "cnt.lke": 1 })
      { "_id" : ObjectId("5121d413710ea50b420debbe"), "mid" : "347964548473999677_9543635", "cnt" : { "lke" : 29 } }
      { "_id" : ObjectId("512393f541002e238ff4a859"), "mid" : "390217151503894124_28517528", "cnt" : { "lke" : 38 } }
      { "_id" : ObjectId("5131fecc41002e238f3b943b"), "mid" : "238527467705375711_36829133", "cnt" : { "lke" : 38 } }
      { "_id" : ObjectId("512d19bc41002e238f07f548"), "mid" : "395567356503299509_279531956", "cnt" : { "lke" : 38 } }
      { "_id" : ObjectId("512c87de41002e238f0030d6"), "mid" : "369210183091623223_897180", "cnt" : { "lke" : 23 } }
      { "_id" : ObjectId("512e638541002e238f1974ad"), "mid" : "392182161891056449_306179277", "cnt" : { "lke" : 45 } }
      { "_id" : ObjectId("513a581d41002e238f94b99c"), "mid" : "407332724338286182_270796238", "cnt" : { "lke" : 10 } }
      { "_id" : ObjectId("512c396241002e238ffa969a"), "mid" : "398216305239938103_312516556", "cnt" : { "lke" : 18 } }
      { "_id" : ObjectId("512c614e41002e238ffd154a"), "mid" : "397802494119832758_313004538", "cnt" : { "lke" : 27 } }
      { "_id" : ObjectId("512f128741002e238f1dc17b"), "mid" : "401136119972715099_182925853", "cnt" : { "lke" : 22 } }
      { "_id" : ObjectId("5131378341002e238f32144a"), "mid" : "383281284459672776_39604492", "cnt" : { "lke" : 24 } }
      { "_id" : ObjectId("5171d1b641002e238f630665"), "mid" : "190548751164070962_42200470", "cnt" : { "lke" : 6 } }
      { "_id" : ObjectId("513222e341002e238f421310"), "mid" : "364344438491857520_265458215", "cnt" : { "lke" : 12 } }
      { "_id" : ObjectId("5132b87241002e238f503297"), "mid" : "253952124281476138_47594148", "cnt" : { "lke" : 5 } }
      { "_id" : ObjectId("5136f27a41002e238f707851"), "mid" : "362900259412374239_4157811", "cnt" : { "lke" : 17 } }
      { "_id" : ObjectId("5132952041002e238f4c9046"), "mid" : "293163186055939480_174301408", "cnt" : { "lke" : 19 } }
      { "_id" : ObjectId("513336aa41002e238f55910a"), "mid" : "375606112677368687_199701635", "cnt" : { "lke" : 105 } }
      { "_id" : ObjectId("513c5cf241002e238fa0cb92"), "mid" : "351195940879395820_222487873", "cnt" : { "lke" : 46 } }
      { "_id" : ObjectId("513c97e641002e238fa22b02"), "mid" : "350248142066526553_245941897", "cnt" : { "lke" : 42 } }
      { "_id" : ObjectId("513cea5241002e238fa32c6e"), "mid" : "349858663940795082_181939649", "cnt" : { "lke" : 54 } }
      Type "it" for more
      str2:SECONDARY> db.ig_medias.find({ slt: {$gte: 392653, $lt: 393077}}, { mid: 1, "cnt.lke": 1 }).count()
      40347
      
      // str2-3 PRIMARY
      str2:PRIMARY> db.ig_medias.find({ slt: {$gte: 392653, $lt: 393077}}, { mid: 1, "cnt.lke": 1 })
      { "_id" : ObjectId("5121d413710ea50b420debbe"), "mid" : "347964548473999677_9543635", "cnt" : { "lke" : 29 } }
      { "_id" : ObjectId("512393f541002e238ff4a859"), "mid" : "390217151503894124_28517528", "cnt" : { "lke" : 38 } }
      { "_id" : ObjectId("518a036141002e238fa5f264"), "mid" : "328357796146382089_13860576", "cnt" : { "lke" : 55 } }
      { "_id" : ObjectId("515f6e8841002e238fab2ac7"), "mid" : "326812285631217041_20784622", "cnt" : { "lke" : 6 } }
      { "_id" : ObjectId("513a581d41002e238f94b99c"), "mid" : "407332724338286182_270796238", "cnt" : { "lke" : 10 } }
      { "_id" : ObjectId("5131fecc41002e238f3b943b"), "mid" : "238527467705375711_36829133", "cnt" : { "lke" : 38 } }
      { "_id" : ObjectId("512d19bc41002e238f07f548"), "mid" : "395567356503299509_279531956", "cnt" : { "lke" : 38 } }
      { "_id" : ObjectId("512c87de41002e238f0030d6"), "mid" : "369210183091623223_897180", "cnt" : { "lke" : 23 } }
      { "_id" : ObjectId("512e638541002e238f1974ad"), "mid" : "392182161891056449_306179277", "cnt" : { "lke" : 45 } }
      { "_id" : ObjectId("512c396241002e238ffa969a"), "mid" : "398216305239938103_312516556", "cnt" : { "lke" : 18 } }
      { "_id" : ObjectId("512c614e41002e238ffd154a"), "mid" : "397802494119832758_313004538", "cnt" : { "lke" : 27 } }
      { "_id" : ObjectId("517afcac880084345c36eb36"), "mid" : "442869895293265303_32677776", "cnt" : { "lke" : 3 } }
      { "_id" : ObjectId("512f128741002e238f1dc17b"), "mid" : "401136119972715099_182925853", "cnt" : { "lke" : 22 } }
      { "_id" : ObjectId("5131378341002e238f32144a"), "mid" : "383281284459672776_39604492", "cnt" : { "lke" : 24 } }
      { "_id" : ObjectId("513222e341002e238f421310"), "mid" : "364344438491857520_265458215", "cnt" : { "lke" : 12 } }
      { "_id" : ObjectId("5132b87241002e238f503297"), "mid" : "253952124281476138_47594148", "cnt" : { "lke" : 5 } }
      { "_id" : ObjectId("513336aa41002e238f55910a"), "mid" : "375606112677368687_199701635", "cnt" : { "lke" : 105 } }
      { "_id" : ObjectId("5132952041002e238f4c9046"), "mid" : "293163186055939480_174301408", "cnt" : { "lke" : 19 } }
      { "_id" : ObjectId("5136f27a41002e238f707851"), "mid" : "362900259412374239_4157811", "cnt" : { "lke" : 17 } }
      { "_id" : ObjectId("513c5cf241002e238fa0cb92"), "mid" : "351195940879395820_222487873", "cnt" : { "lke" : 46 } }
      Type "it" for more
      str2:PRIMARY> db.ig_medias.find({ slt: {$gte: 392653, $lt: 393077}}, { mid: 1, "cnt.lke": 1 }).count()
      40347
      

      After checking all chunks, we found that 56 of the 1196 chunks contain duplicates.
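
The report does not say how the chunks were checked. One way such a check could be structured is sketched below: for every chunk, count the documents in its range on each shard that does not own it. Everything here is an assumption for illustration: the function findLeftovers, the countInRange callback, and the demo counter. In a mongo shell, the callback could run a count() over the chunk's range on a direct connection to each shard's members.

```javascript
// For each chunk, ask every non-owning shard how many documents it still
// holds in the chunk's range, and collect the non-zero answers.
// countInRange(shardName, chunk) is a hypothetical callback supplied by the caller.
function findLeftovers(chunks, shardNames, countInRange) {
  var flagged = [];
  chunks.forEach(function (chunk) {
    shardNames.forEach(function (shardName) {
      if (shardName === chunk.shard) return; // the owner is expected to hold the data
      var count = countInRange(shardName, chunk);
      if (count > 0) {
        flagged.push({ min: chunk.min, max: chunk.max, owner: chunk.shard,
                       leftoverOn: shardName, count: count });
      }
    });
  });
  return flagged;
}

// Demo with the chunk from this report. The counter is a stand-in:
// 40347 is the count observed on the str2 members above; 0 for str1 is assumed.
var chunks = [{ min: { slt: 392653 }, max: { slt: 393077 }, shard: "str3" }];
var shardNames = ["str1", "str2", "str3"];
function demoCount(shardName, chunk) {
  return shardName === "str2" ? 40347 : 0;
}

var leftovers = findLeftovers(chunks, shardNames, demoCount);
// leftovers has one entry: the str3-owned chunk with 40347 documents left on str2

// Share of affected chunks reported in this ticket, as a percentage.
var affectedPercent = (56 / 1196) * 100; // about 4.7
```
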

            Assignee: Unassigned
            Reporter: Anthony MOI (n1t0)
            Votes: 1
            Watchers: 5
