Core Server / SERVER-62272

Adding schema validation to a collection can prevent chunk migrations of failing documents

    • Backwards Compatibility: Fully Compatible
    • Operating System: ALL
    • Backport Requested: v6.0, v5.0, v4.4, v4.2
    • Sprint: Sharding 2022-05-02, Sharding NYC 2022-05-16, Sharding NYC 2022-05-30, Sharding NYC 2022-06-13

      When schema validation is added to a collection and that collection is subsequently sharded, chunk migration will fail if the collection contains any documents that would fail validation.

      For example:

      # setup: install MongoDB 4.4.10 Enterprise via `m`, then use mlaunch
      # to start a 2-shard cluster of 3-node replica sets
      m 4.4.10-ent
      mlaunch init --replicaset --nodes 3 --shards 2 --binarypath $(m bin 4.4.10-ent)
      

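      mlaunch starts a single mongos on port 27017 by default, so the shell examples that follow assume a connection along these lines (the port is an assumption based on mlaunch defaults):

      # connect to the mongos
      mongo --port 27017
      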
      Using the above sharded cluster we will create a collection with some junk data, then add a validation rule under which no further documents of the same shape can be written:

      // generate a random alphanumeric string of the given length
      function junk(length) {
        var result           = '';
        var characters       = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789';
        var charactersLength = characters.length;
        for ( var i = 0; i < length; i++ ) {
           result += characters.charAt(Math.floor(Math.random() * charactersLength));
        }
        return result;
      }

      // insert 50 documents of ~250KB each, each with a numeric `i`
      var data = [];
      for (var i = 0; i < 50; i++) {
        data.push({ i: i, d: junk(250 * 1024) });
      }
      db.getSiblingDB("test").c3.insertMany(data);

      // add a validator that requires `i` to be a string, which every
      // existing document (numeric `i`) violates
      db.getSiblingDB("test").runCommand({
        collMod: "c3",
        validator: { $jsonSchema: {
           bsonType: "object",
           properties: {
              i: {
                 bsonType: "string",
                 description: "must be a string"
              }
           }
        } }
      })
      

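      Since every existing document has a numeric i, all 50 now violate the new validator. As a quick sanity check (a sketch that reuses the same schema in a query), failing documents can be counted with $nor and $jsonSchema:

      // count documents that do NOT match the validator's schema
      db.getSiblingDB("test").c3.find({
        $nor: [ { $jsonSchema: {
           bsonType: "object",
           properties: { i: { bsonType: "string" } }
        } } ]
      }).count();
      // expect 50: every document violates the schema
      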
      If we try to write another document it should fail:

      db.getSiblingDB("test").c3.insert({ i: 1, d: junk(128) });
      /* 
      WriteResult({
      	"nInserted" : 0,
      	"writeError" : {
      		"code" : 121,
      		"errmsg" : "Document failed validation"
      	}
      })
      */
      

      If the collection is then sharded, sh.status() will simply show that no chunks are being migrated:

      // ensure lots of chunks
      db.getSiblingDB("config").settings.save( { _id:"chunksize", value: 1 } );
      sh.enableSharding("test");
      sh.shardCollection("test.c3", { _id: 1 } )
      

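      The failed migrations can also be confirmed from the config database, where every chunk remains on the primary shard. A sketch (as of 4.4, config.chunks stores the namespace in its ns field):

      // count chunks per shard for test.c3; everything stays on shard01
      db.getSiblingDB("config").chunks.aggregate([
        { $match: { ns: "test.c3" } },
        { $group: { _id: "$shard", chunks: { $sum: 1 } } }
      ]);
      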
      Reviewing the PRIMARY log of the recipient shard (shard02) we can see the following:

      {"t":{"$date":"2021-12-24T14:46:38.501-05:00"},"s":"I",  "c":"MIGRATE",  "id":22000,   "ctx":"migrateThread","msg":"Starting receiving end of chunk migration","attr":{"chunkMin":{"_id":{"$minKey":1}},"chunkMax":{"_id":{"$oid":"61c6233d8715b5d2ac06168a"}},"namespace":"test.c3","fromShard":"shard01","epoch":{"$oid":"61c623657acfa1a7fd9bdaf7"},"sessionId":"shard01_shard02_61c6239e7acfa1a7fd9bdf60","migrationId":{"uuid":{"$uuid":"83e10e76-bd8d-4e65-8f5f-63755b6b8f49"}}}}
      {"t":{"$date":"2021-12-24T14:46:38.575-05:00"},"s":"I",  "c":"MIGRATE",  "id":21999,   "ctx":"chunkInserter","msg":"Batch insertion failed","attr":{"error":"DocumentValidationFailure: Insert of { _id: ObjectId('61c6233d8715b5d2ac061688'), i: 0.0, d: \"w0HKLtlN64tsZfOs2OnSb0eBGMIhjkPwSja6Td3hp5hMIPueIAgKV336r7KRGrnPwsXW2iItnPK94ccitWPPFWF3ogSMznGOFqCwJVR6EGpOD9R6d79f7Ztdbsl7bSNRv9J29mfDLlXLYRUDa8OTLv...\" } failed. :: caused by :: Document failed validation"}}
      {"t":{"$date":"2021-12-24T14:46:38.592-05:00"},"s":"I",  "c":"SHARDING", "id":22080,   "ctx":"migrateThread","msg":"About to log metadata event","attr":{"namespace":"changelog","event":{"_id":"Alexs-MacBook-Pro.local:27021-2021-12-24T14:46:38.592-05:00-61c6239e1bda207efdb22431","server":"Alexs-MacBook-Pro.local:27021","shard":"shard02","clientAddr":"","time":{"$date":"2021-12-24T19:46:38.592Z"},"what":"moveChunk.to","ns":"test.c3","details":{"min":{"_id":{"$minKey":1}},"max":{"_id":{"$oid":"61c6233d8715b5d2ac06168a"}},"step 1 of 7":1,"step 2 of 7":0,"step 3 of 7":55,"to":"shard02","from":"shard01","note":"aborted"}}}}
      {"t":{"$date":"2021-12-24T14:46:38.614-05:00"},"s":"I",  "c":"MIGRATE",  "id":21998,   "ctx":"migrateThread","msg":"Error during migration","attr":{"error":"migrate failed: Location51008: _migrateClone failed:  :: caused by :: operation was interrupted"}}
      

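      The aborted receive does appear in config.changelog as a moveChunk.to event with note: "aborted" (the third log line above), so one way to surface these failures is:

      // list receiving-side migrations that were aborted
      db.getSiblingDB("config").changelog.find(
        { what: "moveChunk.to", "details.note": "aborted" }
      ).sort({ time: -1 });
      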
      This can be addressed by logging into each shard's PRIMARY and running collMod to set validationLevel: "off"; however, as these moveChunk.error events aren't logged in the changelog, it's not obvious what has failed without a deeper dive.
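
      For example (a sketch; in the mlaunch setup above shard02's PRIMARY happened to be listening on port 27021, as seen in the log), run the following against each shard's PRIMARY:

      // disable schema validation enforcement for the collection
      db.getSiblingDB("test").runCommand({
        collMod: "c3",
        validationLevel: "off"
      });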

            Assignee: Nandini Bhartiya (nandini.bhartiya@mongodb.com)
            Reporter: Alex Bevilacqua (alex.bevilacqua@mongodb.com)
            Votes: 0
            Watchers: 13
