- Type: Bug
- Resolution: Fixed
- Priority: Major - P3
- Affects Version/s: 5.1.0, 4.4.10
- Component/s: Sharding
- Fully Compatible
- ALL
- v6.0, v5.0, v4.4, v4.2
- Sharding 2022-05-02, Sharding NYC 2022-05-16, Sharding NYC 2022-05-30, Sharding NYC 2022-06-13
- (copied to CRM)
When schema validation is added to a collection and that collection is then sharded, chunk migration will fail if the collection contains pre-existing documents that do not pass validation.
For example:
# setup
m 4.4.10-ent
mlaunch init --replicaset --nodes 3 --shards 2 --binarypath $(m bin 4.4.10-ent)
Using the above sharded cluster, we create a collection with some junk data, then add a validation rule so that no further documents with the same shape can be written:
function junk(length) {
  var result = '';
  var characters = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789';
  var charactersLength = characters.length;
  for (var i = 0; i < length; i++) {
    result += characters.charAt(Math.floor(Math.random() * charactersLength));
  }
  return result;
}

var data = [];
for (var i = 0; i < 50; i++) {
  data.push({ i: i, d: junk(250 * 1024) });
}
db.getSiblingDB("test").c3.insertMany(data);

db.getSiblingDB("test").runCommand({
  collMod: "c3",
  validator: {
    $jsonSchema: {
      bsonType: "object",
      properties: {
        i: { bsonType: "string", description: "must be a string" }
      }
    }
  }
})
If we try to write another document, it fails as expected:
db.getSiblingDB("test").c3.insert({ i: 1, d: junk(128) });
/*
WriteResult({
  "nInserted" : 0,
  "writeError" : {
    "code" : 121,
    "errmsg" : "Document failed validation"
  }
})
*/
If the collection is then sharded, sh.status() simply shows that no chunks are being migrated:
// ensure lots of chunks
db.getSiblingDB("config").settings.save( { _id: "chunksize", value: 1 } );
sh.enableSharding("test");
sh.shardCollection("test.c3", { _id: 1 } )
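As a sanity check (not part of the original repro), the chunk distribution can be inspected directly through mongos; assuming the 4.4-era config.chunks schema (where chunk documents carry an ns field), something like the following shows every chunk still sitting on the donor shard:
// Assumed check via mongos; config.chunks is keyed by ns on 4.4 (by uuid on 5.0+)
db.getSiblingDB("config").chunks.aggregate([
  { $match: { ns: "test.c3" } },
  { $group: { _id: "$shard", chunks: { $sum: 1 } } }
])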
Reviewing the recipient shard's PRIMARY log, we can see the following:
{"t":{"$date":"2021-12-24T14:46:38.501-05:00"},"s":"I", "c":"MIGRATE", "id":22000, "ctx":"migrateThread","msg":"Starting receiving end of chunk migration","attr":{"chunkMin":{"_id":{"$minKey":1}},"chunkMax":{"_id":{"$oid":"61c6233d8715b5d2ac06168a"}},"namespace":"test.c3","fromShard":"shard01","epoch":{"$oid":"61c623657acfa1a7fd9bdaf7"},"sessionId":"shard01_shard02_61c6239e7acfa1a7fd9bdf60","migrationId":{"uuid":{"$uuid":"83e10e76-bd8d-4e65-8f5f-63755b6b8f49"}}}} {"t":{"$date":"2021-12-24T14:46:38.575-05:00"},"s":"I", "c":"MIGRATE", "id":21999, "ctx":"chunkInserter","msg":"Batch insertion failed","attr":{"error":"DocumentValidationFailure: Insert of { _id: ObjectId('61c6233d8715b5d2ac061688'), i: 0.0, d: \"w0HKLtlN64tsZfOs2OnSb0eBGMIhjkPwSja6Td3hp5hMIPueIAgKV336r7KRGrnPwsXW2iItnPK94ccitWPPFWF3ogSMznGOFqCwJVR6EGpOD9R6d79f7Ztdbsl7bSNRv9J29mfDLlXLYRUDa8OTLv...\" } failed. :: caused by :: Document failed validation"}} {"t":{"$date":"2021-12-24T14:46:38.592-05:00"},"s":"I", "c":"SHARDING", "id":22080, "ctx":"migrateThread","msg":"About to log metadata event","attr":{"namespace":"changelog","event":{"_id":"Alexs-MacBook-Pro.local:27021-2021-12-24T14:46:38.592-05:00-61c6239e1bda207efdb22431","server":"Alexs-MacBook-Pro.local:27021","shard":"shard02","clientAddr":"","time":{"$date":"2021-12-24T19:46:38.592Z"},"what":"moveChunk.to","ns":"test.c3","details":{"min":{"_id":{"$minKey":1}},"max":{"_id":{"$oid":"61c6233d8715b5d2ac06168a"}},"step 1 of 7":1,"step 2 of 7":0,"step 3 of 7":55,"to":"shard02","from":"shard01","note":"aborted"}}}} {"t":{"$date":"2021-12-24T14:46:38.614-05:00"},"s":"I", "c":"MIGRATE", "id":21998, "ctx":"migrateThread","msg":"Error during migration","attr":{"error":"migrate failed: Location51008: _migrateClone failed: :: caused by :: operation was interrupted"}}
This can be worked around by logging into each shard's PRIMARY and running collMod to set validationLevel: "off". However, because no moveChunk.error event is written to the changelog, it is not obvious what has failed without a deeper dive into the shard logs.
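For reference, a minimal sketch of that workaround, assuming the port layout produced by the mlaunch setup above (the actual PRIMARY may be any member of each shard's replica set):
// Run against each shard's PRIMARY directly, e.g. mongo --port 27018 for shard01
// and mongo --port 27021 for shard02 in this deployment
db.getSiblingDB("test").runCommand({
  collMod: "c3",
  validationLevel: "off"  // chunk migration inserts are no longer rejected by the validator
})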
- related to: SERVER-62310 collMod command not sent to all shards for a sharded collection if no chunks have been received (Closed)