[SERVER-60708] db.collection.validate() returns `nInvalidDocuments: 0` when invalid documents are present Created: 14/Oct/21  Updated: 29/Oct/23  Resolved: 31/Mar/22

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: 5.0.2, 5.0.3
Fix Version/s: 6.0.0-rc0

Type: Bug Priority: Major - P3
Reporter: Elliot Metsger Assignee: Gregory Wlodarek
Resolution: Fixed Votes: 0
Labels: validate, validation
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Documented
is documented by DOCS-15206 Investigate changes in SERVER-60708: ... Closed
Related
related to SERVER-53635 Add document schema validation to val... Closed
is related to SERVER-65078 'validate' to report non-compliant do... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Steps To Reproduce:

use test
db.runCommand( { "drop": "testValidation" } )
db.createCollection("testValidation", {
    "validationLevel": "strict",
    "validationAction": "warn",
    "validator": { "$jsonSchema": {
                                  "required": [
                                    "foo"
                                  ],
                                  "properties": {
                                    "foo": {
                                      "bsonType": "string"
                                    }
                                  }
                                }
    }
})
db.testValidation.insertMany([
    { "foo": "bar" },
    { "quux": "fizz" }
])
db.testValidation.validate() // logs a warning but does not include it in validation response
db.runCommand( { "collMod": "testValidation", "validationAction": "error" } ) 
db.testValidation.validate() // logs a warning, includes warning in validation response, but nInvalidDocuments=0

Sprint: Execution Team 2022-04-04
Participants:

 Description   

If I create a collection with validationLevel: strict and validationAction: warn, and then run db.collection.validate() on it while it contains invalid documents, I would expect nInvalidDocuments and the warnings array to be populated.

However, running `db.collection.validate()` on a collection with invalid documents returns unexpected results:

  • nInvalidDocuments is 0
  • no messages are present in the warnings[] or errors[] arrays

The server logs do record the errors, but they are not reflected in the output of db.collection.validate().

If I use validationAction: error, then execute `db.collection.validate()`, there is a warning present in warnings[] but nInvalidDocuments is still 0.



 Comments   
Comment by Githook User [ 31/Mar/22 ]

Author: Gregory Wlodarek <gregory.wlodarek@mongodb.com> (GWlodarek)

Message: SERVER-60708 Improve collection validation response for document schema validation
Branch: master
https://github.com/mongodb/mongo/commit/307f3716159ff776f5d93250743620fd62d0ca08

Comment by Elliot Metsger [ 15/Oct/21 ]

> I do think that there is legitimate confusion around what nInvalidDocuments means as a field name now, and that's something we should consider fixing. At a minimum if we decide against changes, we'll document this more clearly.

+1

> For the broader question of how to identify invalid documents programmatically, I'd like to encourage you to make that feature request at https://feedback.mongodb.com/. It is not currently straightforward to do that without access to the logs. For this broader question, I'd like to request you ask our community for help by posting on the MongoDB Developer Community Forums.

Sounds good! I'll raise the issue there, thanks.

Comment by Eric Sedor [ 14/Oct/21 ]

I do think that there is legitimate confusion around what nInvalidDocuments means as a field name now, and that's something we should consider fixing. At a minimum if we decide against changes, we'll document this more clearly.

For the broader question of how to identify invalid documents programmatically, I'd like to encourage you to make that feature request at https://feedback.mongodb.com/. It is not currently straightforward to do that without access to the logs. For this broader question, I'd like to request you ask our community for help by posting on the MongoDB Developer Community Forums.

Comment by Elliot Metsger [ 14/Oct/21 ]

Hi Eric! Thanks for the prompt triage and issue reference. Ok, so my expectations regarding the response from db.collection.validate() were incorrect! No problem.

My follow-up question would be: when validationAction is "warn", how are schema validation failures exposed or reported by Mongo? For example, when validationAction is "error", the Go driver returns an `error` like the following when inserting an invalid doc:

[Document failed validation: {"failingDocumentId": {"$oid":"6168982760e28260aa721c7f"},"details": {"operatorName": "$jsonSchema","schemaRulesNotSatisfied": [{"operatorName": "required","specifiedAs": {"required": ["foo"]},"missingProperties": ["foo"]}]}}]

When validationAction is "warn", no error is returned by Insert operations (and that's OK with me); how would I go about identifying the invalid documents (in a programmatic fashion, e.g. using a driver)? If db.collection.validate() were updated to accommodate and report back on invalid documents, that'd be fantastic. Or perhaps return a cursor over the invalid documents if the size of the result set is a concern.

Thanks for the consideration and the quick response, much appreciated!
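
Editorial note: one possible way to approach the question above is to query the collection with the negation of its own schema, combining the `$nor` and `$jsonSchema` query operators. This is a minimal sketch, not something proposed in this ticket. The helper name `nonCompliantFilter` is hypothetical, and the schema is copied by hand from the Steps To Reproduce section.

```javascript
// Schema copied from the validator in "Steps To Reproduce".
const schema = {
  required: ["foo"],
  properties: { foo: { bsonType: "string" } },
};

// Hypothetical helper (not part of any MongoDB API): builds a query filter
// that matches documents which do NOT satisfy the given JSON schema.
function nonCompliantFilter(jsonSchema) {
  return { $nor: [{ $jsonSchema: jsonSchema }] };
}

const filter = nonCompliantFilter(schema);
console.log(JSON.stringify(filter));

// Usage in mongosh against the collection from the reproduction steps:
//   db.testValidation.find(filter)
//
// Assumption: instead of copying the schema by hand, the stored validator
// could be read back from the collection options, for example:
//   const validator = db.getCollectionInfos({ name: "testValidation" })[0].options.validator
//   db.testValidation.find({ $nor: [validator] })
```

With the two documents inserted in the reproduction steps, this query would be expected to return the `{ "quux": "fizz" }` document, since it lacks the required `foo` field. The same filter document can be passed to a driver's find call, so it does not depend on access to the server logs.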

Comment by Eric Sedor [ 14/Oct/21 ]

Thanks emetsger@gmail.com, it looks like the inclusion of schema validation in the more established validate() command (in SERVER-53635) has led to some confusion with overloaded terms, where:

  • nInvalidDocuments is never counting documents that are invalid from the perspective of a collection validator
  • validationAction of "warn" doesn't imply an issue will be reported in the validate command response of "warnings"
  • validationAction of "error" implies an issue will be reported in the validate command response of "warnings"

I'm passing this on to an appropriate team for consideration.
