[CDRIVER-2265] Overly validating documents in bulk inserts Created: 06/Sep/17  Updated: 28/Oct/23  Resolved: 11/Jan/18

Status: Closed
Project: C Driver
Component/s: libmongoc
Affects Version/s: None
Fix Version/s: 1.10.0

Type: Improvement Priority: Major - P3
Reporter: Hannes Magnusson Assignee: Xiangyu Yao (Inactive)
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
depends on CDRIVER-2296 Option to pass bson_validate_flags_t ... Closed
is depended on by CDRIVER-2461 Use validate: 0 in C Driver benchmarks Closed
is depended on by CDRIVER-2304 Deprecate mongoc_collection_update Closed
Related
related to CDRIVER-2959 "validate" option is passed in command Closed
is related to CDRIVER-1341 Driver should validate BSON documents... Closed
Backwards Compatibility: Fully Compatible

 Description   

  bool
  mongoc_bulk_operation_insert_with_opts (mongoc_bulk_operation_t *bulk,
                                          const bson_t *document,
                                          const bson_t *opts,
                                          bson_error_t *error)
  {
....
     if (opts && bson_iter_init_find_case (&iter, opts, "legacyIndex") &&
         bson_iter_as_bool (&iter)) {
        if (!_mongoc_validate_legacy_index (document, error)) {
           return false;
        }
     } else if (!_mongoc_validate_new_document (document, error)) {
        return false;
     }

When creating a bulk of a large number of small documents, _mongoc_validate_new_document() takes 20% of the total execution time, including a localhost roundtrip.
When creating a bulk of a few very large documents, _mongoc_validate_new_document() takes up to 90% of the time.

I think we can reduce this validation significantly. Providing a corrupt bson_t, for example, should be treated as a programming error.
Maybe add a flag to the opts to skip the validation; the driver would then assume the bson_t was already bson_validate()d by the application before being presented to the bulk operations.
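
The proposal above could be sketched roughly as follows. This is a toy model, not libmongoc code: toy_doc_t, toy_validate, toy_bulk_t, toy_bulk_insert and the skip_validation parameter are all hypothetical names invented for illustration, and the "document" is just a length-prefixed byte buffer standing in for a bson_t. It only shows the shape of the idea: validate by default, and let a caller that has already validated its documents opt out of the per-insert cost.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a document: a 4-byte little-endian length
 * prefix, a payload, and a trailing NUL byte. Not libbson's bson_t. */
typedef struct {
   const uint8_t *data;
   size_t len;
} toy_doc_t;

/* Counts validator runs so the saved work is observable. */
static int toy_validate_calls = 0;

/* Structural check: the declared length matches the buffer size and the
 * buffer ends in NUL. A real validator walks every element, which is
 * where the cost reported in this ticket comes from. */
static bool
toy_validate (const toy_doc_t *doc)
{
   uint32_t declared;

   toy_validate_calls++;
   if (doc->len < 5) {
      return false;
   }
   declared = (uint32_t) doc->data[0] | ((uint32_t) doc->data[1] << 8) |
              ((uint32_t) doc->data[2] << 16) |
              ((uint32_t) doc->data[3] << 24);
   return declared == doc->len && doc->data[doc->len - 1] == 0;
}

/* Hypothetical bulk handle; only tracks how many documents were queued. */
typedef struct {
   size_t n_docs;
} toy_bulk_t;

/* Hypothetical insert: validates unless the caller opts out. Passing NULL
 * is treated as a programming error and asserted rather than reported. */
static bool
toy_bulk_insert (toy_bulk_t *bulk, const toy_doc_t *doc, bool skip_validation)
{
   assert (bulk != NULL && doc != NULL);
   if (!skip_validation && !toy_validate (doc)) {
      return false;
   }
   bulk->n_docs++;
   return true;
}
```

An application that validates each document once up front would then pass skip_validation as true for every subsequent insert, so the validator no longer runs on the hot path.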



 Comments   
Comment by Githook User [ 26/Jan/18 ]

Author: A. Jesse Jiryu Davis <jesse@mongodb.com> (ajdavis)

Message: CDRIVER-2265 warnings about treating int as enum
Branch: master
https://github.com/mongodb/mongo-c-driver/commit/e244ec2c23d2d345d3849b572a5ff1593b495f8a

Comment by Githook User [ 11/Jan/18 ]

Author: Xiangyu Yao <xiangyu.yao24@gmail.com> (xy24)

Message: CDRIVER-2265 specify validation flags in opts for bulk operations
Branch: master
https://github.com/mongodb/mongo-c-driver/commit/1d12416cac19ae135281e8ac4a105e11c6fc8178

Comment by Kevin Albertson [ 12/Oct/17 ]

Another place we can consider skipping validation is in the collection update functions. CDRIVER-2205 added mongoc_update_one_with_opts, mongoc_update_many_with_opts, and mongoc_replace_one_with_opts. There is currently no way for callers to opt into skipping the validation these functions perform. The mongoc_collection_update function, however, allows users to skip validation with the MONGOC_UPDATE_NO_VALIDATE flag.

Comment by A. Jesse Jiryu Davis [ 07/Sep/17 ]

At the least, don't validate all strings for UTF-8 encoding, since DRIVERS-308 has not required it of us. Investigate further and think more about the design.
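
One way to picture making the UTF-8 check optional is a bitmask of validation flags, where each check runs only if its bit is set. The sketch below is an assumption-laden toy, not the driver's implementation: the TOY_VALIDATE_* names, toy_utf8_ok and toy_validate_key are hypothetical, and the UTF-8 check is deliberately simplified (it verifies lead and continuation byte structure only, and does not reject overlong encodings or surrogates).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical flag set; each bit enables one independent check. */
typedef enum {
   TOY_VALIDATE_NONE = 0,
   TOY_VALIDATE_UTF8 = 1 << 0,
   TOY_VALIDATE_DOLLAR_KEYS = 1 << 1,
   TOY_VALIDATE_DOT_KEYS = 1 << 2
} toy_validate_flags_t;

/* Simplified UTF-8 structure check (see the caveats in the lead-in). */
static bool
toy_utf8_ok (const unsigned char *s, size_t len)
{
   size_t i = 0;

   while (i < len) {
      unsigned char c = s[i];
      size_t extra; /* number of continuation bytes expected */
      size_t k;

      if (c < 0x80) {
         extra = 0;
      } else if ((c & 0xE0) == 0xC0) {
         extra = 1;
      } else if ((c & 0xF0) == 0xE0) {
         extra = 2;
      } else if ((c & 0xF8) == 0xF0) {
         extra = 3;
      } else {
         return false; /* stray continuation byte or invalid lead byte */
      }
      if (extra > len - i - 1) {
         return false; /* sequence truncated at end of input */
      }
      for (k = 1; k <= extra; k++) {
         if ((s[i + k] & 0xC0) != 0x80) {
            return false;
         }
      }
      i += extra + 1;
   }
   return true;
}

/* Runs only the checks whose bits are set in flags. */
static bool
toy_validate_key (const char *key, size_t len, int flags)
{
   assert (key != NULL);
   if ((flags & TOY_VALIDATE_DOLLAR_KEYS) && len > 0 && key[0] == '$') {
      return false;
   }
   if ((flags & TOY_VALIDATE_DOT_KEYS) && memchr (key, '.', len) != NULL) {
      return false;
   }
   if ((flags & TOY_VALIDATE_UTF8) &&
       !toy_utf8_ok ((const unsigned char *) key, len)) {
      return false;
   }
   return true;
}
```

With this shape, a default that omits TOY_VALIDATE_UTF8 keeps the cheap key checks while dropping the byte-by-byte scan of every string.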
