[CDRIVER-2018] Introduce bulk write options to disable BSON validation Created: 02/Feb/17 Updated: 03/May/17 Resolved: 23/Mar/17 |
|
| Status: | Closed |
| Project: | C Driver |
| Component/s: | libmongoc |
| Affects Version/s: | 1.6.0 |
| Fix Version/s: | 1.7.0 |
| Type: | New Feature | Priority: | Major - P3 |
| Reporter: | Jeremy Mikola | Assignee: | Backlog - C Driver Team |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||||||||||||||
| Description |
|
When As it relates to the PHP driver, this means that we'd only be able to throw a BulkWriteException at execution time (from Manager::executeBulkWrite()) instead of an InvalidArgumentException from the offending method (e.g. BulkWrite::insert()). We actually discovered this because BulkWriteException attempts to refer to the server handling execution, and the selected server was null. This may only be an issue for mongoc_bulk_operation_insert(), which may be at a disadvantage because it is void and cannot return a boolean indicating success or failure (unlike the update and replace "with_opts" functions). For this reason, I think we either need a new insert function that can return a boolean or some public API that can expose the mongoc_bulk_operation_t's error so that we can check it after each attempted insert. The update and replace functions have "with_opts" variants, which return a boolean indicating success or failure, as well as a bson_error_t * output argument to store any error encountered. A similar API for insert would be appreciated. |
| Comments |
| Comment by Githook User [ 23/Mar/17 ] |
|
Author: {u'username': u'ajdavis', u'name': u'A. Jesse Jiryu Davis', u'email': u'jesse@mongodb.com'}Message: |
| Comment by Jeremy Mikola [ 13/Feb/17 ] |
|
Summarizing our discussion earlier today: mongoc_collection_insert() and mongoc_collection_insert_bulk() both support an undocumented MONGOC_INSERT_NO_VALIDATE flag, which disables all validation. mongoc_collection_insert() and MONGOC_INSERT_NO_VALIDATE is used internally by libmongoc for legacy index creation, but there is no evidence of the option being used with mongoc_collection_insert_bulk(). If a new option is implemented per this issue, we'd suggest a name related to legacy index creation rather than a generic name that might encourage users to use it for other situations. The PHP driver accumulates all documents to insert in a mongoc_bulk_operation_t via the MongoDB\Driver\BulkWrite methods prior to the object being passed to MongoDB\Driver\Manager::executeBulkWrite() with a target namespace. There is no public API in libmongoc to access data that has been added to mongoc_bulk_operation_t, otherwise we might be able to check the namespace when executeBulkWrite() is called, pull insert documents out, and defer to mongoc_collection_insert() or mongoc_collection_insert_bulk() with MONGOC_INSERT_NO_VALIDATE. We also noted that the introduction of document validation in libmongoc 1.6.0 ( I pinged derick to ask if he had any other ideas for working around this short of the API requested in this ticket. |
| Comment by Jeremy Mikola [ 08/Feb/17 ] |
|
We discussed creating a bool mongoc_bulk_operation_insert_with_opts() function that also accepts const bson_t *opts and bson_error_t *error arguments. Rather than introduce an option for disabling "dot keys" validation, we discussed having a general "validateBson" option for all of the "with_opts" methods that allows libmongoc to skip its BSON validation introduced in |
| Comment by Jeremy Mikola [ 03/Feb/17 ] |
Correct. All writes are issued through the MongoDB\Driver\Manager::executeBulkWrite() method. The base driver has three basic methods for executing commands, queries, and bulk writes.
I'm aware of the logic this function and others have to fall back for older server versions, but that is something we handle ourselves in PHPLIB after selecting a server is checking it's min/max wire versions. This was all done purposefully by design to keep our C layer as slim as possible. The only way I can imagine we'd be able to use mongoc_collection_create_index_with_opts is if, at the time a bulk write was executed, we checked if the target collection was "system.indexes" on a pre-2.6 server and then used some API, which does not yet exist, to dissect it and obtain the raw insert statements. We'd then have to throw exceptions if the bulk write consisted of anything more than a series of insert statements. That all seems quite a bit more hacky and fragile than a mechanism to bypass the "dot" validation. |
| Comment by A. Jesse Jiryu Davis [ 03/Feb/17 ] |
|
"insert" is your only API for insertions? Can you use mongoc_collection_create_index_with_opts? That function tries the modern "createIndexes" command and falls back to inserting into the system.indexes collection with MONGOC_INSERT_NO_VALIDATE set. The driver does all the work for you there. |
| Comment by Jeremy Mikola [ 02/Feb/17 ] |
|
jesse: See my comment in PHPC-712:
With libmongoc 1.6.0, there is no way for PHPLIB to create indexes on MongoDB 2.4 and earlier, since MongoDB\BulkWrite::insert() is our only API for insertions. I would propose a mongoc_bulk_operation_insert_with_opts() function that accepts an option to exclude BSON_VALIDATE_DOT_KEYS from the validation flags. In effect, this would force the server to validate that aspect of the document. |