  1. C Driver
  2. CDRIVER-5838

bson_validate failed to check invalid utf8

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Minor - P4
    • Affects Version/s: None
    • Component/s: None
    • C Drivers

      Summary

      bson_validate returns true for a document that contains an invalid UTF-8 string, even when BSON_VALIDATE_UTF8 is requested. See the example below for details.

      The error seems to occur in _bson_iter_validate_document (called within bson_validate). Validation is driven by iterating with bson_iter_visit_all, but bson_iter_visit_all also runs its own UTF-8 check. When a string fails that check, it sets iter->err_off and returns before invoking the UTF-8 visitor, so the validation function _bson_iter_validate_utf8 is never called for invalid UTF-8.

       Code in bson_iter_visit_all:

      case BSON_TYPE_UTF8: {
         uint32_t utf8_len;
         const char *utf8;
      
         utf8 = bson_iter_utf8 (iter, &utf8_len);
      
         if (!bson_utf8_validate (utf8, utf8_len, true)) {
            iter->err_off = iter->off;
            return true;
         }
      
         if (VISIT_UTF8 (iter, key, utf8_len, utf8, data)) {
            return true;
         }
      } break;
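
      The control flow described above can be modeled without libbson. The sketch below is a simplified illustration of the reporter's hypothesis, not the library's code: every name in it (mock_iter_t, mock_state_t, mock_visit_string, mock_validate_utf8_visitor) is hypothetical, and the result of the UTF-8 check is passed in as a flag instead of being computed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration only; these are not libbson types. */
typedef struct {
   size_t err_off; /* set by the walker when it rejects a value itself */
} mock_iter_t;

typedef struct {
   int visitor_calls;   /* how many times the visitor ran */
   bool error_recorded; /* what a visitor-based validator would report */
} mock_state_t;

/* Models a validating visitor: it can only record an error if it is called. */
bool
mock_validate_utf8_visitor (bool utf8_ok, mock_state_t *state)
{
   state->visitor_calls++;
   if (!utf8_ok) {
      state->error_recorded = true;
   }
   return !utf8_ok; /* true means "stop iterating" */
}

/* Models the BSON_TYPE_UTF8 case quoted above: the walker checks the string
 * first and returns early, so the visitor never sees an invalid string. */
bool
mock_visit_string (mock_iter_t *iter, size_t off, bool utf8_ok, mock_state_t *state)
{
   assert (iter != NULL && state != NULL);

   if (!utf8_ok) {
      iter->err_off = off;
      return true; /* early return: visitor is skipped */
   }

   return mock_validate_utf8_visitor (utf8_ok, state);
}
```

      In this model, calling mock_visit_string with utf8_ok set to false leaves visitor_calls at 0 and error_recorded at false; the failure is visible only through err_off. A caller that consults only the state written by its visitor would therefore report success, which matches the behavior observed with bson_validate.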
      

      Environment

      The example below was tested with mongoc 1.29.1 and 1.17.6, compiled with gcc 10.2.1.

      How to Reproduce

      #include <bson/bson.h>
      #include <stdio.h>  /* printf */
      #include <string.h> /* strlen */
      
      int
      main (int argc, char *argv[])
      {
         static const char invalid[] = {0xed, 0xa0, 0xa5, 0};
      
         printf ("validate utf8: %d\n",
                 bson_utf8_validate (invalid, strlen (invalid), false));
         // output: validate utf8: 0
      
         bson_t b;
         bson_init (&b);
         printf ("append utf8: %d\n", bson_append_utf8 (&b, "key", -1, invalid, -1));
         // output: append utf8: 1
      
         bson_iter_t child;
         if (!bson_iter_init (&child, &b)) {
            printf ("bson_iter_init failed\n");
            return 1;
         }
      
         /* Empty visitor table: no callbacks, so only the iterator's own checks run. */
         static bson_visitor_t bson_validate_funcs = {0};
         bson_iter_visit_all (&child, &bson_validate_funcs, NULL);
         printf ("bson iter err: %d\n", child.err_off > 0);
         // output: bson iter err: 1
      
         printf ("bson validate: %d\n", bson_validate (&b, BSON_VALIDATE_UTF8, NULL));
         // output: bson validate: 1
      
         bson_destroy (&b);
         return 0;
      }
      

            Assignee: Unassigned
            Reporter: wait4pumpkin (wait4pumpkin@gmail.com)
            Votes: 0
            Watchers: 4
