Ruby Driver / RUBY-2560

EncodingError raised when server returns invalid UTF-8 in error messages derived from user input

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major - P3
    • None
    • Affects Version/s: 2.14.0
    • Component/s: BSON
    • Labels:

      If a string in a server error message is truncated in such a way that it is no longer valid UTF-8, an EncodingError is raised instead of a Mongo::Error::OperationFailure (or other expected exception). This happens when the truncation falls on a byte in the middle of a multi-byte UTF-8 character.
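
      The mechanism can be reproduced in plain Ruby without a server. The following is a minimal sketch, assuming the truncation is a cut at a fixed byte offset followed by "..." (the byte offsets and the short sample string are chosen for illustration and are not taken from the server):

```ruby
# Illustration only: plain Ruby, no driver or server involved.
name = "(╯°□°)╯"

# Byte layout in UTF-8: "(" is 1 byte, "╯" is 3, "°" is 2, "□" is 3.
# Cutting after 10 bytes keeps only the first byte of the second "°".
truncated = name.byteslice(0, 10) + "..."

puts truncated.valid_encoding?   # false: a lead byte is followed by "."
puts truncated.scrub("?")        # (╯°□?...

# Cutting one byte later lands on a character boundary instead.
aligned = name.byteslice(0, 11) + "..."
puts aligned.valid_encoding?     # true
```

      A string cut this way fails strict UTF-8 validation, which matches the "bogus high bits for continuation byte" message in the report below.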

      Example:

      class MyDocument
        include Mongoid::Document
        field :name, type: String
        index({name: 1}, {unique: true})
      end
      
      MyDocument.create_indexes
      
      MyDocument.collection.insert_one({name: "(╯°□°)╯︵ ┻━┻(╯°□°)╯︵ ┻━┻(╯°□°)╯︵ ┻━┻(╯°□°)╯︵ ┻━┻(╯°□°)╯︵ ┻━┻(╯°□°)╯︵ ┻━┻"})
      # the identical insert that follows (a duplicate key) raises
      # EncodingError (String E11000 duplicate key error collection: my_db.my_documents index: name_1 dup key: { : "(╯°□°)╯︵ ┻━┻(╯°□°)╯︵ ┻━┻(╯°□°)╯︵ ┻━┻(╯°□°)╯︵ ┻━┻(╯°□°)╯︵ ┻━┻(╯°□?..." } is not valid UTF-8: bogus high bits for continuation byte)
      # the truncation fell in the middle of a ° character
      MyDocument.collection.insert_one({name: "(╯°□°)╯︵ ┻━┻(╯°□°)╯︵ ┻━┻(╯°□°)╯︵ ┻━┻(╯°□°)╯︵ ┻━┻(╯°□°)╯︵ ┻━┻(╯°□°)╯︵ ┻━┻"})
      
      MyDocument.collection.insert_one({name: "a(╯°□°)╯︵ ┻━┻(╯°□°)╯︵ ┻━┻(╯°□°)╯︵ ┻━┻(╯°□°)╯︵ ┻━┻(╯°□°)╯︵ ┻━┻(╯°□°)╯︵ ┻━┻"})
      # the identical insert that follows (a duplicate key) raises
      # Mongo::Error::OperationFailure (E11000 duplicate key error collection: my_db.my_documents index: name_1 dup key: { : "a(╯°□°)╯︵ ┻━┻(╯°□°)╯︵ ┻━┻(╯°□°)╯︵ ┻━┻(╯°□°)╯︵ ┻━┻(╯°□°)╯︵ ┻━┻(╯°□..." } (11000) (on 127.0.0.1:27017, legacy retry, attempt 1))
      # which is expected
      MyDocument.collection.insert_one({name: "a(╯°□°)╯︵ ┻━┻(╯°□°)╯︵ ┻━┻(╯°□°)╯︵ ┻━┻(╯°□°)╯︵ ┻━┻(╯°□°)╯︵ ┻━┻(╯°□°)╯︵ ┻━┻"})
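
      Until this is resolved, calling code may want to treat EncodingError from a write as a failed operation. The following is a rough sketch of such a guard, not a driver API: `UnreadableServerError` and `with_encoding_error_guard` are hypothetical names, and the block stands in for a call such as `MyDocument.collection.insert_one(...)`.

```ruby
# Hypothetical application-level error class (not part of the driver).
class UnreadableServerError < StandardError; end

# Hypothetical helper: runs the block and converts an EncodingError
# into an application error whose message is safe to log.
def with_encoding_error_guard
  yield
rescue EncodingError => e
  # scrub replaces any invalid bytes so the message itself is valid UTF-8.
  raise UnreadableServerError,
        "operation failed; error message was not valid UTF-8: #{e.message.scrub('?')}"
end

# Usage with a stand-in block that fails the way the report describes.
begin
  with_encoding_error_guard { raise EncodingError, "String ... is not valid UTF-8" }
rescue UnreadableServerError => e
  puts e.message
  puts e.cause.class   # the original EncodingError stays available as the cause
end
```

      This only changes how the failure surfaces to the caller; the original error code (for example 11000) is still lost, so it is a stopgap rather than a fix.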
      
            Assignee: Unassigned
            Reporter: Matt Hicks (matt.hicks@braze.com)
            Votes: 0
            Watchers: 2
