  1. Ruby Driver
  2. RUBY-432

Incorrectly encoded string data can break entire collections

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Fixed
    • Affects Version/s: 1.6.0, 1.6.1
    • Fix Version/s: 1.6.3
    • Component/s: None
    • Labels:
    • Environment:
      Fedora 16 x86_64, RHEL 5 x86_64, Ruby 1.9.3, ruby driver 1.6.0 & 1.6.1, bson 1.6.0 & 1.6.1, bson_ext 1.6.0 & 1.6.1, mongo 2.0.4 64-bit

      Description

      Inserting improperly encoded string data into a collection can, in some cases, bypass the BSON sanity checks and write data to the database that causes a server exception whenever you attempt to read the inserted document.

      Attempting an upsert with the invalid data when the database doesn't exist results in the connection simply dropping. Attempting an insert (whether or not the DB exists yet) results in invalid data being written to the DB.
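For illustration, "improperly encoded string data" of the kind described above can be produced in Ruby 1.9+ without touching the database. The following is a minimal sketch, not the attached kaboom.rb; the variable names are made up for this example. It shows a string that carries a UTF-8 label while its bytes are really ISO-8859-1, so Ruby itself reports the encoding as invalid.

```ruby
# Bytes for "café" in ISO-8859-1: 0xE9 is "é" there, but a lone 0xE9 is not
# a well-formed UTF-8 sequence.
latin1_bytes = "caf\xE9".dup.force_encoding("BINARY")

# Mislabel the bytes as UTF-8 without converting them. force_encoding only
# changes the label; it never rewrites or validates the bytes.
mislabeled = latin1_bytes.dup.force_encoding("UTF-8")
mislabeled_is_valid = mislabeled.valid_encoding?   # false

# Declaring the real source encoding first and then transcoding produces
# well-formed UTF-8 (0xE9 becomes the two bytes 0xC3 0xA9).
converted = latin1_bytes.dup.force_encoding("ISO-8859-1").encode("UTF-8")
converted_is_valid = converted.valid_encoding?     # true
```

Strings like `mislabeled` typically come from external input (files, HTTP parameters, legacy databases) whose real encoding was never declared.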

      The collection can be recovered by compacting it, but doing so loses the entire document containing the bad data. In the best case, I'd expect mongo to save the string data as-is and reconstitute it byte-for-byte, but since I don't expect the DB to save encoding information, this would probably need to be a driver-level fix that rejects any attempt to write invalid UTF-8 to the database.

      I saw that a similar issue was reported for the PHP driver, and was fixed by preventing invalid UTF-8 from being written to the database. Since this is fundamentally an issue with string encodings being munged, and mongod has no concept of alternate encodings, this has to be a driver fix. A core fix would be nice (at least for safe writes), but wouldn't cover all cases, since non-safe writes wouldn't ever see that the write has failed.
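The driver-level rejection proposed above could look roughly like the sketch below. It is an assumption about what such a check might do, not the fix that shipped in 1.6.3: `InvalidStringEncodingError` and `assert_valid_utf8!` are hypothetical names that exist in neither the bson nor the mongo gem. The check treats each string's raw bytes as UTF-8, on the assumption that this is how they would be interpreted once stored, and raises before anything is sent to the server.

```ruby
# Hypothetical error class for this sketch; not part of the bson or mongo gems.
class InvalidStringEncodingError < StandardError; end

# Recursively walk a document (nested Hashes and Arrays) and raise if any
# String key or value holds bytes that are not well-formed UTF-8.
# Returns the value unchanged when everything is valid.
def assert_valid_utf8!(value, path = "document")
  case value
  when String
    # Check the raw bytes under a UTF-8 label, regardless of the current label.
    unless value.dup.force_encoding(Encoding::UTF_8).valid_encoding?
      raise InvalidStringEncodingError, "#{path} contains bytes that are not valid UTF-8"
    end
  when Hash
    value.each do |key, child|
      assert_valid_utf8!(key, "#{path} key #{key.inspect}") if key.is_a?(String)
      assert_valid_utf8!(child, "#{path}[#{key.inspect}]")
    end
  when Array
    value.each_with_index do |child, index|
      assert_valid_utf8!(child, "#{path}[#{index}]")
    end
  end
  value
end
```

A caller would run `assert_valid_utf8!(doc)` immediately before an insert or upsert. Because the check happens client-side, it also covers non-safe writes, which never report a server-side failure.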

        Attachments

        1. kaboom-upsert.txt
          4 kB
        2. kaboom-insert.txt
          4 kB
        3. kaboom.rb
          0.7 kB

              People

              Assignee:
              tyler@10gen.com Tyler Brock
              Reporter:
              cheald Chris Heald
