Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-46873

Invalid utf-8 in json logs

    • Type: Icon: Bug Bug
    • Resolution: Fixed
    • Priority: Icon: Major - P3 Major - P3
    • 4.4.0
    • Affects Version/s: 4.3.4
    • Component/s: Logging
    • None
    • Fully Compatible
    • ALL
    • Service arch 2020-05-04, Service arch 2020-05-18, Service arch 2020-06-01, Service arch 2020-06-15, Service arch 2020-06-29, Service arch 2020-07-13, Service Arch 2020-07-27, Service Arch 2020-08-10, Service Arch 2020-08-24, Security 2021-02-08

      In this file: https://s3.amazonaws.com/mciuploads/mongo-ruby-driver/mongo-latest__mongodb-version~latest_topology~replica-set_auth-and-ssl~noauth-and-nossl_ruby~ruby-2.7_os~ubuntu1604/3da63736cf6fd65edb655fe21aa94b970796b802/5e6b12882a60ed16d60d365b/mongo_ruby_driver_mongo_latest__mongodb_version~latest_topology~replica_set_auth_and_ssl~noauth_and_nossl_ruby~ruby_2.7_os~ubuntu1604_patch_3da63736cf6fd65edb655fe21aa94b970796b802_5e6b12882a60ed16d60d365b_20_03_13_04_56_41/logs/mongo_ruby_driver_mongo_latest__mongodb_version~latest_topology~replica_set_auth_and_ssl~noauth_and_nossl_ruby~ruby_2.7_os~ubuntu1604_test_mlaunch_patch_3da63736cf6fd65edb655fe21aa94b970796b802_5e6b12882a60ed16d60d365b_20_03_13_04_56_41-0-mongodb-logs.tar.gz, in drivers-tools/.evergreen/orchestration/db/ruby-driver-rs/rs2/mongod.log , the server produced 4 lines that contain invalid utf-8 as reported by Ruby. The line numbers are 21111, 21962, 22041, 22187. These lines all look like this:

      {"t":{"$date":"2020-03-13T05:12:21.172+0000"},"s":"I", "c":"TXN",     "id":51802,"ctx":"conn1036","msg":"transaction","attr":{"parameters":{"lsid":{"id":{"$uuid":"ee279b85-d0db-4273-948c-f3f40327d56a"},"uid":{"$binary":{"base64":"47DEQpj8HBSa+/TImW+5JCeuQeRkm5NMpJWZG3hSuFU=","subType":"0"}}},"txnNumber":1,"autocommit":false,"readConcern":{"provenance":"implicitDefault"}},"readTimestamp":"Timestamp(0, 0)","ninserted":1,"keysInserted":1,"terminationCause":"\ufffd���\ufffdU\u0000\u0000\u0000","timeActive":287,"timeInactive":1011340,"numYields":0,"locks":{"ReplicationStateTransition":{"acquireCount":{"w":3}},"Global":{"acquireCount":{"w":2}},"Database":{"acquireCount":{"w":3}},"Collection":{"acquireCount":{"w":3}},"Mutex":{"acquireCount":{"r":3}}},"storage":{},"wasPrepared":false,"durationMillis":1011}}
      

      terminationCause appears to look like this:

      \"terminationCause\":\"\\ufffd\xED\xA9\x9B\\ufffdU\\u0000\\u0000\\u0000\"
      

      Ruby has this to say about it:

      (byebug) line.force_encoding('utf-8').encode('utf-16')
      *** Encoding::InvalidByteSequenceError Exception: "\xED" followed by "\xA9" on UTF-8
      

      According to https://en.wikipedia.org/wiki/JSON#Data_portability_issues json should be encoded in utf-8.

      Build: https://evergreen.mongodb.com/build/mongo_ruby_driver_mongo_latest__mongodb_version~latest_topology~replica_set_auth_and_ssl~noauth_and_nossl_ruby~ruby_2.7_os~ubuntu1604_patch_3da63736cf6fd65edb655fe21aa94b970796b802_5e6b12882a60ed16d60d365b_20_03_13_04_56_41/

            Assignee:
            mark.benvenuto@mongodb.com Mark Benvenuto
            Reporter:
            oleg.pudeyev@mongodb.com Oleg Pudeyev (Inactive)
            Votes:
            0 Vote for this issue
            Watchers:
            5 Start watching this issue

              Created:
              Updated:
              Resolved: