Core Server / SERVER-11873

Truncating Log Lines breaks UTF-8 characters

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Major - P3
    • None
    • Affects Version/s: 2.4.8
    • Component/s: Logging
    • Labels: None
    • ALL
    • Steps to Reproduce:
      1. Set log level to 1 (so that the update is captured in the logs)
        db.adminCommand( { setParameter : 1, logLevel : 1 } )
        
      2. Run the update:
        db.test.update( {}, { $set : { description : "Anupam Roy is a Bengali lyricist, composer, singer from Kolkata, West Bengal, India. He is best known for his song Amake amar moto thakte dao (আমাকে আমার মত থাকতে দাও), which appeared on the soundtrack to the 2010 film Autograph"}})
        
      3. Check the log for broken UTF-8 character strings
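
      Step 3 can be automated rather than checked by eye. The following is a minimal sketch, assuming Node.js and that the log file has already been read into a Buffer. The helpers isValidUtf8 and findBrokenLines are hypothetical names used for illustration; they are not part of MongoDB or the mongo shell.

```javascript
// Hypothetical helper: true if `buf` holds well-formed UTF-8.
// Decoding replaces malformed sequences with U+FFFD, so re-encoding the
// decoded string only reproduces the original bytes when nothing was replaced.
function isValidUtf8(buf) {
  return Buffer.from(buf.toString('utf8'), 'utf8').equals(buf);
}

// Hypothetical helper: returns the 1-based numbers of log lines that contain
// malformed UTF-8. Splitting on the byte 0x0A is safe because that byte never
// occurs inside a multi-byte UTF-8 sequence.
function findBrokenLines(logBytes) {
  const broken = [];
  let start = 0;
  let lineNo = 1;
  for (let i = 0; i <= logBytes.length; i++) {
    if (i === logBytes.length || logBytes[i] === 0x0a) {
      if (!isValidUtf8(logBytes.subarray(start, i))) {
        broken.push(lineNo);
      }
      start = i + 1;
      lineNo++;
    }
  }
  return broken;
}
```

      For example, findBrokenLines(require('fs').readFileSync(path)) would list the affected lines, where path is whatever log file the server is configured to write to.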

      When a long log line is truncated, the cut can fall in the middle of a multi-byte UTF-8 sequence, which leaves a broken character at the end of the logged string.

      For example, the following update (with text from Wikipedia):

      db.test.update( {}, { $set : { description : "Anupam Roy is a Bengali lyricist, composer, singer from Kolkata, West Bengal, India. He is best known for his song Amake amar moto thakte dao (আমাকে আমার মত থাকতে দাও), which appeared on the soundtrack to the 2010 film Autograph"}})
      

      Will appear as:

      Wed Nov 27 11:03:02.373 [conn2] update test.test update: { $set: { LikeDescription: "Anupam Roy is a Bengali lyricist, composer, singer from Kolkata, West Bengal, India. He is best known for his song Amake amar moto thakte dao (আম�..." } } nscanned:0 nupdated:0 keyUpdates:0 locks(micros) w:52 0ms
      

      in the log (where the final characters before the ellipsis are "আম�...")

      The truncation cuts part way through a multi-byte sequence, so the final UTF-8 character is left incomplete and shows up as the replacement character (�).
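
      The difference between cutting at an arbitrary byte and cutting at a character boundary can be sketched as follows. This is only an illustration in JavaScript (Node.js), not the server's actual logging code or the fix applied in MongoDB. The names naiveTruncate and truncateUtf8 are hypothetical, and the "..." marker is assumed from the log line above.

```javascript
// Naive approach: cut the UTF-8 bytes at a fixed length. If the cut lands
// inside a multi-byte sequence, decoding yields a replacement character.
function naiveTruncate(text, maxBytes, marker = '...') {
  const bytes = Buffer.from(text, 'utf8');
  if (bytes.length <= maxBytes) return text;
  return bytes.subarray(0, maxBytes).toString('utf8') + marker;
}

// Boundary-aware approach: keep at most `maxBytes` bytes of text, but never
// split a multi-byte sequence. The marker is not counted against maxBytes.
function truncateUtf8(text, maxBytes, marker = '...') {
  const bytes = Buffer.from(text, 'utf8');
  if (bytes.length <= maxBytes) return text;
  let end = maxBytes;
  // Continuation bytes have the form 10xxxxxx. If the first dropped byte is a
  // continuation byte, the cut is inside a sequence, so back up until the
  // first dropped byte is the lead byte of that sequence.
  while (end > 0 && (bytes[end] & 0xc0) === 0x80) {
    end--;
  }
  return bytes.subarray(0, end).toString('utf8') + marker;
}

// 'dao (' is 5 bytes and each Bengali letter here is 3 bytes, so a limit of
// 10 bytes falls inside the second letter.
console.log(naiveTruncate('dao (আমাকে', 10)); // ends with a broken character
console.log(truncateUtf8('dao (আমাকে', 10));  // dao (আ...
```

      Backing up to the previous lead byte drops at most three bytes of text, so the logged line stays within the size limit while remaining valid UTF-8.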

            Assignee:
            Unassigned
            Reporter:
            Andre de Frere (andre.defrere@mongodb.com)
            Votes:
            0
            Watchers:
            4
