[SERVER-7424] Shell trims multiple whitespace characters in strings when displaying (data not affected)
Created: 19/Oct/12  Updated: 11/Jul/16  Resolved: 11/Mar/13

Status: Closed
Project: Core Server
Component/s: Shell
Affects Version/s: 2.0.7, 2.2.0
Fix Version/s: 2.4.4, 2.5.0

Type: Bug
Priority: Minor - P4
Reporter: Adam Comerford
Assignee: Tad Marshall
Resolution: Done
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Tested on Linux - 2.2.0 and 2.0.7


Issue Links:
Duplicate
is duplicated by SERVER-6893 Shell tojson() shouldn't compact mult... Closed
Backwards Compatibility: Minor Change
Operating System: ALL
Participants:

 Description   

The shell collapses runs of whitespace inside string values when displaying find results, but the data is stored correctly and can be queried with the original string.

From here: http://dba.stackexchange.com/questions/27200/preserving-whitespace-in-mongodb-values

Quick test to show it happening:

// insert a string with some leading and trailing spaces
mongos> db.test.insert({"name":"    adam    "})
// now do a find for that string (success)
mongos> db.test.find({"name":"    adam    "})
{ "_id" : ObjectId("50814e5c2a17d922cd25974f"), "name" : " adam " }
// however, the display is off - only one leading/trailing space shown
// just to be sure, search for the output again (no results)
mongos> db.test.find({"name":" adam "})
mongos>



 Comments   
Comment by auto [ 13/May/13 ]

Author: Tad Marshall <tad@10gen.com> (2013-03-11T12:35:26Z)

Message: SERVER-7424 Use simpler regex to produce single line output in tojson()

When displaying small objects, convert only tabs, carriage returns and
newlines to single spaces, leaving other whitespace alone.
Branch: v2.4
https://github.com/mongodb/mongo/commit/63f4bc287d175ab3414025402c47b00c2f238ac9

Comment by auto [ 11/Mar/13 ]

Author: Tad Marshall <tad@10gen.com> (2013-03-11T12:35:26Z)

Message: SERVER-7424 Use simpler regex to produce single line output in tojson()

When displaying small objects, convert only tabs, carriage returns and
newlines to single spaces, leaving other whitespace alone.
Branch: master
https://github.com/mongodb/mongo/commit/d36db8ac30a56d39f4af58e573ee417450798c96

Comment by Tad Marshall [ 11/Mar/13 ]

The code tries to make things print "nicely" when formatting has not been explicitly set and the output fits on one line (the length is checked before the whitespace is collapsed, hmmm ...). It compresses every run of whitespace to a single space. That is probably helpful for whitespace between fields, but it isn't selective, so it also smushes whitespace inside string values down to a single space.

src/mongo/shell/types.js lines 539 to 545 (in tojson()):

    case "object":{
        var s = tojsonObject(x, indent, nolint);
        if ((nolint == null || nolint == true) && s.length < 80 && (indent == null || indent.length == 0)){
            s = s.replace(/[\s\r\n ]+/gm, " ");
        }
        return s;
    }

If you add text to the object to push its unprocessed length to 80 or more characters, it goes multi-line and you see the spaces:

> adam
{ "_id" : ObjectId("513dbef8d33345132fe2a81b"), "adam" : " adam " }
> adam.moreText = "Just add some text"
Just add some text
> adam
{
        "_id" : ObjectId("513dbef8d33345132fe2a81b"),
        "adam" : "      adam      ",
        "moreText" : "Just add some text"
}
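
The length check can also be tried outside the shell. What follows is a minimal simulation under stated assumptions, not the shell's code: formatObject() is a hypothetical stand-in for tojsonObject() that emits one tab-indented field per line and serializes values with JSON.stringify, and display() applies the same test as the types.js snippet above with nolint and indent left unset.

```javascript
// Hypothetical stand-in for the shell's tojsonObject(): one field per line,
// tab-indented. Using JSON.stringify for values is an assumption of this sketch.
function formatObject(obj) {
    var lines = Object.keys(obj).map(function (key) {
        return "\t" + JSON.stringify(key) + " : " + JSON.stringify(obj[key]);
    });
    return "{\n" + lines.join(",\n") + "\n}";
}

// Mirrors the condition quoted from types.js with nolint and indent unset:
// only output shorter than 80 characters is collapsed onto one line.
function display(obj) {
    var s = formatObject(obj);
    if (s.length < 80) {
        s = s.replace(/[\s\r\n ]+/gm, " ");
    }
    return s;
}

var small = { adam: "      adam      " };
var large = {
    adam: "      adam      ",
    moreText: "Just add some text so the output reaches 80 characters"
};

console.log(display(small)); // { "adam" : " adam " }
console.log(display(large)); // stays multi-line, spaces in "adam" intact
```

Because the threshold is applied to the unprocessed text, the same string value is shown differently depending only on how long the rest of the object is.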

You don't need to store and fetch a document to reproduce this; you can just create a local object in the shell.

The regular expression is doing too much. The tojsonObject() function inserts tabs and newlines but doesn't add multiple spaces or carriage returns, so replacing only runs of tabs and newlines with a single space gives the same concise display without hammering spaces in strings. Tabs and newlines inside strings should already have been escaped to backslash-t and backslash-n, so replacing the literal characters should not hurt anything.

> var adam = {  "_id" : ObjectId("50816b7e4af6ff7c8679c7d2"),  "adam" : "      adam      " }
> adam
{ "_id" : ObjectId("50816b7e4af6ff7c8679c7d2"), "adam" : " adam " }
> var s = tojsonObject(adam)
> s
{
        "_id" : ObjectId("50816b7e4af6ff7c8679c7d2"),
        "adam" : "      adam      "
}
> s.length
79
> var u = s.replace(/[\s\r\n ]+/gm, " ");
> u
{ "_id" : ObjectId("50816b7e4af6ff7c8679c7d2"), "adam" : " adam " }
 
> var v = s.replace(/[\t\n]+/gm, " ");
> v
{ "_id" : ObjectId("50816b7e4af6ff7c8679c7d2"), "adam" : "      adam      " }
> 
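
The difference between the two patterns, and the point about escaped tabs and newlines, can be checked in any JavaScript engine. This sketch hard-codes a string in the tab and newline layout that tojsonObject() printed above rather than calling the shell function. The "note" field is made up for illustration; its value holds the two-character escape sequences backslash-t and backslash-n, as serialized output would.

```javascript
// Serialized text in the tab/newline layout shown by tojsonObject() above.
// The "note" field is hypothetical; "\\t" and "\\n" are escape sequences
// (a backslash plus a letter), not real tab or newline characters.
var s = '{\n\t"adam" : "      adam      ",\n\t"note" : "a\\tb\\nc"\n}';

var oldPattern = /[\s\r\n ]+/gm; // any whitespace run, including spaces in strings
var newPattern = /[\t\n]+/gm;    // only the tabs and newlines added by formatting

var collapsed = s.replace(oldPattern, " ");
var preserved = s.replace(newPattern, " ");

console.log(collapsed); // { "adam" : " adam ", "note" : "a\tb\nc" }
console.log(preserved); // { "adam" : "      adam      ", "note" : "a\tb\nc" }
```

Both patterns leave the escaped sequences in "note" alone, since neither contains a literal tab or newline; only the old pattern touches the spaces around "adam".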

Comment by Tad Marshall [ 19/Oct/12 ]

Yes, something funny in the display code:

> db.b.insert({adam:"      adam      "})
> var adam=db.b.find({adam:{$exists:1}}).next()
> adam
{ "_id" : ObjectId("50816b7e4af6ff7c8679c7d2"), "adam" : " adam " }
> printjson(adam)
{ "_id" : ObjectId("50816b7e4af6ff7c8679c7d2"), "adam" : " adam " }
> printjsononeline(adam)
{  "_id" : ObjectId("50816b7e4af6ff7c8679c7d2"),  "adam" : "      adam      " }
