[SERVER-5460] Error strings should escape embedded strings Created: 30/Mar/12  Updated: 25/Oct/21  Resolved: 25/Oct/21

Status: Closed
Project: Core Server
Component/s: Shell
Affects Version/s: 2.0.3
Fix Version/s: None

Type: Bug Priority: Trivial - P5
Reporter: Glenn Maynard Assignee: Geert Bosch
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Operating System: ALL
Participants:

 Description   

> db.test.ensureIndex(

{x:1}

,

{unique:true}

)
> db.test.insert(

{'x': 'y\u0000a'}

)
> db.test.insert(

{'x': 'y\u0000a'}

)
E11000 duplicate key error index: test.test.$x_1 dup key: { : "y

The "y\0a" string has an embedded \0, which is being inserted literally into the E11000 error, causing it to be truncated at some point later.

Error message strings from the server should probably be escaped to not contain embedded nulls, so it doesn't need to be dealt with in each driver.



 Comments   
Comment by Geert Bosch [ 25/Oct/21 ]

This has been fixed in recent versions of MongoDB:

> db.test.insert( {'x': 'y\u0000a'} )
WriteResult({
	"nInserted" : 0,
	"writeError" : {
		"code" : 11000,
		"errmsg" : "E11000 duplicate key error collection: test.test index: x_1 dup key: { x: \"y\u0000a\" }"
	}
})

Comment by Glenn Maynard [ 05/Oct/13 ]

This bug still occurs in 2.4.5. I have to restate how broken it would be for drivers to try to parse these error strings:

db.test.ensureIndex({x:1}, {unique:true})
db.test.insert({'x': {a: "a", b: 2.0, c: "c"}})
db.test.insert({'x': {a: "a", b: 2.0, c: "c"}})
E11000 duplicate key error index: test.test.$x_1  dup key: { : { a: "a", b: 2.0, c: "c" } }
db.test.insert({'x': {a:'a", b: 2.0, c: "c'}})
db.test.insert({'x': {a:'a", b: 2.0, c: "c'}})
E11000 duplicate key error index: test.test.$x_1  dup key: { : { a: "a", b: 2.0, c: "c" } }

Comment by Glenn Maynard [ 04/Apr/12 ]

It's not reproducing and I'm not sure what's different about my test now; before, it appeared that data would be inserted into the collection, but indexes would never be used. Instead, I'm just getting an InvalidName exception in pymongo for nuls in collection names (by design), and a pymongo-specific bug if I use nuls in database names (which I've filed: PYTHON-342).

I assume that database and collection names are sent over the wire as a BSON cstring, which means the BSON stream becomes corrupted if it contains a null. (It's too bad that BSON has separate string and cstring types; it would have been possible to get the benefits of both types in a single type, and avoid the complexity and tradeoffs of the separate types. It's too late to fix now, I guess, since nobody is going to want to make backwards-incompatible changes to BSON.)

Comment by Tad Marshall [ 04/Apr/12 ]

We've been looking at (and considering changes to) our logic for allowing characters in database and collection names, so your observation that NUL can get into a collection name seems worth reporting. Can you file a (yet another) ticket for that? We may have no choice but to impose some rules there. If code A written in programming language X for platform P can create things that code B written in programming language Y for platform Q can't read, we have interop problems.

I am not convinced that we can "fix" this without creating breakage elsewhere, but it may be worth doing anyway.

Comment by Glenn Maynard [ 04/Apr/12 ]

Drivers shouldn't be trying to parse textual strings at all. That's a minefield--strings like "dup key:" might show up in the index names, or other strange scenarios that (strange or not) shouldn't break drivers. Textual error messages are for humans.

If drivers want access to specific error details, the data should be exposed to getLastError properly, eg:

{
'code': 11000,
'err': u'E11000 duplicate key error index: test.test.$x_1 dup key: { : 1 }',
'conflicting_index_name': 'test.test.$x_1',
'conflicting_key_path': 'x',
'conflicting_key':

{'x': 1}

,
'conflicting_id': ObjectId(...)
}

I've wanted this sort of thing myself, eg. to be able to tell which document conflicted, and what caused the conflict.

(As an aside, '\0 characters even seem to be allowed in collection names! I'm guessing that's not actually intended...)

Comment by Tad Marshall [ 04/Apr/12 ]

Hi Glenn,

Thanks for the report.

Here is your "debugging with submitter" bit. I am the assignee for your report and I need to clarify what "good behavior" would be in this case.

Clearly, printing truncated error messages is "bad behavior". But your comment that "Error message strings from the server should probably be escaped to not contain embedded nulls, so it doesn't need to be dealt with in each driver" is making me uncomfortable.

My concern is that we may introduce ambiguity into error strings if we try to "escape" them. We can tell the difference between "y\0000a" and "y
0000a" if we know that this escaping has been applied, but existing drivers are under no obligation to understand new rules that we impose.

I'm afraid that my "ideal" solution involves changes to all drivers so that they (in the new version) would expect escaping in (some?) text from the server, and yet this "solution" does nothing to address existing drivers that will continue to be used (and mostly work) after we make this change to the server.

What do you think would be the solution most compatible with existing code (drivers, apps) and making this work right in the shell?

There may be tradeoffs here, but since you noticed this perhaps you have thoughts on what should be traded off against what.

Thanks!

Tad

Comment by Eliot Horowitz (Inactive) [ 01/Apr/12 ]

No - that means its assigned to someone to try and reproduce.

Comment by Glenn Maynard [ 01/Apr/12 ]

(What does "debugging with submitter" mean? Does this not reproduce for you?)

Generated at Thu Feb 08 03:08:58 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.