[SERVER-29594] "update" can produce documents with dotted field names
Created: 12/Jun/17
Updated: 27/Oct/23
Resolved: 18/Oct/17

Status: Closed
Project: Core Server
Component/s: Querying
Affects Version/s: 3.5.8
Fix Version/s: None

Type: Bug
Priority: Major - P3
Reporter: A. Jesse Jiryu Davis
Assignee: Backlog - Query Team (Inactive)
Resolution: Works as Designed
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
is duplicated by SERVER-32250 v3.6 $setOnInsert may create objects ... Closed
Related
is related to SERVER-24174 Inconsistent rule for storing dotted ... Closed
is related to PYTHON-1291 Test Failure - test_*_with_invalid_ke... Closed
is related to SERVER-29342 CollectionShardingState to support op... Closed
Assigned Teams:
Query
Backwards Compatibility: Fully Compatible
Operating System: ALL
Participants:

 Description   

In MongoDB 3.4.4 with latest PyMongo, this raises an error:

from pymongo import MongoClient
 
db = MongoClient().test
db.collection.insert({'_id': 1, 'hello': 'world'})
db.collection.update({'hello': 'world'},
                     {'_id': 1, 'hello': 'world', 'a.b': 'c'})

The server returns:

{'index': 0, 'code': 57, 'errmsg': u"The dotted field 'a.b' in 'a.b' is not valid for storage."}

In the latest nightly (commit 6fe7250) on macOS, this script succeeds and updates the document. The collection now contains this document:

{ "_id" : 1, "hello" : "world", "a.b" : "c" }



 Comments   
Comment by Nicholas Zolnierz [ 18/Oct/17 ]

Marking this as "Works as Designed" since the server is moving towards relaxing the constraints on field name restrictions (including dots and $-prefixed). The underlying issue is that there's no way to express a query over a dotted field name vs a path to an embedded field, which is being tracked in SERVER-30575. The documentation on field name restrictions also needs to be updated, tracked in DOCS-10896.

Note that as Jesse states in one of the comments, the field name checks in the drivers will still remain. SERVER-30575 will indicate DCN to relax those checks.

Comment by A. Jesse Jiryu Davis [ 29/Jun/17 ]

Currently the shell prevents us from inserting such documents:

> db.test.insert({'a.b': 1})
2017-06-28T20:31:06.342-0400 E QUERY    [thread1] Error: can't have . in field names [a.b] :
DBCollection.prototype._validateForStorage@src/mongo/shell/collection.js:244:1

Circumventing the shell, we can insert a document that appears to have a dotted field name, but in fact it produces a subdocument:

> db.runCommand({insert: 'test', documents: [{'a.b': 1}]})
{ "n" : 1, "ok" : 1 }
> db.test.findOne()
{
	"_id" : ObjectId("595433470acce0c8625b0ac5"),
	"hello" : "world",
	"foo" : {
		"bar" : "baz"
	}
}

PyMongo, meanwhile, prevents such an insert, same as the shell's "insert" helper:

>>> pymongo.MongoClient().db.test.insert({'a.b': 1})
... traceback ...
InvalidDocument: key 'a.b' must not contain '.'
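
For illustration, the kind of client-side key check that the shell helper and PyMongo apply before sending a document might be sketched as follows. This is not PyMongo's or the shell's actual implementation; `find_invalid_keys` is a hypothetical helper, and it assumes all keys are strings and that both dotted and $-prefixed names are rejected at any nesting depth.

```python
def find_invalid_keys(doc):
    """Return every key that contains '.' or starts with '$', at any depth.

    Hypothetical helper for illustration only; assumes keys are strings.
    """
    bad = []
    if isinstance(doc, dict):
        for key, value in doc.items():
            if "." in key or key.startswith("$"):
                bad.append(key)
            # Recurse into embedded documents and arrays.
            bad.extend(find_invalid_keys(value))
    elif isinstance(doc, list):
        for item in doc:
            bad.extend(find_invalid_keys(item))
    return bad


if __name__ == "__main__":
    print(find_invalid_keys({"a.b": 1}))       # dotted key is flagged
    print(find_invalid_keys({"a": {"b": 1}}))  # embedded document is fine
```

A check like this only protects writes that go through the helper; a raw command sent to the server bypasses it, which is why the behaviors above differ.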

The most eccentric among these behaviors, I believe, is the one I originally reported: "update" can produce documents with dotted field names. I vote for Andy's suggestion "to more uniformly enforce the restrictions and to change current internal uses" so that we don't ever let users create such documents.

Comment by Bernie Hackett [ 28/Jun/17 ]

An alternative would be to more uniformly enforce the restrictions and to change current internal uses. This could be done, but would still risk harming users who were intentionally or inadvertently using dotted field names.

Until we can actually support queries on dotted key names, which I wholeheartedly support, I think it makes more sense to more uniformly enforce the restrictions. Fields with dotted key names are a frustrating foot gun, even for experienced users.

Comment by Andy Schwerin [ 20/Jun/17 ]

The behavior of the update path was intentionally relaxed on the 3.5 development branch to allow updates to produce dotted field names. This makes those execution paths consistent with the insert path and the "replacement update" path. While the MongoDB documentation says that MongoDB forbids documents from containing dotted field names, it has always allowed it and many internal features continue to depend on it. Oplog entries, index descriptors and routing table entries all use dotted field names, typically to describe a field path.

I believe that we should consider dotted field names as legal, with the caveat that there is a limitation in the current query languages that prevents one from describing field paths over them. This makes it impossible to query through a dotted field, or to index a dotted field, among other limitations. We should consider lifting those restrictions as feature requests, and schedule it as time allows.
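
The query-language limitation can be illustrated with a simplified model of dotted-path resolution. This is a sketch under stated assumptions, not the server's matcher: `resolve_path` and `matches` are hypothetical names, and array traversal and query operators are ignored. The point is only that a path such as 'a.b' is split on the dot and walked through embedded documents, so a top-level field literally named "a.b" is never reached.

```python
_MISSING = object()  # sentinel distinguishing "no such path" from a stored None


def resolve_path(doc, path):
    """Follow a dot-separated path through nested dicts (arrays ignored)."""
    current = doc
    for part in path.split("."):
        if not isinstance(current, dict) or part not in current:
            return _MISSING
        current = current[part]
    return current


def matches(doc, path, value):
    """Simplified equality match: does the value at `path` equal `value`?"""
    found = resolve_path(doc, path)
    return found is not _MISSING and found == value


if __name__ == "__main__":
    embedded = {"_id": 1, "a": {"b": "c"}}
    dotted = {"_id": 1, "hello": "world", "a.b": "c"}
    print(matches(embedded, "a.b", "c"))  # path reaches the embedded field
    print(matches(dotted, "a.b", "c"))    # literal "a.b" key is not reachable
```

Under this model there is no path string that selects the literal "a.b" key, which is the gap described above.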

An alternative would be to more uniformly enforce the restrictions and to change current internal uses. This could be done, but would still risk harming users who were intentionally or inadvertently using dotted field names. We will need to make a decision before releasing the next stable release about whether to close this ticket "Works As Designed", or to create a plan to enforce the "no dotted paths" restriction consistently.

Comment by A. Jesse Jiryu Davis [ 13/Jun/17 ]

I think that if a user creates a document with a dotted field name they're making a mistake. For one thing, the manual has said for a while that dotted field names are prohibited. For another, such field names aren't queryable in any server version:

from pymongo import MongoClient
 
db = MongoClient().test
db.collection.update({'hello': 'world'},
                     {'_id': 1, 'hello': 'world', 'a.b': 'c'},
                     upsert=True)
 
# Finds {'_id': 1, 'hello': 'world', 'a.b': 'c'}
print(db.collection.find_one())
# Finds None
print(db.collection.find_one({'a.b': 'c'}))

I propose that we continue to prevent users from creating documents with dotted field names. There's no use case for creating dotted field names, so if we let users do so by mistake, then we're letting them shoot themselves in the foot.

Comment by Andy Schwerin [ 13/Jun/17 ]

This was changed intentionally. Since the advent of secondary indexes in MongoDB, certain write paths have allowed the creation of dotted field names, but not others. It seemed easier and more flexible to me to say that it was a bug that we couldn't express field paths over dotted fields than to say it was a bug that some execution paths allowed the introduction of dotted fields.

I'd be happy to discuss the merits of the change further.

Comment by Tess Avitabile (Inactive) [ 12/Jun/17 ]

This appears to have changed in this commit.
