[SERVER-16338] Make it possible to store different data in the same field Created: 26/Nov/14  Updated: 06/Dec/22  Resolved: 24/Jan/17

Status: Closed
Project: Core Server
Component/s: Querying
Affects Version/s: None
Fix Version/s: None

Type: New Feature Priority: Major - P3
Reporter: alexandernst Assignee: Backlog - Query Team (Inactive)
Resolution: Won't Fix Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Assigned Teams:
Query
Participants:

 Description   

What I'm about to suggest could be (imho) a game changer about how we (developers) use databases.

Please note that this feature maybe won't apply/benefit 100% of the cases, but it will surely benefit the vast majority.

What I'm proposing is a way to save different data in a single field and make the database engine return one value or another depending on the scope of the query. Let me explain further...

Translations are really common in web-development. A single product's description (in an e-commerce website) has different translations in multiple languages (english, spanish, italian, french, etc...). It doesn't matter if we're using SQL or NoSQL, we're used to create as many fields as languages we need to support (desc_en, desc_es, desc_it, desc_fr, etc...) or even worse, different tables/documents.

But what if we could store all those different translations in the same field and make MongoDB return a specific translation or if that one is missing, return a "default" one?

Something like:

db.collections.products.find({
    _id: 123456789,
    $scope: {
        try: "spanish",
        catch: "english"
    }
}, { product_name: 1, product_desc: 1 });

That will query the database for the name and the description of the product with _id 12346789, first trying to get the spanish translation of those (scoped) fields, and if any of those fields doesn't have a spanish translation, it will fetch the english one. That way we could end up with a result like this:

{
    name: "T-Shirt",
    desc: "Camiseta de verano, 100% algodon"
}

The same way we could insert different translations in a single field using that same syntax (setting the "scope" of the field).

This could be used in translations (as my example shows), different metric units (car top speed -> km/h vs mph), different data representation (value of color -> human vs hex vs rgba), etc...

The possibilities are endless, and it will improve a lot how we currently deal with those kind of problems.

As an extra, it would be nice to "query the scopes" of a document, so we can know that the document containing a product does actually contain the english, spanish, italian and french "scopes" (translations).



 Comments   
Comment by Asya Kamsky [ 24/Jan/17 ]

alexandernst sorry I never saw your follow-up question on this. The answer is that you can do what you are asking using aggregation framework. You would use a $project or $addField stage with $ifNull to do something like:

db.coll.aggregate({$project:{name:{$ifNull:["$name.es-US", "$name.en-US"]}}})  

That says if there is nothing in "name.es-US" field then use the contents of field "name.en-US"

Comment by Ian Whalen (Inactive) [ 24/Jan/17 ]

Hi Alex, thanks a lot for filing this feature request. However, given that there is a workaround/way to implement the desired functionality we don't plan on implementing it.

Comment by alexandernst [ 15/May/16 ]

@Asya Kamsky Sorry for spamming this for support, but since you REed pretty fast in the first place, you might be able to clarify a related question I have. Would it be possible to do the same query, but with a field that looks like:

name: {
    'en-US': 'Foo',
    'es-ES': 'Bar'
}

Question keeps being the same. Is it possible to query "name" with a given language (for example, `es-ES`), but if that field doesn't exist, return the (guaranteed to exist) `en-US` field.
That way, the result would look like:

name: 'Bar'

Comment by Asya Kamsky [ 28/Nov/14 ]

It's the positional operator for projection - it allows specifying that you want to return the first member of the array that matched the query.

Comment by alexandernst [ 28/Nov/14 ]

This indeed looks just like what I'm proposing, with the only difference that what I'm proposing would be a little bit more hidden.

Btw, what does the '$' mean in the second condition in your queries? (the

{"description.$":1}

thing)

Comment by Asya Kamsky [ 28/Nov/14 ]

alexandernst very interesting idea, but I think you can already do that by using the rich document structure:

> db.products.findOne()
{
	"_id" : 1,
	"description" : [
		{
			"scope" : "latin",
			"text" : "this is product description in latin"
		},
		{
			"scope" : "esperanto",
			"text" : "this is product description in esperanto"
		},
		{
			"scope" : "default",
			"text" : "this is product description"
		}
	]
}
> db.products.find({"description.scope":{$in:["default","latin"]}},{"description.$":1})
{ "_id" : 1, "description" : [ { "scope" : "latin", "text" : "this is product description in latin" } ] }
> db.products.find({"description.scope":{$in:["default","french"]}},{"description.$":1})
{ "_id" : 1, "description" : [ { "scope" : "default", "text" : "this is product description" } ] }
> db.products.distinct("description.scope")
[ "default", "esperanto", "latin" ]

The advantage of doing it in your own application or middle layer is it will be infinitely flexible where if the DB layer implements it, you would be locked into that exact implementation with no flexibility.

Generated at Thu Feb 08 03:40:42 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.