[SERVER-2675] Multikeys (indexing keys in a hash) for ranged queries Created: 04/Mar/11  Updated: 06/Dec/22  Resolved: 03/May/21

Status: Closed
Project: Core Server
Component/s: Internal Client
Affects Version/s: features we're not sure of
Fix Version/s: features we're not sure of

Type: New Feature Priority: Minor - P4
Reporter: Mario T. Lanza Assignee: Backlog - Query Optimization
Resolution: Done Votes: 3
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
duplicates SERVER-37188 Rename "All Paths" index to "Wildcard... Closed
Related
related to SERVER-267 Wildcard support in index/query/proje... Backlog
Assigned Teams:
Query Optimization
Participants:

 Description   

Mongo currently supports multikey indexing (where we index the values in an array). I suggest a feature for specifying an index on the keys and values of a given hash. This would be analogous to the common property bag pattern that is sometimes implemented in SQL datastores.

properties (table)
==============
id (int, PK)
entity_id (int, FK)
key (varchar)
value (varchar)

Sample data (dates of birth for several different entities):
INSERT INTO properties (entity_id, key, value) VALUES (101, 'dob', '12/15/1980')
INSERT INTO properties (entity_id, key, value) VALUES (276, 'dob', '11/05/1973')
INSERT INTO properties (entity_id, key, value) VALUES (333, 'dob', '11/05/1944')

Index: key, value

With an index in place we can find everyone born in the 70's.

In Mongo, to accomplish the same you have to add an index directly to the 'dob' attribute of the document. This is undesirable, as our goal is to allow our documents (of any type) to share the same collection. (This is ideal for topic maps, social CMSes and many other use cases, I'm sure.) It would be more useful if we could instead index the keys/values of a hash and perform ranged queries against those keys. Any property added to the property bag would be indexed. Powerful!

INSERT INTO properties (entity_id, key, value) VALUES (101, 'iq', 83)
INSERT INTO properties (entity_id, key, value) VALUES (276, 'iq', 120)
INSERT INTO properties (entity_id, key, value) VALUES (333, 'iq', 103)

Now we've added "IQ" to our entities. We can just as easily perform a greater-than query to grab the smartest people from our collection of whatevers.

I believe this approach is superior to the current multikey approach suggested on:

http://www.mongodb.org/display/DOCS/Using+Multikeys+to+Simulate+a+Large+Number+of+Indexes

The future seems to be all about schema-less DBs. Indexing a hash of keys/values seems totally in tune with what it means to be schema-less. Also, you should be able to use hash indexing as part of a compound index. This would be especially important as one of the indexables would probably be type.

db.whatevers.index({type: 1, attributes: {$hidx: 1}}) // $hidx = hash (key/value) index
db.whatevers.find({'type': 'Person', 'attributes.iq': {$gt: 100}}) // uses index (not scan)
db.whatevers.find({'type': 'Person', 'attributes.dob': {$gte: '1/1/1970', $lte: '12/31/1970'}}) // uses the same index

The document might look like:
{
  type: "Person",
  attributes: { name: "Jason", dob: '12/01/1950', iq: 99 }
}

Attributes expressed in this manner seem more natural than:

{
  type: "Person",
  attributes: [
    {name: "Jason"},
    {dob: '12/01/1950'},
    {iq: 99}
  ]
}

Furthermore, as far as I can tell, the multikey style doesn't support ranged queries.
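
For comparison, a ranged query does appear to be possible in the multikey style if each attribute is stored as an explicit key/value pair. The following is a minimal sketch under assumptions that are not part of this ticket: the {k, v} layout, the ISO-formatted date string, and the helper names createKeyValueIndex and findByAttributeRange are all hypothetical, and `db` is taken to be a mongo shell database handle.

```javascript
// Assumed layout (not the one shown in this ticket): one {k, v} pair per attribute.
const person = {
  type: "Person",
  attributes: [
    { k: "name", v: "Jason" },
    { k: "dob", v: "1950-12-01" }, // ISO string, so string order matches date order
    { k: "iq", v: 99 },
  ],
};

// Hypothetical helper: one compound multikey index covers every attribute name.
function createKeyValueIndex(db) {
  return db.whatevers.createIndex({ "attributes.k": 1, "attributes.v": 1 });
}

// Hypothetical helper: range query on a single named attribute.
function findByAttributeRange(db, key, min, max) {
  // $elemMatch requires the key and the range on the value to match
  // within the same array element.
  return db.whatevers.find({
    attributes: { $elemMatch: { k: key, v: { $gte: min, $lte: max } } },
  });
}
```

In the shell this would be used as createKeyValueIndex(db) followed by, say, findByAttributeRange(db, "iq", 100, 200). The cost is the less natural document shape described above.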

Thanks for reading. MongoDB is a great work!



 Comments   
Comment by Charlie Swanson [ 03/May/21 ]

Hello,

I was just searching through old tickets and noticed that this request has, I believe, already been completed. In 4.2 we released "Wildcard Indexes", which index any attribute with an unpredictable name.

I'm closing this ticket as "Done" because I believe we've met the initial request. Please let me know if there is still something missing from "Wildcard Indexes" that you'd like to see for this use case.
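
As a rough illustration of how the original request maps onto a wildcard index, here is a minimal sketch. The collection and field names are taken from the description above; the helper names createAttributesWildcardIndex and findByIqAbove are hypothetical, and `db` is assumed to be a mongo shell database handle.

```javascript
// Sketch only: `db` is assumed to be a mongo shell database handle.
// Hypothetical helper: index every field path under `attributes`,
// whatever the property is called.
function createAttributesWildcardIndex(db) {
  return db.whatevers.createIndex({ "attributes.$**": 1 });
}

// Hypothetical helper: range predicate on one property of the bag.
function findByIqAbove(db, minIq) {
  return db.whatevers.find({ "attributes.iq": { $gt: minIq } });
}
```

Note that this sketch indexes only the attributes; it does not reproduce the compound {type, attributes} index proposed in the description, since as far as I know wildcard indexes could not be part of a compound index in 4.2.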

Comment by Mario T. Lanza [ 04/Mar/11 ]

One potential issue: I suppose there could be attributes that could be extremely lengthy (e.g. the "body" property of an essay, the equivalent of the SQL text datatype). In this case you might have to limit how many characters are indexed.

db.whatevers.index({type: 1, attributes: {$hidx: 1, $max_chars: 200}})

Of course, this would only apply to text values, and there would be an implied upper limit if the option were omitted.

Also, this might be a better way of expressing the index:

db.whatevers.index({'type': 1, 'attributes.*': 1})
