[SERVER-5463] Indexing for document keys Created: 30/Mar/12  Updated: 15/Aug/12  Resolved: 02/Apr/12

Status: Closed
Project: Core Server
Component/s: Index Maintenance
Affects Version/s: None
Fix Version/s: None

Type: New Feature Priority: Major - P3
Reporter: Glenn Maynard Assignee: Unassigned
Resolution: Duplicate Votes: 0
Labels: indexing
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Participants:

 Description   

(copied from http://groups.google.com/group/mongodb-user/browse_thread/thread/8d6f4fe174895fda)

Given a collection like this:

db.player = {
  name: "Glenn",
  scores: {
    ping_pong: { points: 2000 },
    golf: { points: 100 }
  }
}

I want an index on each of the keys of scores, so I can efficiently, e.g., find and sort on 'scores.ping_pong.points'. However, there are too many keys to create separate indexes on 'scores.ping_pong', 'scores.golf', and so on; there may be hundreds of possible keys.

It would help if it were possible to create an index on 'scores.$key.points', which would effectively be a compound index on (key name, points), so this type of structure can be indexed without creating an index for every possible key. That way, a single index would work for both find({'scores.ping_pong.points': 100}) and find({'scores.golf.points': 100}). In the above data, two index entries would be created: one on ('ping_pong', 2000) and the other on ('golf', 100).
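To make the proposed expansion concrete, here is a minimal sketch in plain JavaScript of how one document could be turned into (key name, value) index entries. The function name expandKeyPattern and the '$key' path syntax are illustrative assumptions taken from this request; neither exists in MongoDB.

```javascript
// Hypothetical helper, not a MongoDB API. Emulates the index entries that
// a 'scores.$key.points' index would hold for a single document.
function expandKeyPattern(doc, pattern) {
  const parts = pattern.split(".");
  const wildcardAt = parts.indexOf("$key");
  if (wildcardAt === -1) throw new Error("pattern has no $key segment");

  // Walk down to the subdocument whose keys are being indexed.
  let parent = doc;
  for (const part of parts.slice(0, wildcardAt)) {
    if (parent === null || typeof parent !== "object") return [];
    parent = parent[part];
  }
  if (parent === null || typeof parent !== "object") return [];

  const suffix = parts.slice(wildcardAt + 1);
  const entries = [];
  for (const [keyName, sub] of Object.entries(parent)) {
    // Follow the rest of the path below the matched key.
    let value = sub;
    for (const part of suffix) {
      value = value !== null && typeof value === "object" ? value[part] : undefined;
    }
    // Keys that lack the trailing path produce no entry.
    if (value !== undefined) entries.push([keyName, value]);
  }
  return entries;
}

const player = {
  name: "Glenn",
  scores: { ping_pong: { points: 2000 }, golf: { points: 100 } },
};
const entries = expandKeyPattern(player, "scores.$key.points");
```

For the sample document, entries holds one pair per key of scores, matching the two index entries described above.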

This could probably also be used to accelerate $exists. A ({'name': 1, 'scores.$key': 1}) index should allow find({name: 'Glenn', 'scores.ping_pong': {$exists: true}}) to be done efficiently.

A compound key of ('x', 'scores.$key.points', 'y') would expand to ('x', 'key name', 'points', 'y'). Similarly, ('a.$key.b', 'b.$key.c') expands to ('first key name', 'b', 'second key name', 'c').



 Comments   
Comment by Glenn Maynard [ 09/Apr/12 ]

Ping - please reopen.

Comment by Glenn Maynard [ 03/Apr/12 ]

I don't think this is the same as SERVER-1248. This feature would require document keys to be included in the index sort (e.g. woCompare(considerFieldName=true)), to allow an 'a.$.c' index to be applied to find({}).sort({'a.b.c': 1}), so the index scales to an unlimited number of distinct keys, and so count() can be performed without an index scan.

The feature requested in SERVER-1248, key wildcards in queries, would need document keys to be excluded.
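The ordering argument can be illustrated with a small in-memory model, assuming index entries of the form [key name, value, document id] sorted by key name first. This is only a sketch of the idea in plain JavaScript: buildIndex and rangeForKey are hypothetical names, and it does not reflect how the server's B-tree is implemented.

```javascript
// Hypothetical in-memory model of an 'a.$.c' index; not MongoDB code.
// Each entry is [keyName, value, docId].
function buildIndex(docs) {
  const entries = [];
  for (const doc of docs) {
    for (const [keyName, sub] of Object.entries(doc.a || {})) {
      if (sub && sub.c !== undefined) entries.push([keyName, sub.c, doc._id]);
    }
  }
  // Order by key name first, then by (numeric) value.
  entries.sort((x, y) =>
    x[0] < y[0] ? -1 : x[0] > y[0] ? 1 : x[1] - y[1]);
  return entries;
}

// Entries for one key form a contiguous, already-sorted range.
// A linear filter is used here for brevity; a B-tree would seek to the bounds.
function rangeForKey(index, keyName) {
  return index.filter(([k]) => k === keyName);
}

const docs = [
  { _id: 1, a: { b: { c: 30 }, d: { c: 5 } } },
  { _id: 2, a: { b: { c: 10 } } },
  { _id: 3, a: { d: { c: 7 }, b: { c: 20 } } },
];
const index = buildIndex(docs);
const bRange = rangeForKey(index, "b");
const sortedIds = bRange.map((entry) => entry[2]); // order for sort({'a.b.c': 1})
const countWithB = bRange.length;                  // count from the range size
```

Because the key name is part of the sort order, both the sort and the count for a given key come from one range of the index, regardless of how many other keys exist.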

Comment by Eliot Horowitz (Inactive) [ 02/Apr/12 ]

Will try to make SERVER-1248 a bit better so it's clear it's query + index.

Comment by Glenn Maynard [ 02/Apr/12 ]

That's an ugly, hackish workaround. (I mentioned this in the list post, but omitted it here for brevity since it seems obvious. If arrays were a reasonable replacement for subdocuments, then we wouldn't need subdocuments.)

That said, this isn't really a duplicate of SERVER-1248. That one's asking for a query syntax; I'm requesting an index feature. The two would complement each other, of course.

Comment by Eliot Horowitz (Inactive) [ 02/Apr/12 ]

Special case of SERVER-1248

For your specific case, people tend to do:

{
  name : "Glenn" , 
  scores : [
    { name : "ping_pong" , points : 2000 } ,
    { name : "golf" , points : 100 }
  ]
}
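For reference, converting a document from the subdocument layout to this array layout could look like the sketch below. It is plain JavaScript, and scoresToArray is a hypothetical helper name rather than anything provided by MongoDB.

```javascript
// Hypothetical helper: rewrites { scores: { <key>: {...} } } into
// { scores: [ { name: <key>, ... } ] } without mutating the input.
function scoresToArray(doc) {
  const scores = Object.entries(doc.scores || {}).map(
    ([name, fields]) => ({ name, ...fields })
  );
  return { ...doc, scores };
}

const original = {
  name: "Glenn",
  scores: { ping_pong: { points: 2000 }, golf: { points: 100 } },
};
const converted = scoresToArray(original);
```

With the array layout, a single compound index on 'scores.name' and 'scores.points' covers every key, and queries would typically match one array element with $elemMatch.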
