[SERVER-484] Sparse Indexes (WAS: Add an "ignore missing" option to indexes) Created: 14/Dec/09 Updated: 12/Jul/16 Resolved: 21/Nov/10 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Index Maintenance |
| Affects Version/s: | None |
| Fix Version/s: | 1.7.4 |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Keith Branton | Assignee: | Eliot Horowitz (Inactive) |
| Resolution: | Done | Votes: | 25 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Environment: | All |
| Participants: |
| Description |
|
The proposal is to add another index option that marks an index as excluding any documents whose indexed values are missing. For composite indexes, this would ignore any documents where all the index values are missing. If the index values are present but set to null, I would suggest they should probably be included in the index. This would save considerable space and time when inserting into polymorphic collections where an index is designed to affect only a subset of the collection. It would also provide a large benefit to unique indexes, allowing a unique index to be created where only a percentage of the records in a collection are eligible to participate in the index. Currently it is apparently not possible to add a unique index to a polymorphic collection where many records do not have the field to be indexed. |
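The semantics proposed above can be sketched with a small in-memory model. This is only an illustration under stated assumptions, not MongoDB's implementation: the `SparseIndex` class, the `MISSING` sentinel, and the sample documents are all hypothetical names introduced here.

```python
MISSING = object()  # sentinel meaning "field absent from the document"


class SparseIndex:
    """Toy model of the proposed option (hypothetical, not MongoDB code)."""

    def __init__(self, fields, unique=False):
        self.fields = tuple(fields)
        self.unique = unique
        self.entries = {}  # key tuple -> list of document ids

    def insert(self, doc_id, doc):
        key = tuple(doc.get(f, MISSING) for f in self.fields)
        # Skip the document only when every indexed field is absent.
        if all(v is MISSING for v in key):
            return False
        if self.unique and key in self.entries:
            raise ValueError(f"duplicate key: {key!r}")
        self.entries.setdefault(key, []).append(doc_id)
        return True


idx = SparseIndex(["email"], unique=True)
idx.insert(1, {"name": "anon-1"})            # skipped: no email field
idx.insert(2, {"name": "anon-2"})            # skipped too, so no uniqueness conflict
idx.insert(3, {"email": None})               # indexed: field present but null
idx.insert(4, {"email": "a@example.com"})    # indexed normally
```

In this model a second document with an explicit null `email` would still violate uniqueness, which matches the suggestion that present-but-null values stay in the index.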
| Comments |
| Comment by Scott Hernandez (Inactive) [ 08/Dec/10 ] |
|
Currently this only supports a single field. See http://jira.mongodb.org/browse/SERVER-2193 for compound index support. |
| Comment by auto [ 21/Nov/10 ] |
|
Author: {'login': 'erh', 'name': 'Eliot Horowitz', 'email': 'eliot@10gen.com'} Message: more tests for sparse indexes |
| Comment by auto [ 21/Nov/10 ] |
|
Author: {'login': 'erh', 'name': 'Eliot Horowitz', 'email': 'eliot@10gen.com'} Message: basic sparse indexes working |
| Comment by Remon van Vliet [ 29/Oct/10 ] |
|
Voted on this. Mongo lends itself very well to user profile data and such, but this issue blocks that for lazy registration and the like. Any ETA on this? |
| Comment by Scott Hernandez (Inactive) [ 18/Mar/10 ] |
|
I have added another issue for Partial Indexes: http://jira.mongodb.org/browse/SERVER-785 I think it would be able to take care of most of what this issue needs. The only challenging part would be to create a partial index that excludes "missing" values as well as nulls. |
| Comment by David Tildon [ 18/Mar/10 ] |
|
I'd like to see this expanded to match Postgres' concept of partial indexes: http://www.postgresql.org/docs/8.3/static/indexes-partial.html That way for the case above you could just create a unique index on 'email' where email is not nil. No need for a distinct "ignore missing" option. In my own program I need to routinely delete old accounts where the user never finished registering, so I could create an index on 'created_at' where 'registration_finalized' is false. And so on and so forth. |
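The two partial-index cases described in this comment can be sketched as a predicate-filtered index. This is a rough in-memory illustration only: `PartialIndex`, the predicates, and the sample users are hypothetical and do not reflect any MongoDB or Postgres API.

```python
class PartialIndex:
    """Toy model: only documents matching `predicate` enter the index."""

    def __init__(self, field, predicate, unique=False):
        self.field = field
        self.predicate = predicate
        self.unique = unique
        self.entries = {}  # key -> list of document ids

    def insert(self, doc_id, doc):
        if not self.predicate(doc):
            return False  # document falls outside the index
        key = doc.get(self.field)
        if self.unique and key in self.entries:
            raise ValueError(f"duplicate key: {key!r}")
        self.entries.setdefault(key, []).append(doc_id)
        return True


# Unique index on 'email' where email is not nil.
by_email = PartialIndex("email", lambda d: d.get("email") is not None, unique=True)

# Index on 'created_at' where 'registration_finalized' is false.
unfinished = PartialIndex(
    "created_at", lambda d: d.get("registration_finalized") is False
)

users = [
    {"_id": 1, "created_at": "2010-01-05", "registration_finalized": False},
    {"_id": 2, "created_at": "2010-01-06", "registration_finalized": False},
    {"_id": 3, "created_at": "2010-02-01", "registration_finalized": True,
     "email": "a@example.com"},
]
for user in users:
    by_email.insert(user["_id"], user)
    unfinished.insert(user["_id"], user)
```

Here the two unregistered users never reach the unique `email` index, and only they appear in the `created_at` index used for cleanup.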
| Comment by Chris Hanks [ 05/Jan/10 ] |
|
This would be really, really useful. One of my use cases: For login purposes, I index users based on their email address, which I would expect to be unique. However, I practice lazy registration, and let my users use the app for a while without officially registering and providing their email address - for those users, the email field is simply nil, so I currently can't use a unique index on this field. |
| Comment by Eliot Horowitz (Inactive) [ 18/Dec/09 ] |
|
Sorry to say that comment was wrong and is now removed. |
| Comment by Keith Branton [ 18/Dec/09 ] |
|
Reading more of the documentation, I now believe this is actually a bug, and that there's no need for a new option if index behavior is corrected to match the documentation, which states: "When indexing, documents which do not have the specified field are not included in the index." on this page: http://www.mongodb.org/display/DOCS/Indexes Based on that statement, I believe I should be able to create a collection, add a unique index, and then successfully insert two documents that do not have the indexed field. What happens instead is that I get a duplicate key error complaining about a dup key value of null. I suspect, therefore, that documents that do not have the specified field are, in fact, included in the index. Please explain if I've misunderstood. (And please, please don't change the documentation to match the behavior.) |
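The behavior reported in this comment can be modeled in a few lines. This is a sketch of the observed effect only, assuming that a missing field is keyed as null; `build_unique_index` and `DuplicateKeyError` are hypothetical names, and the message format is illustrative rather than the server's actual error text.

```python
class DuplicateKeyError(Exception):
    pass


def build_unique_index(docs, field):
    """Toy model: a missing field is keyed as null (None), like an explicit null."""
    index = {}
    for position, doc in enumerate(docs):
        key = doc.get(field)  # absent field -> None
        if key in index:
            raise DuplicateKeyError(f"dup key: {{ {field}: {key!r} }}")
        index[key] = position
    return index


docs = [{"name": "first"}, {"name": "second"}]  # neither document has "email"
try:
    build_unique_index(docs, "email")
except DuplicateKeyError as exc:
    message = str(exc)
    print(message)
```

Both documents collapse onto the same null key, so the second one is rejected even though neither carries the indexed field.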
| Comment by Eliot Horowitz (Inactive) [ 14/Dec/09 ] |
|
This would make behavior differ based on indexes, which is generally considered bad. |