[SERVER-484] Sparse Indexes (WAS: Add an "ignore missing" option to indexes) Created: 14/Dec/09  Updated: 12/Jul/16  Resolved: 21/Nov/10

Status: Closed
Project: Core Server
Component/s: Index Maintenance
Affects Version/s: None
Fix Version/s: 1.7.4

Type: Improvement Priority: Major - P3
Reporter: Keith Branton Assignee: Eliot Horowitz (Inactive)
Resolution: Done Votes: 25
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment: All


 Description   

The proposal is to add another index option that excludes from the index any document whose indexed values are missing. For composite indexes, this would exclude only documents where all the indexed values are missing. I would suggest that values that are present but set to null should probably still be included in the index.

This would save considerable space and time when inserting into polymorphic collections where an index is only designed to affect a subset of the collection.

This would provide a large benefit to unique indexes, allowing a unique index to be created where only a percentage of the records in a collection are eligible to participate in the index. Currently it does not appear to be possible to add a unique index to a polymorphic collection where many records do not have the field to be indexed.
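The proposed semantics can be sketched as a small in-memory model. This is only an illustration of the rule described above, not MongoDB's implementation or API: the class `SparseUniqueIndex`, the `MISSING` sentinel, and `DuplicateKeyError` are hypothetical names invented for the example.

```python
MISSING = object()  # hypothetical sentinel: the field is absent from the document


class DuplicateKeyError(Exception):
    """Raised when the unique index already holds the same key."""


class SparseUniqueIndex:
    """Toy model of a unique index that skips documents lacking its fields."""

    def __init__(self, *fields):
        self.fields = fields
        self.entries = {}  # key tuple -> document

    def insert(self, doc):
        """Index doc if eligible; return True if indexed, False if skipped."""
        key = tuple(doc.get(f, MISSING) for f in self.fields)
        # Skip only when every indexed field is absent. An explicit None
        # (null) counts as a present value and is still indexed.
        if all(value is MISSING for value in key):
            return False
        if key in self.entries:
            raise DuplicateKeyError(key)
        self.entries[key] = doc
        return True


if __name__ == "__main__":
    index = SparseUniqueIndex("email")
    # Two documents without the field: both are skipped, so no conflict.
    print(index.insert({"name": "first"}))
    print(index.insert({"name": "second"}))
    # A document that has the field is indexed as usual.
    print(index.insert({"email": "user@example.com"}))
```

Under this model an explicit null still takes part in uniqueness checks, so a second document with the field set to null would be rejected, while any number of documents that omit the field entirely are accepted.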



 Comments   
Comment by Scott Hernandez (Inactive) [ 08/Dec/10 ]

Currently this only supports a single field. See http://jira.mongodb.org/browse/SERVER-2193 for compound index support.

Comment by auto [ 21/Nov/10 ]

Author: Eliot Horowitz (erh) <eliot@10gen.com>

Message: more tests for sparse indexes SERVER-484
/mongodb/mongo/commit/e869a8e02c1209536edd3af6856084ddd2101274

Comment by auto [ 21/Nov/10 ]

Author: Eliot Horowitz (erh) <eliot@10gen.com>

Message: basic sparse indexes working SERVER-484
/mongodb/mongo/commit/e55fa7eb47873abe3236b3e1686d2a337dd6a86a

Comment by Remon van Vliet [ 29/Oct/10 ]

Voted on this. Mongo lends itself very well to user profile data and such, but this issue blocks that for lazy registration and the like. Any ETA on this?

Comment by Scott Hernandez (Inactive) [ 18/Mar/10 ]

I have added another issue for Partial Indexes: http://jira.mongodb.org/browse/SERVER-785

I think it would be able to take care of most of what this issue needs. The only challenging part would be creating a partial index that excludes "missing" values as well as nulls.

Comment by David Tildon [ 18/Mar/10 ]

I'd like to see this expanded to match Postgres' concept of partial indexes: http://www.postgresql.org/docs/8.3/static/indexes-partial.html

That way for the case above you could just create a unique index on 'email' where email is not nil. No need for a distinct "ignore missing" option.

In my own program I need to routinely delete old accounts where the user never finished registering, so I could create an index on 'created_at' where 'registration_finalized' is false.

And so on and so forth.
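The two cases above can be sketched with a toy in-memory model of a partial index, where a predicate decides which documents are indexed. This is an illustrative assumption about how such an index would behave, not PostgreSQL's or MongoDB's implementation; `PartialIndex` and `DuplicateKeyError` are hypothetical names, and Python's `None` stands in for nil.

```python
class DuplicateKeyError(Exception):
    """Raised when a unique partial index already holds the same key."""


class PartialIndex:
    """Toy model: index `field` only for documents matching `predicate`."""

    def __init__(self, field, predicate, unique=False):
        self.field = field
        self.predicate = predicate
        self.unique = unique
        self.entries = {}  # key -> list of documents with that key

    def insert(self, doc):
        """Index doc if it matches the predicate; return True if indexed."""
        if not self.predicate(doc):
            return False  # document is outside the index entirely
        key = doc.get(self.field)
        if self.unique and key in self.entries:
            raise DuplicateKeyError(key)
        self.entries.setdefault(key, []).append(doc)
        return True


if __name__ == "__main__":
    # Unique index on 'email' where email is not nil.
    emails = PartialIndex(
        "email", lambda d: d.get("email") is not None, unique=True
    )
    print(emails.insert({"name": "lazy user"}))
    print(emails.insert({"email": "user@example.com"}))

    # Index on 'created_at' where 'registration_finalized' is false.
    unfinished = PartialIndex(
        "created_at", lambda d: d.get("registration_finalized") is False
    )
    print(unfinished.insert({"created_at": 1, "registration_finalized": False}))
    print(unfinished.insert({"created_at": 2, "registration_finalized": True}))
```

In this sketch the "ignore missing" behavior is just one possible predicate, which is the point of the suggestion: a general filter makes a dedicated option unnecessary.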

Comment by Chris Hanks [ 05/Jan/10 ]

This would be really, really useful. One of my use cases:

For login purposes, I index users based on their email address, which I would expect to be unique. However, I practice lazy registration, and let my users use the app for a while without officially registering and providing their email address - for those users, the email field is simply nil, so I currently can't use a unique index on this field.

Comment by Eliot Horowitz (Inactive) [ 18/Dec/09 ]

Sorry to say that comment was wrong and is now removed.
I do think we can do this, though.

Comment by Keith Branton [ 18/Dec/09 ]

Reading more of the documentation I now believe this is actually a bug, and that there's no need for a new option if index behavior is corrected to match the documentation, which states:

"When indexing, documents which do not have the specified field are not included in the index."

on this page:

http://www.mongodb.org/display/DOCS/Indexes

Based on that statement, I believe I should be able to create a collection, add a unique index, and then successfully insert two documents that do not have the indexed field. What happens instead is that I get a duplicate key error from the index, complaining about a dup key value of null. I suspect, therefore, that documents that do not have the specified field are, in fact, included in the index.

Please explain if I've misunderstood. (And please, please don't change the documentation to match the behavior.)
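The reported behavior can be modeled in a few lines. This is a toy sketch that assumes, as the error message suggests, that a document without the indexed field is indexed under a null key; it is not the server's code, and `UniqueIndex` and `DuplicateKeyError` are hypothetical names.

```python
class DuplicateKeyError(Exception):
    """Raised when the unique index already holds the same key."""


class UniqueIndex:
    """Toy model of a unique index that indexes a missing field as null."""

    def __init__(self, field):
        self.field = field
        self.keys = set()

    def insert(self, doc):
        # Assumption: an absent field is treated as None (null), so every
        # document gets an index entry whether or not it has the field.
        key = doc.get(self.field)
        if key in self.keys:
            raise DuplicateKeyError(f"dup key: {key!r}")
        self.keys.add(key)


if __name__ == "__main__":
    index = UniqueIndex("email")
    index.insert({"name": "first"})  # indexed under the null key
    try:
        index.insert({"name": "second"})  # also maps to the null key
    except DuplicateKeyError as err:
        print("second insert rejected:", err)
```

If documents without the field were truly left out of the index, as the quoted documentation says, the second insert in this model would succeed.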

Comment by Eliot Horowitz (Inactive) [ 14/Dec/09 ]

This would make behavior differ based on indexes, which is generally considered bad.
