
Type: New Feature
Resolution: Duplicate
Priority: Major - P3
Fix Version/s: None
Affects Version/s: None
Component/s: Index Maintenance
Labels:
None

Confidence Status:
None
Work Order:
3
CAR Domain/s:
None

Aha! Reference:
None
Tracking Level:
None
Risk Status:
None
Exec Notes:
None
Goal Name(s):
None
Goal Link:
None

This patch was initially posted as a comment on ~~SERVER-2193~~, but that ticket covers a somewhat different topic, so after discussing it with thomasr we agreed it should be an independent ticket. Original comment follows:

Support for sparse indexes with multiple fields is, as far as I can see, implemented in the current code base and has been for a long time (with written tests). There is a uassert in the code that is supposed to prevent a user from creating a sparse index with multiple fields, but it is implemented incorrectly, so it never triggers.

So, if a user creates a sparse index with multiple fields, it will work. The semantics of the current implementation are: "exclude a document from the index if all index fields are missing from the document".

This "mode" of the index might benefit some, but according to many of the wishes in the discussion in this issue and in https://jira.mongodb.org/browse/SERVER-785, the semantics people are looking for are: "only include a document in the index if all index fields are present in the document".

Getting support for this second "mode" of the index is simply a matter of changing numNotFound == _spec._nFields to numNotFound != 0 here: https://github.com/mongodb/mongo/blob/master/src/mongo/db/indexkey.cpp#L429
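To make the difference between the two modes concrete, here is a minimal sketch in plain JavaScript. It is not the server code: only the counter name numNotFound and the two comparisons come from the description above, while the helper names (countMissing, indexedUnderCurrentPolicy, indexedUnderProposedPolicy) are hypothetical, and it assumes top-level fields only (no dotted paths or arrays).

```javascript
// Count how many of the indexed fields are absent from the document.
// A field that is present with value null still counts as present.
function countMissing(doc, fields) {
  return fields.filter((f) => !Object.prototype.hasOwnProperty.call(doc, f)).length;
}

// Current behavior: the document is skipped only when ALL indexed fields
// are missing (numNotFound == number of indexed fields).
function indexedUnderCurrentPolicy(doc, fields) {
  const numNotFound = countMissing(doc, fields);
  return !(numNotFound === fields.length);
}

// Proposed behavior: the document is skipped when ANY indexed field
// is missing (numNotFound != 0).
function indexedUnderProposedPolicy(doc, fields) {
  const numNotFound = countMissing(doc, fields);
  return !(numNotFound !== 0);
}

const fields = ["a", "b"];
const docs = [{ a: 1, b: 2 }, { a: 1 }, { b: 2 }, { c: 3 }];

const current = docs.map((d) => indexedUnderCurrentPolicy(d, fields));
const proposed = docs.map((d) => indexedUnderProposedPolicy(d, fields));

console.log(current);  // [ true, true, true, false ]
console.log(proposed); // [ true, false, false, false ]
```

Under the current condition only { c: 3 } is left out of the index; under the proposed condition only { a: 1, b: 2 } is kept.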

Provided that I have not missed any complicated corner case regarding this, I have the following suggestions:

1. Change the documentation so that it is clear that sparse indexes with multiple fields are supported.
2. Add an additional config parameter that can be used together with the sparse: true option to flip the behavior of the index to the second semantics above.

You can find the code/patch that does this (with test case) here: https://github.com/johanhedin/mongo/commits/SERVER-2193

With this patch you could create an index like this:

db.collection.ensureIndex({ a: 1, b: 1 }, { sparse: true, sparsePolicy: "include" })

and only documents where both a and b are present will be included in the index. If sparsePolicy is left out (the default), the index will work as before. And of course, the name sparsePolicy is just a suggestion.
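As a rough illustration of how the option would affect index size, the following in-memory model builds the key list for a small set of documents. It is a sketch under assumptions, not the patch itself: buildSparseIndexKeys is a hypothetical helper, sparsePolicy is the option name proposed in this ticket (it does not exist in released servers), missing fields are stored as null in the key, and only top-level fields are handled.

```javascript
// Hypothetical in-memory model of index key generation; not the server implementation.
function buildSparseIndexKeys(docs, keyPattern, options = {}) {
  const fields = Object.keys(keyPattern);
  const keys = [];
  for (const doc of docs) {
    const missing = fields.filter((f) => !(f in doc)).length;
    if (options.sparse) {
      // "include" is the proposed policy: skip when any field is missing.
      // Otherwise keep the existing behavior: skip only when all are missing.
      const skip = options.sparsePolicy === "include"
        ? missing !== 0
        : missing === fields.length;
      if (skip) continue;
    }
    // Fields absent from the document are represented as null in the key.
    keys.push(fields.map((f) => (f in doc ? doc[f] : null)));
  }
  return keys;
}

const sample = [
  { _id: 1, a: 1, b: 2 },
  { _id: 2, a: 1 },
  { _id: 3, c: 3 },
];
const keyPattern = { a: 1, b: 1 };

const nonSparse = buildSparseIndexKeys(sample, keyPattern);
const sparseDefault = buildSparseIndexKeys(sample, keyPattern, { sparse: true });
const sparseInclude = buildSparseIndexKeys(sample, keyPattern, { sparse: true, sparsePolicy: "include" });

console.log(nonSparse.length, sparseDefault.length, sparseInclude.length); // 3 2 1
```

In this model the non-sparse index holds three keys, the existing sparse behavior holds two (including one with a null for b), and the proposed policy holds only the fully populated key.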

I have only addressed v1 indexes, but the same seems to be doable for v0 indexes as well if that is desired.

I'm happy to create a pull request if this is something you would consider. For my use case, this would be a HUGE improvement, since we are starting to scale from hundreds of millions of documents to hundreds of billions of documents, and RAM usage for our indexes is a big issue: it costs a lot of money for hardware that just stores "empty" values.

is related to

SERVER-2193 Sparse indexes only support a single field

Closed

related to

SERVER-10403 sparse compound index should really be sparse

Closed

Assignee: hari.khalsa@10gen.com
Reporter: Henrik Ingo (Inactive)
Participants: hari.khalsa@10gen.com, Henrik Ingo
Votes: 5
Watchers: 12

Created: Apr 29 2014 06:54:05 PM UTC
Updated: Apr 17 2015 12:10:10 PM UTC
Resolved: Aug 05 2014 08:15:02 AM UTC
