[SERVER-9347] Allow sparse multi-key entries in a compound index Created: 13/Apr/13 Updated: 04/Nov/15 Resolved: 04/Nov/15 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Index Maintenance |
| Affects Version/s: | 2.2.3 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor - P4 |
| Reporter: | James Blackburn | Assignee: | Unassigned |
| Resolution: | Duplicate | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Operating System: | ALL | ||||
| Description |
|
This is similar to
Given the following GridFS-like schema:
Files are large, and we support multiple versions of files. We therefore chunk files and SHA-hash the chunks so that we save disk space when most of the file hasn't changed. The chunks collection has a trivial unique index:
'symbol' / file name can be used for sharding (not strictly necessary). For every version of a file, ('parent', 'chunk') must be unique.
Now this works great: it's fast, you can easily slice out ranges of the file, and it provides version control and space savings when most data stays the same between versions.
However, a problem arises when you try to delete. If the parents array becomes empty for more than one chunk-version, the unique constraint is violated, because (null, 'chunk') can then occur more than once. For example:
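As a rough illustration of the failure (a simplified model, not the reporter's original example), the Python sketch below imitates how a compound multi-key index expands a document into keys. The field names `parent` and `chunk`, the `_id` values, and both helper functions are assumptions made for the sketch; the real server stores an internal null/undefined value where this model uses `None`.

```python
from itertools import product


def index_keys(doc, fields):
    """Expand a document into compound index keys, one per array element.

    Simplified model: a missing field or an empty array contributes a single
    None key, standing in for the server's internal null/undefined value.
    """
    per_field = []
    for name in fields:
        value = doc.get(name)
        if isinstance(value, list):
            per_field.append(value if value else [None])
        else:
            per_field.append([value])
    return set(product(*per_field))


def find_unique_violations(docs, fields):
    """Return (key, first_id, second_id) for every key claimed by two documents."""
    seen = {}
    violations = []
    for doc in docs:
        for key in index_keys(doc, fields):
            if key in seen:
                violations.append((key, seen[key], doc["_id"]))
            else:
                seen[key] = doc["_id"]
    return violations


# Hypothetical data: two stored versions of chunk 0 of the same file.
before = [
    {"_id": "sha-a", "symbol": "FOO", "parent": ["v1"], "chunk": 0},
    {"_id": "sha-b", "symbol": "FOO", "parent": ["v2"], "chunk": 0},
]
# Deleting versions v1 and v2 empties both parent arrays.
after = [dict(doc, parent=[]) for doc in before]

print(find_unique_violations(before, ["parent", "chunk"]))
print(find_unique_violations(after, ["parent", "chunk"]))
```

In this model the first call finds no conflict, while the second reports that both documents claim the key `(None, 0)`, which is the duplicate a unique index rejects.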
It's great that arrays work as multi-key indexes. However it's less great that the empty array is given a special 'undefined' value. I can't see how it's useful for documents in a compound unique index, which contains a multi-key field, to be included when that multi-key field is empty. Certainly sparse could reasonably ignore empty multi-key documents in compound indexes. |
| Comments |
| Comment by Daniel Pasette (Inactive) [ 04/Nov/15 ] | |
|
Partial indexes (coming in 3.2.0) can be used here. Using the following index description, you can ensure that only documents whose parents array is non-empty will be included in the index. There isn't currently a better way to express "non-empty array" in the query language. A possible query language extension would be to add a $size operator.
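A minimal sketch of the idea in Python, under stated assumptions: the field names `parent` and `chunk`, the filter `{"parent.0": {"$exists": True}}` as the way to say "array has a first element", and both helper functions are illustrative guesses rather than the index description originally posted here. The code only models which documents a partial unique index would cover; it does not talk to a server.

```python
# Assumed index definition (illustrative names, not taken from the ticket).
INDEX_KEYS = [("parent", 1), ("chunk", 1)]
INDEX_OPTIONS = {
    "unique": True,
    "partialFilterExpression": {"parent.0": {"$exists": True}},
}


def matches_partial_filter(doc, array_field="parent"):
    """Model of the assumed filter: true only if the array has a first element."""
    value = doc.get(array_field)
    return isinstance(value, list) and len(value) > 0


def partial_unique_violations(docs, array_field="parent", other_field="chunk"):
    """Check uniqueness only over documents that the partial filter admits."""
    seen = {}
    violations = []
    for doc in docs:
        if not matches_partial_filter(doc, array_field):
            continue  # not indexed, so it cannot violate the constraint
        for element in set(doc[array_field]):
            key = (element, doc.get(other_field))
            if key in seen:
                violations.append((key, seen[key], doc["_id"]))
            else:
                seen[key] = doc["_id"]
    return violations


# Hypothetical data: two orphaned chunk-versions and one still in use.
docs = [
    {"_id": "sha-a", "parent": [], "chunk": 0},
    {"_id": "sha-b", "parent": [], "chunk": 0},
    {"_id": "sha-c", "parent": ["v3"], "chunk": 0},
]
print(partial_unique_violations(docs))
```

With a driver such as PyMongo, a definition of this shape would typically be passed as `collection.create_index(INDEX_KEYS, **INDEX_OPTIONS)`; treat that call as an untested assumption for your driver and server version.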
Since this can be expressed with partial indexes and we don't want to make a backwards breaking change to indexing semantics, I'm going to close this as a duplicate of the partial indexes ticket. Please comment if you disagree. | |
| Comment by Matt DeKrey [ 27/Jul/13 ] | |
|
I encountered this issue as well while using the C# driver, but wrote a work-around. Pseudo code:
I hope this helps anyone else who runs across this issue. |