[SERVER-3372] Allow indexing fields of arbitrary length Created: 05/Jul/11  Updated: 19/Jun/18

Status: Open
Project: Core Server
Component/s: Indexing, MMAPv1
Affects Version/s: None
Fix Version/s: 4.1 Required

Type: Improvement Priority: Major - P3
Reporter: Mathias Stearn Assignee: Backlog - Storage Team
Resolution: Unresolved Votes: 18
Labels: None

Issue Links:
is duplicated by SERVER-10749 Query results differ depending on the... Closed
is duplicated by SERVER-6417 duplicate _ids possible when values e... Closed
is duplicated by SERVER-7638 explain() returns different results t... Closed
related to SERVER-12834 Create flag to allow mongod to ignore... Closed
related to SERVER-22078 Consider raising or removing term lis... Open
related to SERVER-5290 fail to insert docs with fields too l... Closed
is related to SERVER-14791 Mongorestore error recreating index c... Closed
is related to SERVER-19281 Add index name as a separate field on... Open
is related to SERVER-2633 getLastError and Btree::insert failure Closed
is related to SERVER-30188 Index option to skip keys that are to... Closed
Epic Link: Remove index key length limit


The current index format has a maximum key length. Starting in 2.6, the server also rejects inserts of new documents that contain indexed fields exceeding this length.

Comment by Dwight Merriman [ 29/Jul/11 ]

would need to point to start of the document in the collection rather than its exact position as exact field positions move over time when updated

Comment by Geoffrey Gallaway [ 10/Oct/11 ]

Is there some way to figure out how many documents in a collection have keys that will exceed the maximum index size? db.collection.find({$where: "this.something.length > 800"}) is very vague. Do indexes contain the column names, etc.?
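
One reason the $where check above is imprecise is that a JavaScript string's .length counts UTF-16 code units, not encoded bytes, while the key length limit is a byte limit. The following is a minimal Node.js sketch of a byte-based check, not an official tool: utf8Bytes, findOversized and the 800-byte threshold (borrowed from the query above) are hypothetical names and values, Buffer is a Node.js API rather than a mongo shell one, and any per-key overhead the index format adds is ignored.

```javascript
// Hypothetical helpers for illustration; not part of the mongo shell or any driver.
// ASSUMED_LIMIT_BYTES mirrors the 800 used in the $where query above and is
// not the server's actual limit.
const ASSUMED_LIMIT_BYTES = 800;

// Number of bytes the value occupies when encoded as UTF-8.
function utf8Bytes(value) {
  return Buffer.byteLength(String(value), "utf8");
}

// Return the documents whose string field exceeds the byte threshold.
function findOversized(docs, field, limit = ASSUMED_LIMIT_BYTES) {
  return docs.filter(
    (doc) => typeof doc[field] === "string" && utf8Bytes(doc[field]) > limit
  );
}

const docs = [
  { _id: 1, something: "a".repeat(700) },      // 700 chars, 700 bytes
  { _id: 2, something: "\u00e9".repeat(700) }, // 700 chars, 1400 bytes
  { _id: 3, something: "a".repeat(900) },      // 900 chars, 900 bytes
];

const oversizedIds = findOversized(docs, "something").map((d) => d._id);
console.log(oversizedIds); // [ 2, 3 ]
```

Document 2 passes a .length > 800 test but fails the byte-based one, which is the kind of case a character count misses.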

Comment by Guanqun Lu [ 21/Oct/11 ]

The indexes don't contain the column names. These column names are stored in namespaces.

Comment by Guanqun Lu [ 04/Nov/11 ]

If this gets implemented, how do we deal with covered indexes? We'd lose the gain from SERVER-192.

Comment by Scott Hernandez [ 04/Nov/11 ]

Right now those keys can't be indexed and aren't covered anyway. This is no worse in that case.

Comment by Scott Hernandez [ 16/Dec/11 ]

The index-spec has the key/field names. The index entries don't store the field name, it is true.

Comment by Mark Callaghan [ 18/Jul/14 ]

Will this work with pluggable engines that impose a smaller limit on max key length? For example InnoDB keys must be less than half of a page.

Comment by Jake Dempsey [ 25/Feb/15 ]

I'd love to see this increased at a minimum. In my current application, where we have to build trees with 1M+ nodes, we leverage the materialized path approach, and even using short descriptors we are limited to a tree depth of ~75. We are already approaching 60 levels deep in the tree. This limit is causing us to consider a different vendor, because in the near future our application will just not work b/c of this limit.
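
The depth ceiling described above follows from simple arithmetic on the key length. The sketch below is illustrative only: it assumes a 1024-byte key limit, a 12-byte descriptor per level and a 1-byte separator, none of which are stated in the comment, and it ignores any per-key overhead. maxDepth is a hypothetical helper.

```javascript
// Assumption: 1024-byte index key limit; per-key overhead is ignored.
const ASSUMED_KEY_LIMIT_BYTES = 1024;

// A path of n segments joined by separators uses
//   n * segmentBytes + (n - 1) * separatorBytes
// bytes, so the largest n that fits is
//   floor((limit + separatorBytes) / (segmentBytes + separatorBytes)).
function maxDepth(segmentBytes, separatorBytes = 1, limit = ASSUMED_KEY_LIMIT_BYTES) {
  return Math.floor((limit + separatorBytes) / (segmentBytes + separatorBytes));
}

console.log(maxDepth(12));          // 78 levels under the assumed limit
console.log(maxDepth(12, 1, 2048)); // 157 levels if the limit were doubled
```

Under these assumptions, the achievable depth scales roughly linearly with the key limit, so doubling the limit would about double it.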

Comment by Jake Dempsey [ 19/Aug/16 ]

Has there been any movement on this issue? I am really in a tight spot. In our application there are millions of nodes in the tree and querying the tree and subtrees fast is critical. We also perform billions of db transactions on the tree and we have just found the materialized path approach to be the best mechanism for our needs. Is it possible to at least increase the allowed length? Even just doubling the allowed key length would help us immensely.

Comment by Yatish Joshi [ 20/Sep/17 ]

Can someone give an update on when this can be expected?

Comment by David Bartley [ 12/Feb/18 ]

I believe the existing limitation is a holdover from the mmapv1 engine, related to how it stored its B-Tree entries? My understanding from reading the WiredTiger code is that this limit is purely artificial.

Is it possible to just remove this limit for only the WiredTiger engine, with a note of this in the documentation, or at least make the limit configurable for WiredTiger? Obviously if someone rebuilds onto a different engine (i.e. mmapv1) there will be problems, and that should be noted in docs, but it seems unreasonable that MongoDB should be limited by its legacy storage engine.

Comment by Eric Milkie [ 12/Feb/18 ]

There are in fact quite a few limitations, including this one, that will be relaxed once the MMAPv1 storage engine is fully retired. We don't want to make the limit configurable for other storage engines right now because it can break mixed storage engine replica sets in a way that is difficult to prevent in code, and one of the easiest ways to move from MMAPv1 to WiredTiger is indeed to use mixed storage engine replica sets.
