[SERVER-3372] Allow indexing fields of arbitrary length Created: 05/Jul/11  Updated: 22/Mar/23  Resolved: 23/Aug/18

Status: Closed
Project: Core Server
Component/s: Index Maintenance, MMAPv1
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Major - P3
Reporter: Mathias Stearn Assignee: Xiangyu Yao (Inactive)
Resolution: Duplicate Votes: 19
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Duplicate
duplicates SERVER-36278 Remove the 1KB index key size limit Closed
is duplicated by SERVER-10749 Query results differ depending on the... Closed
is duplicated by SERVER-6417 duplicate _ids possible when values e... Closed
is duplicated by SERVER-7638 explain() returns different results t... Closed
Related
related to SERVER-12834 Create flag to allow mongod to ignore... Closed
related to SERVER-22078 Remove term list limits for text inde... Closed
related to SERVER-5290 fail to insert docs with fields too l... Closed
is related to SERVER-14791 Mongorestore error recreating index c... Closed
is related to SERVER-2633 getLastError and Btree::insert failure Closed
is related to SERVER-19281 Add index name as a separate field on... Closed
is related to SERVER-30188 Index option to skip keys that are to... Closed
Sprint: Storage NYC 2018-08-27
Participants:
Case:

 Description   

The current index format has a maximum key length limitation and, starting in 2.6, the server prevents inserting new documents that contain indexed fields exceeding this length.



 Comments   
Comment by Eric Milkie [ 12/Feb/18 ]

There are in fact quite a few limitations, including this one, that will be relaxed once the MMAPv1 storage engine is fully retired. We don't want to make the limit configurable for other storage engines right now because it can break mixed storage engine replica sets in a way that is difficult to prevent in code, and one of the easiest ways to move from MMAPv1 to WiredTiger is indeed to use mixed storage engine replica sets.

Comment by David Bartley [ 12/Feb/18 ]

I believe the existing limitation is a holdover from the mmapv1 engine, related to how it stored its B-Tree entries? My understanding from reading the WiredTiger code is that this limit is purely artificial.

Is it possible to just remove this limit for only the WiredTiger engine, with a note of this in the documentation, or at least make the limit configurable for WiredTiger? Obviously if someone rebuilds onto a different engine (i.e. mmapv1) there will be problems, and that should be noted in docs, but it seems unreasonable that MongoDB should be limited by its legacy storage engine.

Comment by Yatish Joshi [ 20/Sep/17 ]

Can someone give an update on when this can be expected?

Comment by Jake Dempsey [ 19/Aug/16 ]

Has there been any movement on this issue? I am really in a tight spot. In our application there are millions of nodes in the tree and querying the tree and subtrees fast is critical. We also perform billions of db transactions on the tree and we have just found the materialized path approach to be the best mechanism for our needs. Is it possible to at least increase the allowed length? Even just doubling the allowed key length would help us immensely.

Comment by Jake Dempsey [ 25/Feb/15 ]

I'd love to see this increased at a minimum. In my current application where we have to build trees with 1M+ nodes, we leverage the materialized path approach and even with using short descriptors we are limited to a tree depth of ~75. We are already approaching 60 levels deep in the tree. This limit is causing us to have to consider a different vendor because in the near future our application will just not work b/c of this limit.
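
To make the depth arithmetic concrete, here is a rough sketch of how a key-size budget translates into a maximum materialized-path depth. The numbers are assumptions for illustration: the 1024-byte budget comes from the "1KB" wording of SERVER-36278, the 12-byte descriptor and 1-byte separator are made up to land near the ~75 figure, and `maxTreeDepth` is a hypothetical helper. Per-key BSON overhead is not modeled.

```javascript
// Hypothetical helper: how many path levels fit under a key-size budget,
// assuming every level costs one descriptor plus one separator.
function maxTreeDepth(limitBytes, segmentBytes, separatorBytes) {
  return Math.floor(limitBytes / (segmentBytes + separatorBytes));
}

// Assumed budget, for illustration only.
const ASSUMED_KEY_LIMIT_BYTES = 1024;

console.log(maxTreeDepth(ASSUMED_KEY_LIMIT_BYTES, 12, 1));     // 78
console.log(maxTreeDepth(2 * ASSUMED_KEY_LIMIT_BYTES, 12, 1)); // 157
```

Under these assumptions the reachable depth scales linearly with the key-size budget, so a doubled limit would roughly double the depth.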

Comment by Mark Callaghan [ 18/Jul/14 ]

Will this work with pluggable engines that impose a smaller limit on max key length? For example InnoDB keys must be less than half of a page.

Comment by Scott Hernandez (Inactive) [ 16/Dec/11 ]

The index-spec has the key/field names. The index entries don't store the field name, it is true.

Comment by Scott Hernandez (Inactive) [ 04/Nov/11 ]

Right now those keys can't be indexed and aren't covered anyway. This is no worse in that case.

Comment by Guanqun Lu [ 04/Nov/11 ]

If this gets implemented, how do we deal with covered indexes? We'd lose the gain from SERVER-192.

Comment by Guanqun Lu [ 21/Oct/11 ]

The indexes don't contain the column names. These column names are stored in namespaces.

Comment by Geoffrey Gallaway [ 10/Oct/11 ]

Is there some way to figure out how many documents in a collection have keys that will exceed the maximum index size? db.collection.find({$where: "this.something.length > 800"}) is very vague. Do indexes contain the column names, etc.?
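
One reason the `$where` check is vague is that `this.something.length` counts UTF-16 code units, not bytes. A minimal sketch of a byte-based check in plain JavaScript follows. It makes several assumptions: the 1024-byte threshold is taken from the "1KB" wording of SERVER-36278, `utf8ByteLength` and `findOversizedValues` are hypothetical helpers, the sample documents are invented, and the actual index key size includes BSON overhead that is not modeled here.

```javascript
// Hypothetical helper: UTF-8 byte length of a string value.
function utf8ByteLength(value) {
  return new TextEncoder().encode(value).length;
}

// Hypothetical helper: documents whose string field exceeds the byte threshold.
function findOversizedValues(docs, field, limitBytes) {
  return docs.filter(
    (doc) =>
      typeof doc[field] === "string" &&
      utf8ByteLength(doc[field]) > limitBytes
  );
}

// Assumed threshold and invented sample data, for illustration only.
const ASSUMED_LIMIT_BYTES = 1024;
const docs = [
  { _id: 1, something: "short" },
  { _id: 2, something: "x".repeat(2000) },
  // 600 characters but 1200 UTF-8 bytes: a length-based check would miss it.
  { _id: 3, something: "\u00e9".repeat(600) },
];

const oversized = findOversizedValues(docs, "something", ASSUMED_LIMIT_BYTES);
console.log(oversized.map((d) => d._id)); // [ 2, 3 ]
```

The third document shows why a character-count threshold such as 800 can be off in either direction for non-ASCII data.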

Comment by Dwight Merriman [ 29/Jul/11 ]

Would need to point to the start of the document in the collection rather than its exact position, as exact field positions move over time when updated.
