Core Server / SERVER-5290

fail to insert docs with fields too long to index, and fail to create indexes where doc keys are too big

    • Type: Improvement
    • Resolution: Done
    • Priority: Major - P3
    • Fix Version/s: 2.5.5
    • Affects Version/s: None
    • Component/s: Index Maintenance
    • Labels: None

      When writes default to returning errors, we should just fail fast.

      Behavior when a document with an index key is found to exceed the maximum key size (a shell sketch follows this list):

      • insert of a new document with an over-size index key fails with an error message. The document is not inserted.
      • update of an existing document that would produce an over-size index key fails with an error message. The existing document remains unchanged.
      • ensureIndex / reIndex on a collection containing an over-size index key fails with an error message. The index is not created.
      • compact on a collection with an over-size index key succeeds. Documents with over-size keys are not inserted into the index.
      • mongorestore / mongoimport with indexed values that are too large rejects the offending documents; the result is effectively the same as inserting each object individually.
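
      A minimal shell sketch of the insert/update behavior above; the collection
      name db.widgets and the field name are illustrative, not part of the ticket:

      // Build a string longer than the maximum index key size.
      var big = "a"; for (var i = 0; i < 11; i++) big += big; // 2048 chars

      // Inserting a document with an over-size _id fails; nothing is stored.
      db.widgets.insert({ _id : big });
      assert.eq(0, db.widgets.count());

      // An update that would create an over-size key in a secondary index
      // also fails; the existing document is unchanged.
      db.widgets.insert({ _id : 1, name : "short" });
      db.widgets.ensureIndex({ name : 1 });
      db.widgets.update({ _id : 1 }, { $set : { name : big } });
      assert.eq("short", db.widgets.findOne({ _id : 1 }).name);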

      Behavior on secondary nodes:

      • New replica set secondaries will insert the document and build indexes on initial sync, with a warning in the logs.
      • Replica set secondaries will replicate documents inserted on a 2.4 primary, but print an error message in the log.
      • Replica set secondaries will apply updates made on a 2.4 primary, but print an error message in the log.

      OLD DESCRIPTION:
      When inserting a new document, if an indexed field is too long to store in the btree, we don't add the doc to the index, but we do store the doc. This leads to peculiar behaviors (examples below). It would be good to have a mechanism to make these insertions fail with errors and not store the document (we already do this for unique indexes, after all, so programmers using fire-and-forget writes can't really expect the docs to be present if they haven't checked).

      // Create a document with too long an _id, but note that
      // any indexed field can evince this problem.
      var s="a"; for (i=0;i<10;i++) s+=s; print(s.length);
      print(s+=s; s.length);
      db.foo.insert({ _id : s});
      
      // You can find the document if you do a tablescan
      db.foo.find();
      db.foo.count();
      
      // But you can't find the document if you use the _id index.
      db.foo.find().hint({_id:1});
      
      // Looking the document up by _id also fails, because that
      // uses the _id index; it only turns up if you force a tablescan:
      db.foo.find({ _id : s });
      db.foo.find({ _id : s }).hint({ $natural : 1});
      

      The fix for this issue must encompass insert/update as well as failing ensureIndex calls when this condition is violated (similar to a unique index constraint failing).
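
      A sketch of that ensureIndex failure mode, assuming the behavior described
      above; db.bar and the field x are illustrative:

      var big = "a"; for (var i = 0; i < 11; i++) big += big;

      // The insert succeeds because x is not indexed yet.
      db.bar.insert({ x : big });

      // Building the index must now fail, just as a unique index build
      // fails on duplicate keys; no index is created.
      printjson(db.bar.ensureIndex({ x : 1 }));
      printjson(db.bar.getIndexes()); // only the _id index remains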

      We need to think hard about how this will work when a user upgrades a node with indexes built on top of invalid data. When they re-sync a new replica set member, the index creation step would fail.

            Assignee: Eliot Horowitz (Inactive)
            Reporter: Richard Kreuter (Inactive)
            Votes: 2
            Watchers: 19
