[SERVER-7326] "validate" command should perform full index validation Created: 11/Oct/12  Updated: 05/Dec/16  Resolved: 29/Apr/16

Status: Closed
Project: Core Server
Component/s: Admin, Index Maintenance
Affects Version/s: None
Fix Version/s: 3.3.6

Type: New Feature Priority: Major - P3
Reporter: Tad Marshall Assignee: Robert Guo (Inactive)
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
depends on SERVER-23050 Add full index validation for the _id... Closed
Documented
is documented by DOCS-9572 Docs for SERVER-7326: "validate" comm... Closed
Duplicate
is duplicated by SERVER-10673 running validate() on index as namesp... Closed
is duplicated by SERVER-23051 Add full index verification dbtests Closed
is duplicated by SERVER-23052 add full index verification for spars... Closed
is duplicated by SERVER-23054 add full index verification for parti... Closed
is duplicated by SERVER-23056 add full index verification to compou... Closed
is duplicated by SERVER-23057 add full index verification to multik... Closed
Related
related to SERVER-23740 validate() should check index key ord... Closed
is related to SERVER-14584 BtreeLogic::Builder generates an inva... Closed
is related to SERVER-19521 validate should check consistency of ... Closed
is related to SERVER-9488 Validate should check more than logic... Closed
Backwards Compatibility: Fully Compatible
Sprint: Query 2.7.8, TIG 10 (02/19/16), TIG 11 (03/11/16), TIG 12 (04/01/16), TIG 13 (04/22/16), TIG 14 (05/13/16)
Participants:

 Description   

If a user tries to use the validate command to validate an index, it sort of works, but it generates errors in the server log and incorrectly reports that the index has failed validation.

By "sort of works", I mean that it correctly lists the extents used by the index and correctly reports the datasize and last extent size. It incorrectly reports the number of btree buckets as the number of records. It also generates multiple errors logged on the server and incorrectly provides "advice" : "ns corrupt, requires repair".

It is easy to know that the specified collection is actually an index and to validate it as an index instead of as a normal collection. The validate() command (without { full : true }) should return correct and useful information.

The command "db.indexName.validate(true)" should validate that the index entries actually point to valid records. We currently have no function that does this, and it is not done by "db.collection.validate(true)", presumably because it would be an expensive operation. This would be a very useful addition to the set of available diagnostic features.



 Comments   
Comment by Robert Guo (Inactive) [ 29/Apr/16 ]

Documentation Changes:

validate() is documented on the page: https://docs.mongodb.org/manual/reference/command/validate/

For the description of full: true, we could expand it to something like "new for 3.4: Provide cross validation between indexes and the data in the collection, as well as additional checks on indexes and data".

The current description is: "provides a more thorough scan of the data."

Comment by Robert Guo (Inactive) [ 29/Apr/16 ]

Most of the work for validate() has been done as part of this ticket and SERVER-23055. So I'm going to mark it as resolved.

The one piece remaining is jason.rassi's comment about verifying index key ordering, which will be tracked in SERVER-23740.

Comment by Githook User [ 14/Apr/16 ]

Author: Robert Guo (guoyr) <robert.guo@10gen.com>

Message: SERVER-7326 disable validation hook on fuzzer suites
Branch: master
https://github.com/mongodb/mongo/commit/5025aae289bbf1c21972e420832c9e08f0c18815

Comment by Githook User [ 14/Apr/16 ]

Author: Robert Guo (guoyr) <robert.guo@10gen.com>

Message: SERVER-7326 Add full validation of all index types
Branch: master
https://github.com/mongodb/mongo/commit/3fcc1b6160866a0a1874b9583a4cb129622cc6a2

Comment by J Rassi [ 23/Jul/14 ]

Converted ticket to feature request.

The "validate" command should contain functionality to perform the following sanity checks for each index in a collection:

  • Verify that the indexed key data is in the correct sorted order.
  • Verify that for each index leaf, there exists a document at the respective diskloc, and the document contains the correct value for the indexed field(s).
  • Verify that for each document in the collection, an index lookup on each indexed value returns the correct diskloc of the document (and no duplicates are returned).
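
For illustration only, the three checks above can be sketched against a toy in-memory model. This is not the server's implementation. The names validate_index, collection, index_entries and field are hypothetical. The sketch assumes a single-field index stored as a list of (key, record id) pairs, an integer record id standing in for the diskloc, and that every document contains the indexed field.

```python
from collections import Counter


def validate_index(collection, index_entries, field):
    """Return a list of problems found; an empty list means the index passed.

    collection    -- dict mapping record id (stand-in for a diskloc) to a document (dict)
    index_entries -- list of (key, record_id) tuples, expected to be sorted
    field         -- name of the indexed field
    """
    errors = []

    # Check 1: indexed key data must be in sorted order.
    for prev, cur in zip(index_entries, index_entries[1:]):
        if prev > cur:
            errors.append(f"out of order: {prev} precedes {cur}")

    # Check 2: every index entry must point at an existing document
    # whose indexed field holds the same value as the index key.
    for key, rid in index_entries:
        doc = collection.get(rid)
        if doc is None:
            errors.append(f"entry ({key!r}, {rid}) points to a missing record")
        elif doc.get(field) != key:
            errors.append(f"entry ({key!r}, {rid}) does not match document value {doc.get(field)!r}")

    # Check 3: every document must be reachable through the index
    # exactly once (no missing entries, no duplicates).
    counts = Counter(index_entries)
    for rid, doc in collection.items():
        found = counts[(doc.get(field), rid)]
        if found == 0:
            errors.append(f"record {rid} has no index entry")
        elif found > 1:
            errors.append(f"record {rid} has {found} duplicate index entries")

    return errors


if __name__ == "__main__":
    docs = {1: {"a": 10}, 2: {"a": 20}, 3: {"a": 30}}
    print(validate_index(docs, [(10, 1), (20, 2), (30, 3)], "a"))
    for problem in validate_index(docs, [(20, 2), (10, 1), (10, 1), (40, 9)], "a"):
        print(problem)
```

Check 3 uses a simple count rather than a real index lookup so that it still reports sensibly when check 1 has already found the entries out of order.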
Generated at Thu Feb 08 03:14:12 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.