|
The DbCheckHasher class currently implements the hashing logic for the data consistency check in dbCheck. Since it already has access to each document by walking the _id index, we will implement the missing index keys check within the DbCheckHasher.
The algorithm for the missing index keys check will be:
* The primary iterates through the record store from a given start point until it hits a batch boundary (the limit on the total number of bytes or records is reached).
* For each record traversed in the batch, and for each index that we are checking:
- Fetch all index key entries corresponding to the record's indexed field values and record ID.
- Look up each of those index keys in the index table.
- If we do not find an index key entry, report the inconsistency to the health log.
- Also verify that the document complies with the index settings (for example, it does not have multiple index keys if the index is not multikey).
- Add the size of the document and its index keys to the running byte total. If the total exceeds the limit set by a speed parameter, complete the batch and return.
* Secondaries determine batch boundaries from an oplog entry that the primary writes.
- The DbCheckHasher already does this today.
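The per-batch steps above can be sketched as a small, self-contained Python model. This is only an illustration of the control flow, not the DbCheckHasher implementation: `Index`, `expected_keys`, `check_batch`, the health log tuple format, and the use of `len(repr(...))` as a stand-in for document and key sizes are all hypothetical names and simplifications introduced here.

```python
from dataclasses import dataclass, field


@dataclass
class Index:
    """Toy single-field index (hypothetical model, not a MongoDB type)."""
    name: str
    field_name: str
    multikey: bool
    # The "index table": a set of (key_value, record_id) entries.
    entries: set = field(default_factory=set)


def expected_keys(doc, record_id, index):
    """Return the index key entries this document should have in `index`."""
    value = doc.get(index.field_name)
    # An array value yields one key per element (multikey behaviour).
    values = value if isinstance(value, list) else [value]
    return [(v, record_id) for v in values]


def check_batch(records, indexes, start, max_bytes, max_records):
    """Check one batch of records against every index.

    `records` is a list of (record_id, doc) pairs sorted by record_id.
    Returns (health_log, next_start); next_start is None when the
    record store has been exhausted.
    """
    health_log = []
    bytes_seen = 0
    records_seen = 0
    for record_id, doc in records:
        if record_id < start:
            continue
        # Batch boundary: stop once the byte or record limit is reached.
        if records_seen >= max_records or bytes_seen >= max_bytes:
            return health_log, record_id
        for index in indexes:
            keys = expected_keys(doc, record_id, index)
            # Index settings check: several keys require a multikey index.
            if len(keys) > 1 and not index.multikey:
                health_log.append(
                    ("index_settings_violation", index.name, record_id))
            for key in keys:
                if key not in index.entries:
                    health_log.append(("missing_index_key", index.name, key))
                bytes_seen += len(repr(key))  # assumed size measure
        bytes_seen += len(repr(doc))  # assumed size measure
        records_seen += 1
    return health_log, None


# Example: record 2 has no entry in the index table, and record 3 has
# two keys even though the index is not flagged as multikey.
records = [(1, {"a": 1}), (2, {"a": 2}), (3, {"a": [3, 4]})]
index_a = Index("a_1", "a", multikey=False,
                entries={(1, 1), (3, 3), (4, 3)})

first_log, next_start = check_batch(
    records, [index_a], start=1, max_bytes=10**6, max_records=2)
second_log, end = check_batch(
    records, [index_a], start=next_start, max_bytes=10**6, max_records=2)
```

In this sketch the primary would resume the next batch from the returned `next_start`, which is also the boundary a secondary would need to learn from the oplog entry.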
|