The 'valid' flag response for the validate cmd w/ repair can be inaccurate in some cases when a duplicate record is deleted

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: None
    • Storage Execution
    • Execution Team 2023-05-01

      The validate command, when run in repair mode, can report an incorrect 'valid' value (true or false) when a duplicate record is deleted. Deleting a duplicate record implicitly deletes its index entries, which leaves the index entry counts held in the validate state out of date. 'valid' can then be set to false by the checks that compare the number of index entries against the number of records for a given index type.
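The count drift can be illustrated with a toy model. This is only a sketch, not the server code: `ValidateState`, `keys_per_index`, and `counts_consistent` are hypothetical names, and the check assumes the simplest case of a regular index that should hold exactly one key per record.

```python
from dataclasses import dataclass, field


@dataclass
class ValidateState:
    """Hypothetical stand-in for the counters validate keeps while scanning."""
    num_records: int
    keys_per_index: dict[str, int] = field(default_factory=dict)

    def counts_consistent(self) -> bool:
        # Assumed simplified rule: every index holds one key per record.
        return all(n == self.num_records for n in self.keys_per_index.values())


# Two records share a=1. The duplicate has an "_id_" entry but no "a_1" entry.
state = ValidateState(num_records=2, keys_per_index={"_id_": 2, "a_1": 1})

# Repair deletes the duplicate. Only the record count is updated, although the
# deletion also removed the duplicate's key from "_id_".
state.num_records -= 1
stale_result = state.counts_consistent()  # "_id_" is still counted as 2

# Once the implicit removal is reflected in the counts, the check passes.
state.keys_per_index["_id_"] -= 1
corrected_result = state.counts_consistent()
```

In this model `stale_result` is False even though the data is consistent after the repair, which is the kind of incorrect 'valid' response described above.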

      A few additional count-tracking inaccuracies in the code were fixed in the PR referenced in the comments. However, identifying which indexes are implicitly affected by a record deletion is a tricky problem that requires significant refactoring of code carrying significant tech debt. The PR fixes alone would only shift the cases in which 'valid' is incorrectly true or false; they do not fully address the bug.

      When missing index entries are identified as duplicate documents in validation repair mode, the duplicate document is deleted from the collection and moved to a local lost and found. deleteDocument calls _unindexKeys in index_catalog_impl to remove the deleted record from every index that contains it. When a duplicate document is missing from an index, we want to ensure that the matching index key of the original document is not unindexed. The index that the duplicate document is missing from should be unchanged when the duplicate is deleted from the collection. This part will be done in SERVER-50081.

      For validation repair mode, we want to know which indexes the record was removed from and update the corresponding index key counts accordingly.
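One possible shape for this is to have the deletion path report which indexes it touched so that the caller can adjust its counts. The sketch below is a toy model under stated assumptions: `delete_record`, the `(key, record_id)` entry layout, and `key_counts` are hypothetical and do not correspond to deleteDocument or _unindexKeys signatures.

```python
def delete_record(record_id, indexes):
    """Remove record_id's entries from each index; return {index name: keys removed}.

    `indexes` maps an index name to a set of (key, record_id) entries
    (hypothetical layout). Entries are matched on record id, so an equal key
    owned by the original document is left in place.
    """
    removed = {}
    for name, entries in indexes.items():
        stale = {entry for entry in entries if entry[1] == record_id}
        if stale:
            entries -= stale  # in-place set difference mutates the index
            removed[name] = len(stale)
    return removed


indexes = {
    "_id_": {(1, "r1"), (2, "r2")},
    "a_1": {(1, "r1")},  # the duplicate "r2" has no entry here
}
# Counts as validate would have recorded them before the repair.
key_counts = {name: len(entries) for name, entries in indexes.items()}

removed = delete_record("r2", indexes)
for name, num_removed in removed.items():
    key_counts[name] -= num_removed
```

Here only "_id_" is reported as affected, so only its count is decremented, and the "a_1" entry of the original document is untouched.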

            Assignee:
            [DO NOT USE] Backlog - Storage Execution Team
            Reporter:
            Shin Yee Tan
            Votes:
            0
            Watchers:
            7

              Created:
              Updated: