SBE: improve stage debugPrint() output


    • Type: Improvement
    • Resolution: Unresolved
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: None
    • Query Execution
    • 3

      Current SBE debugPrint() output, which is also shown by explain(), is not interpretable without the source code. Example sbe::ScanStage::debugPrint() output:

      [0] scan s1 none none none none none [] @"<collUUID>" true false

      1. "s1 none none none none none" is info about which slots actually exist out of all slots that might exist for this stage.
      2. "true false" is info about the scan direction and whether the oplogTs slot exists (it is unclear why this slot is not treated the same as the other slots).

      In #1, the reader needs the source code to discover that each field after the "scan" keyword gives either the SlotId of a slot that exists for some specific value, or "none", meaning the slot in that position does not exist. The position of a given slot is not consistent, however: the first slot that might be printed, for the seek record ID, does not get "none" printed when it does not exist, so a later slot might be at either position N or N+1. E.g. the record slot might be at position 1 or 2, the recordId slot at position 2 or 3, etc., and without the source code the reader has no information about which slot is in which position.

      In #2 the reader needs to know that the "true" means this is a forward scan and the "false" means the oplogTs slot does not exist.
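
      The positional ambiguity described in #1 can be reproduced with a small standalone sketch. This is not the MongoDB code: LegacySlots, printLegacy, the choice of three slots, and their order are all hypothetical, modeled only on the behavior described above, with slots as plain integers.

```cpp
#include <initializer_list>
#include <optional>
#include <sstream>
#include <string>

// Hypothetical stand-in for a scan stage's optional slots (not the real SBE types).
struct LegacySlots {
    std::optional<int> seekRecordId;  // printed only when present
    std::optional<int> record;        // printed as "none" when absent
    std::optional<int> recordId;      // printed as "none" when absent
};

// Mimics the positional style described in the ticket: the seek slot is
// omitted when absent, while every other slot always occupies a position.
std::string printLegacy(const LegacySlots& s) {
    std::ostringstream out;
    out << "scan";
    if (s.seekRecordId) {
        out << " s" << *s.seekRecordId;
    }
    for (const auto& slot : {s.record, s.recordId}) {
        if (slot) {
            out << " s" << *slot;
        } else {
            out << " none";
        }
    }
    return out.str();
}
```

      In this sketch, a stage with only a record slot s1 prints "scan s1 none", while a stage with only a seek slot s1 prints "scan s1 none none". The first field is "s1" in both cases but refers to a different slot, which is the ambiguity the ticket describes.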

      This ticket is to improve the debug output of all SBE stages that use this paradigm for identifying slots. A more readable output could be something like the following, which reports the same information as the above example but in a more easily understood form:

      • [0] scan rec[s1] [] @"<collUUID>" forward

      The approach here is based on the ideas of:

      1. Do not print anything about slots that do not exist. Non-existent slots do not play a part in the stage's operation, so listing them just reduces the signal-to-noise ratio.
      2. For each slot that does exist, print it as "mnemonic[SlotId]" or a similar compact syntax, e.g. above "rec[s1]" means the record slot exists and is slot s1. Suggested mnemonics:
        1. seek - seekRecordId slot
        2. min - minRecordId slot
        3. max - maxRecordId slot
        4. rec - record slot
        5. recId - recordId slot
        6. snapId - snapshotId slot
        7. idxId - indexId slot
        8. idxKey - indexKey slot
        9. idxKeyPat - indexKeyPattern slot
        10. oplogTs - oplogTs slot (treat this slot same as the others instead of having a different, custom treatment)
      3. Print meaningful keywords instead of "true" or "false" for other info. Suggested keywords:
        1. forward, reverse - scan direction
        2. random - print if random cursor is used
        3. lowPriority - print if it is a low priority scan
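
      The three ideas above could be sketched roughly as follows. This is a minimal standalone illustration, not the implementation from the PR mentioned below: ScanDebugInfo and printLabeled are hypothetical names, slots are plain integers, and the "[]" and collection UUID fields from the example are left out for brevity.

```cpp
#include <optional>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical description of one scan stage; slots are plain ints here.
struct ScanDebugInfo {
    // (mnemonic, slot) pairs in a fixed print order; absent slots are nullopt.
    std::vector<std::pair<std::string, std::optional<int>>> slots;
    bool forward = true;
    bool randomCursor = false;
    bool lowPriority = false;
};

std::string printLabeled(const ScanDebugInfo& info) {
    std::ostringstream out;
    out << "scan";
    for (const auto& [mnemonic, slot] : info.slots) {
        if (slot) {  // idea 1: print nothing for slots that do not exist
            out << ' ' << mnemonic << "[s" << *slot << ']';  // idea 2: mnemonic[SlotId]
        }
    }
    out << (info.forward ? " forward" : " reverse");  // idea 3: keywords, not true/false
    if (info.randomCursor) {
        out << " random";
    }
    if (info.lowPriority) {
        out << " lowPriority";
    }
    return out.str();
}
```

      Because every printed slot carries its own mnemonic, the output no longer depends on position, so omitting absent slots cannot shift the meaning of the ones that remain.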

      The only stage debugPrint() methods that currently print "none" (DebugPrinter::kNoneKeyword) are:

      1. ColumnScanStage::debugPrint() - column_scan.cpp
      2. IndexScanStageBase::debugPrintImpl() - ix_scan.cpp
      3. SimpleIndexScanStage::debugPrint() - ix_scan.cpp
      4. ScanStage::debugPrint() - scan.cpp
      5. ParallelScanStage::debugPrint() - scan.cpp

      so these are the primary debug printers needing to be addressed. The other debug printers need to be checked for outputting unlabeled true/false values.
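
      One rough way to check printers for unlabeled values is to scan captured debug strings for bare "none", "true", or "false" tokens, for example from a unit test. The helper below is a hypothetical sketch and not part of the MongoDB codebase; it is deliberately crude and would also flag a legitimate boolean literal that happens to appear as its own token.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Returns the whitespace-separated tokens of a debug string that carry no
// label of their own: a bare "none", "true", or "false".
std::vector<std::string> findUnlabeledTokens(const std::string& debugOutput) {
    std::vector<std::string> hits;
    std::istringstream in(debugOutput);
    std::string token;
    while (in >> token) {
        if (token == "none" || token == "true" || token == "false") {
            hits.push_back(token);
        }
    }
    return hits;
}
```

      Applied to the two example lines in this ticket, the current ScanStage output yields seven flagged tokens, while the proposed output yields none.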

      There is already an implementation of these changes for ScanStage::debugPrint() in commit #40 of PR 10981 for SERVER-74521 at this direct link:

      https://github.com/10gen/mongo/pull/10981/commits/832fbc767fa9c81fd1c9004636b3f59090bf8438

      However, it was removed from that PR in commit #47 to keep the output consistent across stages, since the PR is for a ticket that only updates ScanStage.

      This will be a very quick project with big payoffs in developer productivity and could be triaged as a New Engineer ticket.

      Also, customers can see this "debug" output via explain(), and without the source code they have no way to understand it. This ticket will help improve the customer experience and potentially reduce Support tickets.

      FYI kyle.suarez@mongodb.com martin.neupauer@mongodb.com david.storch@mongodb.com amr.elhelw@mongodb.com 

            Assignee:
            [DO NOT USE] Backlog - Query Execution
            Reporter:
            Kevin Cherkauer (Inactive)