Investigate and improve error handling for __conn_config_file


    • Type: Task
    • Resolution: Unresolved
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: None
    • Storage Engines - Foundations

      At the end of __conn_config_file, any EINVAL from reading the basecfg gets flagged as data corruption and turned into WT_TRY_SALVAGE:

      err:
          WT_TRET(__wt_close(session, &fh));
      
          /**
           * Encountering an invalid configuration string from the base configuration file suggests
           * that there is corruption present in the file.
           */
          if (!is_user && ret == EINVAL) {
              F_SET_ATOMIC_32(S2C(session), WT_CONN_DATA_CORRUPTION);
              return (WT_ERROR);
          }
      
          return (ret);
      

      This implies that any EINVAL error means the file was broken on disk, but that assumption is no longer true. Several calls on this path can return EINVAL for reasons that are not corruption: for example, __wt_filesize and __wt_read can surface OS-level EINVALs, and __wt_config_check returns EINVAL for conditions like an unknown key or an out-of-range value that the current binary no longer accepts.

      WT-17248 is a recent example: a basecfg written by an older binary with chunk_cache enabled was opened by a newer binary where the field has been removed, which caused __wt_config_check to fail with EINVAL and the mapping to surface it to the user as data corruption. The fix had to insert an explicit pre-check that returns ENOTSUP before __wt_config_check runs, just to keep the mapping from firing.
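
The shape of that kind of pre-check can be sketched roughly as follows. This is only an illustration, not the WT-17248 change itself: the function name precheck_removed_keys, the removed_keys table, and the plain substring match are all assumptions made for the example, and a real check would parse keys properly rather than search the raw text.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical table of keys that the current binary no longer accepts. */
static const char *const removed_keys[] = {"chunk_cache="};

/*
 * precheck_removed_keys --
 *     Hypothetical helper: return ENOTSUP if the configuration text mentions a removed key, and 0
 *     otherwise. A substring search stands in for real key parsing.
 */
static int
precheck_removed_keys(const char *cfg)
{
    size_t i;

    for (i = 0; i < sizeof(removed_keys) / sizeof(removed_keys[0]); ++i)
        if (strstr(cfg, removed_keys[i]) != NULL)
            return (ENOTSUP);
    return (0);
}
```

Running a check like this before validation means the caller sees "not supported" rather than EINVAL, so the corruption mapping at the err label never triggers for this case.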

      This ticket aims to investigate the paths that produce EINVAL in __conn_config_file and make the error reporting more accurate. The likely options are to narrow the mapping to cases that are genuinely parse-level corruption (unbalanced brackets, bad escapes, etc.), or to drop it entirely if the underlying error messages are already actionable enough on their own.
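
One way the narrowing option might look is sketched below, under stated assumptions. The err_source tag, the SKETCH_CORRUPTION value, and map_basecfg_error are hypothetical names invented for this example and do not exist in WiredTiger. The idea is only that the err path would need to know which step produced the EINVAL before deciding to report corruption.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical tag recording which step on the path produced the error. */
typedef enum {
    SRC_IO,      /* file size or read call */
    SRC_PARSE,   /* malformed text: unbalanced brackets, bad escapes */
    SRC_VALIDATE /* well-formed text with an unknown key or out-of-range value */
} err_source;

/* Stand-in for the generic error the current mapping returns. */
#define SKETCH_CORRUPTION (-1)

/*
 * map_basecfg_error --
 *     Hypothetical helper: report corruption only for parse-level EINVALs from the base
 *     configuration file, and pass every other error through unchanged.
 */
static int
map_basecfg_error(int ret, bool is_user, err_source src, bool *corruptp)
{
    *corruptp = false;
    if (!is_user && ret == EINVAL && src == SRC_PARSE) {
        *corruptp = true;
        return (SKETCH_CORRUPTION);
    }
    return (ret);
}
```

With this shape, an EINVAL from validation (the WT-17248 situation) or from the OS would reach the user as-is, while only malformed text in the basecfg would set the corruption flag.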

            Assignee:
            [DO NOT USE] Backlog - Storage Engines Team
            Reporter:
            Jasmine Bi