  1. Core Server
  2. SERVER-56766

External sorter can reuse temp files from previous startups

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major - P3
    • Fix Version/s: 5.0.0-rc0
    • Affects Version/s: 5.0.0
    • Component/s: None
    • Backwards Compatibility: Fully Compatible
    • Operating System: ALL
    • Sprint: Execution Team 2021-05-17, Execution Team 2021-05-31
    • 124

      Query users of the external sorter ($group, $bucketAuto, etc.) generate unique file names by incrementing a counter that starts at 0 on every startup. When an unfinished index build is resumed during startup, the temporary files left over from the previous run are not cleared, so a query that spills to disk can be handed a file name that already exists and reuse the stale file.

      If this happens, the following crash will occur:

      ChecksumMismatch: Data read from disk does not match what was written to disk. Possible corruption of data.
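
      The name collision can be reproduced in isolation. The following is a minimal sketch, not the server's sorter code: the class SpillFileNamer, the "extsort-sketch." file prefix, and the helper firstNameCollidesAfterRestart are all hypothetical names chosen for illustration. It only shows that a counter restarting at 0 hands out a name that a previous run already created.

```cpp
#include <filesystem>
#include <fstream>
#include <string>
#include <utility>

namespace fs = std::filesystem;

// Hypothetical stand-in for the sorter's naming scheme: a process-local
// counter that starts at 0 every time the process starts.
class SpillFileNamer {
public:
    explicit SpillFileNamer(fs::path tempDir) : _tempDir(std::move(tempDir)) {}

    // Returns the next "unique" spill file path for this process lifetime.
    fs::path next() {
        return _tempDir / ("extsort-sketch." + std::to_string(_counter++));
    }

private:
    fs::path _tempDir;
    unsigned long long _counter = 0;
};

// Simulates two process lifetimes that share one temp directory which is not
// cleared in between. Returns true if the first name handed out after the
// "restart" already exists on disk.
bool firstNameCollidesAfterRestart(const fs::path& tempDir) {
    fs::create_directories(tempDir);
    {
        // First lifetime: a query spills and leaves its file behind.
        SpillFileNamer beforeShutdown(tempDir);
        std::ofstream stale(beforeShutdown.next());
        stale << "spilled data from the previous run";
    }
    // Second lifetime: the counter starts at 0 again.
    SpillFileNamer afterRestart(tempDir);
    return fs::exists(afterRestart.next());
}
```

      A writer that opens such a path without truncating it would mix new data with the stale contents, which is consistent with the checksum failure shown above.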
      

      We normally delete the "_tmp" directory at startup; however, this does not happen if any index builds need to be resumed after a clean shutdown:

          // If we did not find any index builds to resume or we are starting up after an unclean
          // shutdown, nothing in the temp directory will be used. Thus, we can clear it.
          if (reconcileResult.indexBuildsToResume.empty() ||
              lastShutdownState == StorageEngine::LastShutdownState::kUnclean) {
              LOGV2(5071100, "Clearing temp directory");
       
              boost::system::error_code ec;
              boost::filesystem::remove_all(storageGlobalParams.dbpath + "/_tmp/", ec);
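
      The condition in the excerpt above can be read as a small truth table. The sketch below restates it as a standalone predicate; the function name shouldClearTempDir and the reduced LastShutdownState enum are assumptions made for illustration and are not the server's types.

```cpp
#include <cstddef>

// Simplified stand-in for StorageEngine::LastShutdownState (assumption).
enum class LastShutdownState { kClean, kUnclean };

// Mirrors the startup condition quoted above: the temp directory is cleared
// only when there is nothing to resume or the last shutdown was unclean.
bool shouldClearTempDir(std::size_t numIndexBuildsToResume,
                        LastShutdownState lastShutdownState) {
    return numIndexBuildsToResume == 0 ||
        lastShutdownState == LastShutdownState::kUnclean;
}
```

      The one combination that returns false is a clean shutdown with at least one index build to resume. In that case every file in "_tmp" survives, including spill files that belong to queries rather than to the index build.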
      

      There are a few possible solutions:

      • On startup, clear everything in the _tmp directory except for the sorter files needed for resumable index builds
      • Provide an option to the external sorter to truncate new files before writing
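
      Both options can be sketched with the standard library. This is an illustration under stated assumptions, not the fix that was merged: clearTempDirExcept and openSpillFileTruncated are hypothetical helpers, and the set of file names to keep is assumed to be known from the index builds being resumed.

```cpp
#include <cstddef>
#include <cstdint>
#include <filesystem>
#include <fstream>
#include <set>
#include <string>
#include <system_error>
#include <vector>

namespace fs = std::filesystem;

// Option 1 (sketch): remove every entry in tempDir except the file names
// listed in `keep`. Returns the number of filesystem entries removed.
std::size_t clearTempDirExcept(const fs::path& tempDir,
                               const std::set<std::string>& keep) {
    std::error_code ec;
    if (!fs::is_directory(tempDir, ec)) {
        return 0;
    }
    // Collect the paths first so the directory is not modified mid-iteration.
    std::vector<fs::path> toRemove;
    for (const auto& entry : fs::directory_iterator(tempDir)) {
        if (keep.count(entry.path().filename().string()) == 0) {
            toRemove.push_back(entry.path());
        }
    }
    std::size_t removed = 0;
    for (const auto& path : toRemove) {
        const std::uintmax_t n = fs::remove_all(path, ec);
        if (!ec) {
            removed += static_cast<std::size_t>(n);
        }
    }
    return removed;
}

// Option 2 (sketch): open a spill file so that any existing contents are
// discarded before the first write.
std::ofstream openSpillFileTruncated(const fs::path& file) {
    return std::ofstream(file, std::ios::binary | std::ios::out | std::ios::trunc);
}
```

      The first option keeps stale files from accumulating on disk, while the second only guarantees that a reused name starts out empty. The two are not mutually exclusive.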

            Assignee:
            yuhong.zhang@mongodb.com Yuhong Zhang
            Reporter:
            louis.williams@mongodb.com Louis Williams
            Votes:
            0
            Watchers:
            6
