[SERVER-56766] External sorter can reuse temp files from previous startups
Created: 07/May/21  Updated: 29/Oct/23  Resolved: 18/May/21

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: 5.0.0
Fix Version/s: 5.0.0-rc0

Type: Bug
Priority: Major - P3
Reporter: Louis Williams
Assignee: Yuhong Zhang
Resolution: Fixed
Votes: 0
Labels: post-rc0
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Backwards Compatibility: Fully Compatible
Operating System: ALL
Sprint: Execution Team 2021-05-17, Execution Team 2021-05-31
Participants:
Linked BF Score: 124

 Description   

Query users of the external sorter ($group, $bucketAuto, etc.) generate unique file names by incrementing a counter that starts at 0 on each startup. When an unfinished index build is resumed during startup, the temporary files left over from the previous startup are not cleared, so a query that spills to disk can end up reusing an existing file.

If this happens, the following crash will occur:

ChecksumMismatch: Data read from disk does not match what was written to disk. Possible corruption of data.
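
The name collision described above can be illustrated with a minimal sketch. The class name `SorterFileNamer` and the `extsort.<n>` naming pattern are hypothetical and chosen only for illustration; this is not the server's actual implementation. It only shows why a per-process counter that restarts at 0 yields the same file name again after a restart.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>

// Hypothetical stand-in for the sorter's file naming (assumption, not server code).
class SorterFileNamer {
public:
    explicit SorterFileNamer(std::string tempDir) : _tempDir(std::move(tempDir)) {
        assert(!_tempDir.empty());
    }

    // Returns the next file name. Names are unique within one process only,
    // because the counter lives in memory and restarts at 0 on every startup.
    std::string nextFileName() {
        return _tempDir + "/extsort." + std::to_string(_counter++);
    }

private:
    std::string _tempDir;
    std::uint64_t _counter = 0;
};
```

Two instances constructed over the same directory, standing in for the process before and after a restart, return the same first name. If the file from the earlier run is still on disk, the later run would pick it up.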

We normally delete the "_tmp" directory at startup. However, this does not happen if any index builds need to be resumed after a clean shutdown:

    // If we did not find any index builds to resume or we are starting up after an unclean
    // shutdown, nothing in the temp directory will be used. Thus, we can clear it.
    if (reconcileResult.indexBuildsToResume.empty() ||
        lastShutdownState == StorageEngine::LastShutdownState::kUnclean) {
        LOGV2(5071100, "Clearing temp directory");
 
        boost::system::error_code ec;
        boost::filesystem::remove_all(storageGlobalParams.dbpath + "/_tmp/", ec);

There are a few possible solutions:

  • On startup, clear everything in the _tmp directory except for the sorter files needed for resumable index builds
  • Provide an option to the external sorter to truncate new files before writing
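
Both options can be sketched roughly as follows using the C++17 standard library. The function names `clearTempDirExcept` and `openTruncated`, and the idea of passing the files to keep as a set of file names, are assumptions made for illustration. This is not the change that was committed for this ticket.

```cpp
#include <cassert>
#include <cstddef>
#include <filesystem>
#include <fstream>
#include <set>
#include <string>
#include <system_error>
#include <vector>

namespace fs = std::filesystem;

// Option 1 (hypothetical helper): remove every top-level entry in tempDir whose
// file name is not listed in `keep`. Returns the number of entries removed.
std::size_t clearTempDirExcept(const fs::path& tempDir, const std::set<std::string>& keep) {
    assert(!tempDir.empty());
    std::error_code ec;
    if (!fs::is_directory(tempDir, ec)) {
        return 0;  // Nothing to clear.
    }

    // Collect first, so the directory is not modified while it is being iterated.
    std::vector<fs::path> stale;
    for (const auto& entry : fs::directory_iterator(tempDir)) {
        if (keep.count(entry.path().filename().string()) == 0) {
            stale.push_back(entry.path());
        }
    }

    std::size_t removed = 0;
    for (const auto& path : stale) {
        fs::remove_all(path, ec);
        if (!ec) {
            ++removed;
        }
    }
    return removed;
}

// Option 2 (hypothetical helper): open a spill file so that any existing
// contents are discarded before the first write.
std::ofstream openTruncated(const fs::path& file) {
    return std::ofstream(file, std::ios::binary | std::ios::out | std::ios::trunc);
}
```

The first option keeps the data needed by resumable index builds while removing stale query spill files at startup. The second leaves stale files in place but ensures that a reused name never exposes old contents to the new writer.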


 Comments   
Comment by Githook User [ 18/May/21 ]

Author: Yuhong Zhang <danielzhangyh@gmail.com> (YuhongZhang98)

Message: SERVER-56766 External sorter can reuse temp files from previous startups
Branch: master
https://github.com/mongodb/mongo/commit/f62706daa8f17a2338b98860b35aac66c376bfb6
