[SERVER-83508] Race between watchdog and FCBIS deleting old storage files Created: 21/Nov/23  Updated: 08/Feb/24

Status: In Code Review
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Huayu Ouyang Assignee: Huayu Ouyang
Resolution: Unresolved Votes: 0
Labels: repl-shortlist
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Depends
Problem/Incident
Related
related to SERVER-83510 Audit FCBIS for potential issues arou... Open
is related to SERVER-66444 File Copy Based Initial Sync is incom... Closed
Assigned Teams:
Replication
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v7.0, v6.0
Steps To Reproduce:

Start an initial sync node with FCBIS and the watchdog both enabled, then hang the node right after it deletes the old storage files.

Sprint: Repl 2024-01-08, Repl 2024-01-22, Repl 2024-02-05, Repl 2024-02-19
Participants:
Linked BF Score: 164

 Description   

When we start the watchdog, we add each directory to be checked to a list, storing only the file path of the directory. For example, the journal directory may be added this way. The watchdog then runs in a loop every x seconds and checks each directory in the list. As part of FCBIS, we switch the storage location. The watchdog should be interrupted at that point, but as part of SERVER-66444 it simply continues with the loop when that happens, and the list of directories to be checked stays the same. FCBIS then deletes the existing local files used by WT, including the journal directory. On a later pass, the watchdog tries to open files in the now-deleted directory, which results in an fassert.
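
The interleaving described above can be sketched as a small, single-threaded simulation. This is only an illustration and not the server's code: the Watchdog class, check_once, simulate_race, the probe file name, and the directory layout are all hypothetical, and a Python exception stands in for the fassert.

```python
import os
import shutil
import tempfile


class Watchdog:
    """Toy stand-in for the watchdog (hypothetical, not the server's class)."""

    def __init__(self, directories):
        # Only the directory paths are captured, once, at startup.
        self.directories = list(directories)

    def check_once(self):
        # One pass of the periodic loop: write and read back a probe file
        # in every registered directory.
        for directory in self.directories:
            probe = os.path.join(directory, "watchdog_probe.txt")
            with open(probe, "w") as f:
                f.write("ping")
            with open(probe) as f:
                assert f.read() == "ping"
            os.remove(probe)


def simulate_race():
    root = tempfile.mkdtemp()
    try:
        old_dbpath = os.path.join(root, "old_dbpath")
        old_journal = os.path.join(old_dbpath, "journal")
        os.makedirs(old_journal)

        watchdog = Watchdog([old_journal])
        watchdog.check_once()  # succeeds while the directory still exists

        # Simulated FCBIS step: switch to a new storage location, then
        # delete the old files. The watchdog's list is never updated.
        os.makedirs(os.path.join(root, "new_dbpath", "journal"))
        shutil.rmtree(old_dbpath)

        try:
            watchdog.check_once()  # next pass uses the stale path
        except FileNotFoundError:
            # The real server would hit an fassert at this point.
            return "check failed: stale directory"
        return "check passed"
    finally:
        shutil.rmtree(root, ignore_errors=True)


print(simulate_race())
```

In this sketch the second pass fails because the watchdog still holds the path of the deleted journal directory. The simulation runs the steps in a fixed order. In the server the failure depends on timing between the watchdog loop and the FCBIS deletion, which is why the reproduction steps hang the node after the old files are deleted.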



 Comments   
Comment by Githook User [ 08/Feb/24 ]

Author:

{'name': 'Huayu Ouyang', 'email': 'huayu.ouyang@mongodb.com', 'username': 'huayu-ouyang'}

Message: SERVER-83508 Fix race between watchdog and FCBIS deleting old storage files

GitOrigin-RevId: d1f4d26d0141d5bec652179faa47640ef4ad87d8
Branch: master
https://github.com/mongodb/mongo/commit/a44e4b8a4875da0eee5f11b835118a17ba448675

Comment by Githook User [ 07/Feb/24 ]

Author:

{'name': 'Uladzimir Makouski', 'email': 'uladzimir.makouski@mongodb.com', 'username': 'umakouski'}

Message: Revert "SERVER-83508 Fix race between watchdog and FCBIS deleting old storage files"

This reverts commit dc9cd11dfa5b1515656e7f96a0a7fef4c6aae261.

GitOrigin-RevId: 27b4a0b9983696f0792aaf8c6064ae1ff457ee77
Branch: master
https://github.com/mongodb/mongo/commit/deefe379c9baedaac2553456f684e92ed6e0a95a

Comment by Githook User [ 06/Feb/24 ]

Author:

{'name': 'Huayu Ouyang', 'email': 'huayu.ouyang@mongodb.com', 'username': 'huayu-ouyang'}

Message: SERVER-83508 Fix race between watchdog and FCBIS deleting old storage files

GitOrigin-RevId: dc9cd11dfa5b1515656e7f96a0a7fef4c6aae261
Branch: master
https://github.com/mongodb/mongo/commit/2fb550f1805d575224ae2c1f2860fd9819a64b56
