[SERVER-79245] Unclean shutdown while dropping collection and indexes to resync can make the catalog inconsistent
Created: 24/Jul/23  Updated: 04/Oct/23  Resolved: 28/Sep/23

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug
Priority: Major - P3
Reporter: Gregory Wlodarek
Assignee: Jordi Olivares Provencio
Resolution: Duplicate
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Duplicate
duplicates SERVER-80974 Unclean shutdown while dropping local... Closed
Related
related to SERVER-80974 Unclean shutdown while dropping local... Closed
related to SERVER-81355 Unclean shutdown after untimestamped ... Closed
is related to SERVER-81879 startupRecoveryForRestore can drop ta... Closed
Assigned Teams:
Storage Execution EMEA
Operating System: ALL
Sprint: Execution EMEA Team 2023-09-18, Execution EMEA Team 2023-10-02
Participants:

 Description   

Initial sync drops all tables in all replicated databases without a timestamp before resyncing. As a result, the drop pending ident reaper drops each table in WT the next time it runs. Table drops in WT are non-transactional and cannot be rolled back, so the table is gone from WT even if the corresponding catalog changes are not yet stable/checkpointed. After an unclean shutdown in that window, startup recovery finds that the table no longer exists in WT but still exists in the catalog. The server then tries to query the index table metadata from WT, WT likely returns ENOENT, and the server crashes.
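
The sequence above can be reproduced in a toy model. This is only a sketch to illustrate the ordering problem: `ToyEngine` and all of its methods are hypothetical names, not the server or WT API, and "checkpoint" here just means copying the in-memory catalog into a durable copy.

```python
class ToyEngine:
    """Toy stand-in for the storage layer; hypothetical, not the WT API."""

    def __init__(self):
        self.tables = set()           # idents that physically exist
        self.catalog = set()          # in-memory catalog entries
        self.durable_catalog = set()  # catalog as of the last checkpoint

    def create(self, ident):
        self.tables.add(ident)
        self.catalog.add(ident)

    def checkpoint(self):
        self.durable_catalog = set(self.catalog)

    def drop_untimestamped(self, ident):
        self.catalog.discard(ident)  # catalog write, not yet checkpointed
        self.tables.discard(ident)   # table drop happens right away, no rollback

    def crash_and_recover(self):
        # Unclean shutdown: the catalog reverts to the last checkpoint,
        # but dropped tables stay dropped.
        self.catalog = set(self.durable_catalog)
        # Idents the catalog still names but the engine no longer has.
        return sorted(self.catalog - self.tables)


engine = ToyEngine()
engine.create("index-1")
engine.checkpoint()
engine.drop_untimestamped("index-1")
missing = engine.crash_and_recover()
print(missing)  # ['index-1']
```

The non-empty result is the inconsistent state: recovery would look up `index-1` and not find it.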



 Comments   
Comment by Jordi Olivares Provencio [ 28/Sep/23 ]

The problem will be fixed in a more generic way by SERVER-80974

Comment by Jordi Olivares Provencio [ 22/Sep/23 ]

suganthi.mani@mongodb.com I was also thinking the same as I was working on SERVER-80974. The general solution proposed there is to defer the drop until a checkpoint has occurred for the data.

Comment by Suganthi Mani [ 22/Sep/23 ]

jordi.olivares-provencio@mongodb.com gregory.wlodarek@mongodb.com I filed SERVER-81355 because we are encountering a similar issue with shard merge. My understanding is that there are many places in our codebase where we perform untimestamped drops of internal collections (See SERVER-75740), and all of these places could have the same issue. Looking at PR 15460, it seems that the general idea is that the places performing untimestamped drops should explicitly choose to flush the mdb_catalog writes after committing. However, this solution appears to be risky for a couple of reasons:

1) In the future, engineers might easily overlook the need to enable the flush option.
2) It would be challenging to identify all the places in the codebase where untimestamped drops occur and then update those places to choose the flush option.

Therefore, I believe we should consider a more general solution, such as having the KV ident dropper perform an unstable checkpoint before dropping any ident with a drop timestamp of 0 (i.e., an untimestamped drop).
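
A rough sketch of that idea, as a toy model rather than actual server code. `ToyIdentReaper` and its methods are hypothetical names, and the rule used for timestamped drops (reap once the drop timestamp is older than `oldest_timestamp`) is a simplifying assumption for illustration only.

```python
class ToyIdentReaper:
    """Toy drop-pending ident reaper; hypothetical names, not server code."""

    def __init__(self, idents):
        self.tables = set(idents)           # tables that exist in the engine
        self.catalog = set(idents)          # in-memory catalog
        self.durable_catalog = set(idents)  # catalog as of the last checkpoint
        self.pending = []                   # (ident, drop_timestamp) pairs
        self.checkpoints_taken = 0

    def mark_drop_pending(self, ident, drop_timestamp):
        self.catalog.discard(ident)
        self.pending.append((ident, drop_timestamp))

    def checkpoint(self):
        self.durable_catalog = set(self.catalog)
        self.checkpoints_taken += 1

    def reap(self, oldest_timestamp):
        # Assumed policy: one checkpoint covers every untimestamped drop
        # (drop timestamp 0) handled in this pass.
        if any(ts == 0 for _, ts in self.pending):
            self.checkpoint()
        still_pending = []
        for ident, ts in self.pending:
            if ts == 0 or ts < oldest_timestamp:
                self.tables.discard(ident)
            else:
                still_pending.append((ident, ts))
        self.pending = still_pending


reaper = ToyIdentReaper(["index-1", "index-2"])
reaper.mark_drop_pending("index-1", 0)   # untimestamped drop
reaper.mark_drop_pending("index-2", 50)  # timestamped drop, not yet reapable
reaper.reap(oldest_timestamp=10)
dangling = reaper.durable_catalog - reaper.tables
print(reaper.checkpoints_taken, sorted(reaper.tables), dangling)
```

Because the checkpoint lives in the reaper, callers doing untimestamped drops would not need to opt in to anything, which is the point of concerns 1) and 2).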

Comment by Jordi Olivares Provencio [ 05/Sep/23 ]

One solution to this could be to reorder operations such that we first write the catalog changes, flush those changes to disk, and only then perform the drop pending steps. In this manner the state of an ident being present in the catalog but not in WT becomes impossible.
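
To check the ordering argument, here is a small sketch that simulates an unclean shutdown after each of the three steps. Everything in it (`reordered_drop`, `dangling_after_crash`, the ident names) is hypothetical and only models the catalog and the tables as sets.

```python
def reordered_drop(ident, tables, catalog, durable_catalog):
    """Generator: each yield is a point where an unclean shutdown could occur."""
    catalog.discard(ident)           # 1. write the catalog change
    yield "catalog written"
    durable_catalog.clear()
    durable_catalog.update(catalog)  # 2. flush the catalog change to disk
    yield "catalog flushed"
    tables.discard(ident)            # 3. only now drop the table itself
    yield "table dropped"


def dangling_after_crash(crash_after):
    """Return idents that are in the durable catalog but have no table."""
    tables = {"coll-1", "index-1"}
    catalog = set(tables)
    durable_catalog = set(tables)
    steps = reordered_drop("index-1", tables, catalog, durable_catalog)
    for step, _ in enumerate(steps, start=1):
        if step == crash_after:
            break  # simulated unclean shutdown
    # Recovery sees the durable catalog and whatever tables survived.
    return durable_catalog - tables


results = [dangling_after_crash(n) for n in (1, 2, 3)]
print(results)  # [set(), set(), set()]
```

In this model no crash point leaves a catalog entry without its table. A crash between steps 2 and 3 leaves a table without a catalog entry instead, which is an orphaned table rather than a missing one.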
