[SERVER-38606] Stop allowing NamespaceNotFound errors during startup replication recovery. The oplog replay logic will abort on NamespaceNotFound errors while applying CRUD operations. Created: 13/Dec/18 Updated: 06/Dec/22 |
|
| Status: | Backlog |
| Project: | Core Server |
| Component/s: | Storage |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Task | Priority: | Major - P3 |
| Reporter: | Benety Goh | Assignee: | Backlog - Replication Team |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||||||
| Assigned Teams: |
Replication
|
||||||||||||
| Backwards Compatibility: | Fully Compatible | ||||||||||||
| Sprint: | Storage NYC 2019-01-28, Execution Team 2019-09-09, Execution Team 2019-09-23 | ||||||||||||
| Participants: | |||||||||||||
| Description |
|
With the old 4.0-style two-phase drop, if the server crashes after the actual WT table drop but before a newer checkpoint has been generated (see below), then after restart the collection is still listed in mdb_catalog but is not backed by any WT table. For that reason, we chose to allow NamespaceNotFound errors in replication recovery. Once the new 4.2-style two-phase drop is in place, this error should never happen during replication recovery, because the actual WT table drop will always happen after a stable checkpoint that includes the mdb_catalog changes. |
| Comments |
| Comment by Xiangyu Yao (Inactive) [ 29/Jan/19 ] |
|
Taking this out of the project and moving it to the backlog, because the condition for selectively relaxing this constraint will be different once the two-phase drop enhancement is done. We should revisit this ticket when we do that enhancement. |
| Comment by Xiangyu Yao (Inactive) [ 28/Jan/19 ] |
|
Yes, you are right. |
| Comment by Benety Goh [ 28/Jan/19 ] |
|
Is this also dependent on storage engine support for pending idents and checkpoints? |
| Comment by Xiangyu Yao (Inactive) [ 25/Jan/19 ] |
|
We should check FCV to selectively relax this constraint, because FCV 4.0 may indicate that dropCollection was done in 4.0-style (rename), so NamespaceNotFound may still be triggered during replication recovery. |
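
The FCV-gated relaxation discussed in the comments can be illustrated with a small sketch. This is not server code: `NamespaceNotFound`, `OplogEntry`, `tolerate_namespace_not_found`, and `replay` are hypothetical names invented for illustration, and the sketch assumes that FCV "4.0" is the only condition under which the error is tolerated. As the comments note, the real condition may also depend on storage engine support for pending idents and checkpoints.

```python
from dataclasses import dataclass


class NamespaceNotFound(Exception):
    """Hypothetical stand-in for the server's NamespaceNotFound error."""


@dataclass(frozen=True)
class OplogEntry:
    op: str  # CRUD op type, e.g. "i", "u", or "d"
    ns: str  # target namespace, e.g. "test.coll"


def tolerate_namespace_not_found(fcv: str) -> bool:
    # Assumption: under FCV 4.0 a collection may have been dropped with the
    # 4.0-style (rename) drop, so a missing namespace is tolerated.
    return fcv == "4.0"


def replay(entries, existing_namespaces, fcv):
    """Replay CRUD entries, skipping or aborting on missing namespaces."""
    applied, skipped = [], []
    for entry in entries:
        try:
            if entry.ns not in existing_namespaces:
                raise NamespaceNotFound(entry.ns)
            applied.append(entry)  # placeholder for actually applying the op
        except NamespaceNotFound:
            if not tolerate_namespace_not_found(fcv):
                raise  # strict mode: abort recovery
            skipped.append(entry)
    return applied, skipped
```

With this sketch, replaying an entry for a missing namespace under FCV "4.0" puts the entry in `skipped`, while the same replay under FCV "4.2" raises `NamespaceNotFound` and aborts.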