Details
Type: Improvement
Resolution: Unresolved
Priority: Major - P3
Component: Replication
Description
Today, a secondary shuts down if it encounters an op in the replication stream that it cannot apply. For example, an op generated by a PRIMARY running a newer version of MongoDB may describe a metadata change with a property that the secondary's version does not support.
Rather than shutting down, the secondary could transition to RECOVERING and stay there until the offending op has been reversed by a subsequent delete or removal op.
For example, if a create-index op specifies a "v" version field that the secondary does not support, the secondary would skip that op and transition to RECOVERING. It would continue to process subsequent ops. Eventually, it may encounter a drop-index op for the same index; after processing that op, the node would transition back into SECONDARY.
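The create-index scenario above can be sketched as a small state machine. This is only an illustration of the proposed behavior, not server code: the `Op` and `Node` classes, the op kind strings, and the `SUPPORTED_INDEX_VERSIONS` set are all hypothetical names invented for this sketch.

```python
from dataclasses import dataclass, field

# Hypothetical: index "v" values this secondary's version understands.
SUPPORTED_INDEX_VERSIONS = {1, 2}


@dataclass(frozen=True)
class Op:
    """Simplified stand-in for a replicated op (not the real oplog format)."""
    kind: str               # "createIndex", "dropIndex", or any other op kind
    ns: str                 # namespace, e.g. "db.coll"
    index_name: str = ""
    index_version: int = 0


@dataclass
class Node:
    state: str = "SECONDARY"
    # (ns, index_name) pairs for create-index ops that were skipped.
    skipped: set = field(default_factory=set)

    def apply(self, op: Op) -> str:
        key = (op.ns, op.index_name)
        if op.kind == "createIndex" and op.index_version not in SUPPORTED_INDEX_VERSIONS:
            # Cannot apply: skip the op and remember it instead of shutting down.
            self.skipped.add(key)
        elif op.kind == "dropIndex" and key in self.skipped:
            # The skipped index has been dropped upstream, so the bad op is undone.
            self.skipped.discard(key)
        # All other ops would be applied normally here (omitted in this sketch).
        self.state = "RECOVERING" if self.skipped else "SECONDARY"
        return self.state
```

Feeding the node an unsupported create-index op, an unrelated op, and then the matching drop-index op moves it from SECONDARY to RECOVERING and back again.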
This work requires that the list of "bad operations" that need to be undone be stored durably, so that it survives a restart. Once the list becomes empty, it is safe to transition out of RECOVERING.
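One way to picture the durability requirement is a list that is rewritten atomically on every change and reloaded on startup. The sketch below assumes a JSON file as the backing store purely for illustration; the `BadOpStore` class and its method names are hypothetical, and a real implementation would presumably persist the list through the server's own storage layer.

```python
import json
import os
import tempfile


class BadOpStore:
    """Hypothetical durable set of skipped ops, identified by string keys."""

    def __init__(self, path: str):
        self.path = path
        self.keys = set()
        # Reload any entries that were persisted before a restart.
        if os.path.exists(path):
            with open(path) as f:
                self.keys = set(json.load(f))

    def _persist(self) -> None:
        # Write to a temp file, fsync it, then atomically replace the old file,
        # so a crash leaves either the old list or the new one, never a mix.
        directory = os.path.dirname(os.path.abspath(self.path))
        fd, tmp_path = tempfile.mkstemp(dir=directory)
        with os.fdopen(fd, "w") as f:
            json.dump(sorted(self.keys), f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp_path, self.path)

    def add(self, key: str) -> None:
        self.keys.add(key)
        self._persist()

    def remove(self, key: str) -> None:
        self.keys.discard(key)
        self._persist()

    def safe_to_leave_recovering(self) -> bool:
        # An empty list means every skipped op has been undone.
        return not self.keys
```

Because the list is reloaded in `__init__`, a node that restarts while bad ops are still outstanding would see a non-empty list and remain in RECOVERING.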