MongoR needs to be more transparent about its progress and liveness during replay.
Logging is useful for historical diagnostic data, but is not ideal for verifying the ongoing state of a replay.
MongoR should maintain a status file and touch it periodically, so that an external watchdog can identify a MongoR instance which is alive but "stuck" in some manner.
The file may contain a snapshot of basic statistics about the replay:
- Commands issued
- Current live session count
- Error count (to be more rigorously defined; it should not simply count command errors, as this would include expected errors, i.e., those which match the original replay)
- "Total time delayed": an indication of whether a replay is falling significantly behind the target schedule
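
The periodically touched file described above could be sketched roughly as follows. This is only an illustration in Python, not MongoR's implementation: the JSON format, the field names in `ReplayStats`, and the helpers `write_status` and `is_stale` are all hypothetical, chosen to mirror the list above. The writer replaces the file atomically so a watchdog never reads a half-written snapshot, and the watchdog only needs the file's modification time to decide whether the instance looks stuck.

```python
import json
import os
import tempfile
import time
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class ReplayStats:
    # Hypothetical field names mirroring the statistics listed in this ticket.
    commands_issued: int = 0
    live_sessions: int = 0
    error_count: int = 0
    total_delay_seconds: float = 0.0


def write_status(path: str, stats: ReplayStats) -> None:
    """Write the snapshot to a temp file, then atomically swap it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".status-")
    try:
        with os.fdopen(fd, "w") as tmp:
            json.dump(asdict(stats), tmp)
        # The swapped-in file carries a fresh modification time.
        os.replace(tmp_path, path)
    except BaseException:
        os.unlink(tmp_path)
        raise


def is_stale(path: str, max_age_seconds: float, now: Optional[float] = None) -> bool:
    """Watchdog check: True if the file is missing or was not updated recently."""
    try:
        mtime = os.stat(path).st_mtime
    except FileNotFoundError:
        return True
    current = time.time() if now is None else now
    return current - mtime > max_age_seconds
```

Under these assumptions, the replay loop would call `write_status` on a fixed interval, and a watchdog would call `is_stale` with a threshold of a few intervals. Whether the error count excludes expected errors is left to the caller, since that definition is still open.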
CC: david.kupiec@mongodb.com - feel free to add any requirements that will simplify the consumption of this feature.