Because of Prepare Support for Transactions and Larger Transactions Than 16MB, we cannot truncate a transaction's oplog entries until its commit oplog entry is stable.
The current solution is to pass the oldest active transaction's timestamp to setStableTimestamp(). As a result, every time we set the stable timestamp we have to read the oldest active transaction as of that stable timestamp, which gives the oldest required timestamp. To save this read, we could let the storage layer call back into the replication system when it is about to start a checkpoint. Replication would read the oldest required timestamp, or calculate it in some other way, and return it to storage. After the checkpoint, storage would use the oldest required timestamp to tell the oplog truncation thread how far it can truncate.
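The proposed callback flow can be sketched as follows. This is a minimal illustration, not the actual storage or replication interface: `StorageSketch`, `ReplicationSketch`, `setCheckpointCallback()`, the integer `Timestamp` alias, and the rule "fall back to the stable timestamp when no transaction is active" are all assumptions made up for the example. The point it shows is that setting the stable timestamp no longer triggers a read; replication is consulted once, when a checkpoint starts.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <set>
#include <utility>

// Hypothetical stand-in for a real timestamp type.
using Timestamp = std::uint64_t;

// Hypothetical replication side: tracks start timestamps of active transactions.
class ReplicationSketch {
public:
    void beginTransaction(Timestamp startTs) { _activeStarts.insert(startTs); }

    void commitTransaction(Timestamp startTs) {
        auto it = _activeStarts.find(startTs);
        if (it != _activeStarts.end()) _activeStarts.erase(it);
    }

    // Assumption: with no transaction active at or before `stable`,
    // everything up to `stable` may be truncated.
    Timestamp oldestRequiredTimestamp(Timestamp stable) const {
        if (_activeStarts.empty() || *_activeStarts.begin() > stable) return stable;
        return *_activeStarts.begin();
    }

private:
    std::multiset<Timestamp> _activeStarts;
};

// Hypothetical storage side: asks replication only when a checkpoint starts.
class StorageSketch {
public:
    using OldestRequiredFn = std::function<Timestamp(Timestamp stable)>;

    void setCheckpointCallback(OldestRequiredFn fn) { _onCheckpoint = std::move(fn); }

    // No read of the oldest active transaction happens here.
    void setStableTimestamp(Timestamp ts) { _stable = ts; }

    void checkpoint() {
        assert(_onCheckpoint && "replication must register a callback first");
        const Timestamp oldestRequired = _onCheckpoint(_stable);
        // ... the checkpoint itself would be taken here ...
        // After the checkpoint, publish the bound for the oplog truncation thread.
        _truncateUpTo = oldestRequired;
    }

    Timestamp truncateUpTo() const { return _truncateUpTo; }

private:
    OldestRequiredFn _onCheckpoint;
    Timestamp _stable{0};
    Timestamp _truncateUpTo{0};
};

Timestamp runCallbackExample() {
    ReplicationSketch repl;
    StorageSketch storage;
    storage.setCheckpointCallback(
        [&repl](Timestamp stable) { return repl.oldestRequiredTimestamp(stable); });

    repl.beginTransaction(5);
    repl.beginTransaction(8);
    storage.setStableTimestamp(10);  // cheap: no callback, no read
    storage.setStableTimestamp(12);  // cheap: no callback, no read
    storage.checkpoint();            // the only point where replication is asked
    return storage.truncateUpTo();   // bounded by the transaction started at 5
}
```

In this sketch the cost of finding the oldest required timestamp moves from every stable timestamp update to once per checkpoint.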
Replication can also pass the oldest required timestamp to storage asynchronously if that makes the storage work easier.
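One way to picture the asynchronous variant is for replication to publish the value whenever it likes, with storage picking up whatever was last published when a checkpoint begins. The sketch below is only an illustration under that assumption; `AsyncStorageSketch` and `publishOldestRequiredTimestamp()` are hypothetical names, and it further assumes that using a slightly stale value is acceptable.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical stand-in for a real timestamp type.
using Timestamp = std::uint64_t;

class AsyncStorageSketch {
public:
    // Safe to call from a replication thread at any time; storage never blocks on it.
    void publishOldestRequiredTimestamp(Timestamp ts) {
        _oldestRequired.store(ts, std::memory_order_release);
    }

    void checkpoint() {
        // Snapshot the most recently published value before the checkpoint starts.
        const Timestamp snapshot = _oldestRequired.load(std::memory_order_acquire);
        // ... the checkpoint itself would be taken here ...
        // After the checkpoint, expose the bound to the oplog truncation thread.
        _truncateUpTo = snapshot;
    }

    Timestamp truncateUpTo() const { return _truncateUpTo; }

private:
    std::atomic<Timestamp> _oldestRequired{0};
    Timestamp _truncateUpTo{0};
};

Timestamp runAsyncExample() {
    AsyncStorageSketch storage;
    storage.publishOldestRequiredTimestamp(5);
    storage.checkpoint();                       // snapshots 5
    storage.publishOldestRequiredTimestamp(9);  // picked up by the next checkpoint
    return storage.truncateUpTo();
}
```

Compared with the callback, this removes any call from storage into replication on the checkpoint path, at the price of the value possibly lagging behind.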