The atClusterTime selection algorithm for snapshot transactions, developed as part of the Global Point in Time reads project for 4.0, works as follows:
- Perform routing using the latest available routing table on MongoS
- Using the per-shard majority committed timestamps, select the smallest timestamp across the targeted shards (this ensures that none of the targeted shards will need to perform a no-op write)
- Perform routing at the timestamp selected in the previous step. If this results in the same set of shards, use the selected timestamp; otherwise, use the latest available timestamp on the logical clock
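
The three steps above can be sketched roughly as follows. This is only an illustration, not the MongoS implementation: `select_at_cluster_time`, the `route` callback, the `shard_timestamps` map and the integer timestamps are all hypothetical names and simplifications introduced here.

```python
from typing import Callable, Dict, Optional, Set

# Assumption: timestamps are modelled as plain integers for illustration.
Timestamp = int

def select_at_cluster_time(
    route: Callable[[Optional[Timestamp]], Set[str]],
    shard_timestamps: Dict[str, Timestamp],
    logical_clock_now: Timestamp,
) -> Timestamp:
    """Hypothetical sketch of the atClusterTime selection steps.

    route(None) targets shards using the latest routing table;
    route(ts) targets shards using the routing table as of ts.
    """
    # Step 1: route using the latest available routing table.
    targeted = route(None)
    if not targeted:
        # Nothing targeted, so there is no per-shard timestamp to pick from.
        return logical_clock_now

    # Step 2: take the smallest per-shard timestamp across the targeted shards.
    candidate = min(shard_timestamps[shard] for shard in targeted)

    # Step 3: route again at the candidate timestamp. Keep the candidate only
    # if the same set of shards is targeted; otherwise fall back to the
    # latest timestamp on the logical clock.
    if route(candidate) == targeted:
        return candidate
    return logical_clock_now
```

For example, if routing targets two shards whose timestamps are 10 and 7, and routing at 7 yields the same two shards, the sketch selects 7; if routing at 7 yields a different shard set, it falls back to the logical clock value.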
This algorithm was disabled through SERVER-34326 (and deleted by SERVER-34475) due to the small window of history that the storage engine supports. Now that SERVER-31767 has increased the window of timestamps available for atClusterTime reads, the above selection algorithm can be re-enabled.
In addition, using the majority committed timestamp goes against the performance goals of the speculative snapshot optimization and can also lead to problems on shards with enableMajorityReadConcern=false (such shards may never provide a snapshot at the selected atClusterTime). Because of this, the algorithm should be changed to use the last applied opTime timestamps of the targeted shards.
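
As a rough illustration of the proposed change, the sketch below contrasts the two ways of picking the candidate timestamp. `ShardTimes` and both helper functions are hypothetical, and the integer timestamps are a simplification; how the router actually learns each shard's last applied opTime is not specified here.

```python
from dataclasses import dataclass
from typing import Dict, Iterable

@dataclass(frozen=True)
class ShardTimes:
    # Hypothetical per-shard view held by the router (illustration only).
    majority_committed: int
    last_applied: int

def candidate_from_majority_committed(
    targeted: Iterable[str], times: Dict[str, ShardTimes]
) -> int:
    # Original behaviour: smallest majority committed timestamp.
    return min(times[shard].majority_committed for shard in targeted)

def candidate_from_last_applied(
    targeted: Iterable[str], times: Dict[str, ShardTimes]
) -> int:
    # Proposed behaviour: smallest last applied opTime timestamp.
    return min(times[shard].last_applied for shard in targeted)
```

With shards at (majority committed 5, last applied 9) and (majority committed 6, last applied 8), the first helper picks 5 and the second picks 8, so the proposed variant does not depend on the majority commit point advancing.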