- Type: Improvement
- Resolution: Unresolved
- Priority: Major - P3
- Affects Version/s: None
- Component/s: None
- Query Execution
The "equal step" sampling algorithm introduced by SERVER-114237 does not perform well for very skewed Timestamp distributions.
The existing test TruncatesAreOnlyAfterAllDurableReplicatedTruncates in src/mongo/db/change_stream_pre_images_remover_test.cpp fails when the "equal step" sampling algorithm is enabled.
The reason is that the test creates several documents with Timestamp values close to Timestamp(1, 0), and then bumps the majority-committed Timestamp to Timestamp(4294969, 2) and creates another document.
It then expects the pre-images collection to be cleared after a few invocations of the pre-images removal job.
The test passes when the "equal step" sampling algorithm is not enabled. In that case the entire pre-images collection is scanned, so the information about the Timestamp distribution inside the collection is 100% accurate.
With the "equal step" algorithm, the sampling finds only the two documents with the lowest and highest Timestamps, which are Timestamp(1, 2001) and Timestamp(4294969, 3). The documents in between, which all sit at the very low end of the range, are not found.
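The effect can be reproduced with a small model. The sketch below is not the SERVER-114237 implementation. It assumes a simplified, hypothetical form of equal-step sampling that probes equally spaced points between the lowest and highest Timestamp and takes the first document at or after each probe. The helper names (`ts`, `equal_step_sample`), the sample count of 100, and the ten clustered documents are illustrative assumptions; only the two endpoint Timestamps come from the test described above.

```python
from bisect import bisect_left


def ts(secs, inc):
    """Pack a (seconds, increment) Timestamp into a single 64-bit integer."""
    return (secs << 32) | inc


def equal_step_sample(sorted_ts, num_samples):
    """Hypothetical model of equal-step sampling (assumption, not server code).

    Probes num_samples (>= 2) equally spaced points in [min, max] and returns
    the distinct documents found at or after each probe point.
    """
    lo, hi = sorted_ts[0], sorted_ts[-1]
    found = set()
    for i in range(num_samples):
        probe = lo + (hi - lo) * i // (num_samples - 1)
        idx = bisect_left(sorted_ts, probe)  # first document >= probe
        found.add(sorted_ts[idx])
    return sorted(found)


# Illustrative data: a tight cluster starting at Timestamp(1, 2001),
# plus one document far away at Timestamp(4294969, 3).
docs = [ts(1, 2001 + i) for i in range(10)] + [ts(4294969, 3)]
sampled = equal_step_sample(docs, num_samples=100)

print(len(docs), len(sampled))  # prints "11 2"
```

Under these assumptions every probe after the first lands far above the cluster, so 99 of the 100 probes resolve to the single high document and the nine other clustered documents are never seen.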
- is caused by: SERVER-114237 Implement Timestamp based pre-image sampling for truncation (Closed)