Details
- Bug
- Resolution: Fixed
- Major - P3
- None
- None
- None
- Fully Compatible
- ALL
- Execution EMEA Team 2023-10-30
- 5
Description
By default, internal readers on a secondary read at 'lastApplied'. As a result, a tenant's truncate marker initialization can miss a pre-image insert if initialization completes in the middle of oplog batch application on the secondary.
The consequence: if an inserted pre-image has a higher RecordId than the highest RecordId tracked by the truncate markers, the pre-image won't be removed until the next insert.
For example, suppose:
- The 'lastApplied' timestamp is at TS(5), captured in snapshot0.
- Truncate marker generation begins, sees 4 documents with nsUUID0 at snapshot0.
- Secondary oplog batch application inserts a 5th pre-image to nsUUID0 at TS(10), but oplog application is still between batches and lastApplied hasn't been advanced.
- The insert updates the 'tenantMapEntry', which serves as a placeholder until truncate marker generation is complete, with a new set of truncate markers for nsUUID0 whose _lastHighestRecordId encodes TS(10).
- The 'tenantMapEntry' placeholder truncate markers for nsUUID0 are overwritten because generatedTruncateMarkers also had a set of truncate markers for nsUUID0.
- Even after abandoning snapshot0, the new snapshot still reads at lastApplied TS(5) because the oplog batch application hasn't completed yet.
- Thus, the truncate markers never track the 5th document inserted into the pre-images collection.
- If the truncate markers for nsUUID0 don't include the 5th pre-image, the 5th pre-image won't be removed until another pre-image for nsUUID0 is inserted and updates _lastHighestRecordId.
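
The sequence above can be replayed with a small, single-threaded Python model. This is only an illustrative sketch, not the server's implementation: the `TruncateMarkers` class, the `read_at` helper, and the use of a plain dict for 'tenantMapEntry' are hypothetical stand-ins, and the RecordId is assumed to be the insert timestamp itself.

```python
from dataclasses import dataclass


@dataclass
class TruncateMarkers:
    # Hypothetical stand-in for the real markers; only the field that matters here.
    last_highest_record_id: int


def simulate_initialization_race():
    # Pre-images as (ns_uuid, ts); the RecordId is modeled as the timestamp.
    pre_images = [("nsUUID0", ts) for ts in (1, 2, 3, 4)]
    last_applied = 5  # lastApplied is TS(5) and does not advance mid-batch

    def read_at(read_ts):
        # A snapshot read: only documents at or before read_ts are visible.
        return [ts for ns, ts in pre_images if ns == "nsUUID0" and ts <= read_ts]

    tenant_map_entry = {}

    # Truncate marker generation scans snapshot0 and sees 4 documents.
    snapshot0 = read_at(last_applied)
    generated = {"nsUUID0": TruncateMarkers(max(snapshot0))}

    # Oplog application inserts the 5th pre-image at TS(10). The insert
    # installs placeholder markers that encode TS(10).
    pre_images.append(("nsUUID0", 10))
    tenant_map_entry["nsUUID0"] = TruncateMarkers(10)

    # Generation finishes and overwrites the placeholder for nsUUID0.
    tenant_map_entry.update(generated)

    # A fresh snapshot still reads at lastApplied TS(5), so a catch-up scan
    # cannot see the TS(10) insert either.
    markers = tenant_map_entry["nsUUID0"]
    for ts in read_at(last_applied):
        if ts > markers.last_highest_record_id:
            markers.last_highest_record_id = ts

    tracked = markers.last_highest_record_id
    untracked = [ts for _, ts in pre_images if ts > tracked]
    return len(snapshot0), tracked, untracked


if __name__ == "__main__":
    print(simulate_initialization_race())
```

In this model the markers end up tracking RecordId 4 while the pre-image at TS(10) stays untracked, which matches the described outcome: nothing removes it until a later insert for nsUUID0 raises the tracked RecordId.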