Type: Task
Resolution: Fixed
Priority: Major - P3
Fix Version/s: 3.7.4
Affects Version/s: None
Component/s: Storage
Labels:
- rollback-functional

Backwards Compatibility:
Fully Compatible
Sprint:
Repl 2018-04-09
Linked BF Score:
56
Confidence Status:
None
Work Order:
3
CAR Domain/s:
None

Aha! Reference:
None
Tracking Level:
None
Risk Status:
None
Exec Notes:
None
Goal Name(s):
None
Goal Link:
None

Counts in the WTSizeStorer table are not adjusted in the same transaction that performs an insert/update, because that would become a serialization point for concurrent inserts/deletes and would result in expensive write conflict exceptions (WCE).

Instead, an atomic counter is maintained and periodically flushed to the SizeStorer. Among other things, this data is flushed on clean shutdown. These writes are not timestamped (doing so would be difficult), so what's put on disk is the counts for "now", which are unlikely to be the counts as of the stable timestamp. At startup, when replication plays the oplog forward during recovery, an insert that was already accounted for in the SizeStorer's view of the data will be counted again.
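The double count described above can be reproduced with a small model. This is only an illustrative sketch, not the server's code: `SizeStorer`, `RecordStore`, `change_num_records`, `flush`, and `simulate_double_count` are hypothetical stand-ins, and the oplog is reduced to a list of inserts of which only the first is assumed to be in the stable checkpoint.

```python
import threading


class SizeStorer:
    """Hypothetical stand-in for the on-disk size storer table."""

    def __init__(self):
        self.persisted = {}  # ident -> record count as of the last flush


class RecordStore:
    """Hypothetical record store that keeps an in-memory counter."""

    def __init__(self, ident, size_storer):
        self.ident = ident
        self.size_storer = size_storer
        self._lock = threading.Lock()
        # On startup the count is loaded from whatever was last flushed.
        self.num_records = size_storer.persisted.get(ident, 0)

    def change_num_records(self, diff):
        # Counter is adjusted outside of any storage transaction.
        with self._lock:
            self.num_records += diff

    def flush(self):
        # Untimestamped write: persists the count for "now".
        self.size_storer.persisted[self.ident] = self.num_records


def simulate_double_count():
    storer = SizeStorer()
    store = RecordStore("coll-1", storer)

    oplog = ["insert", "insert", "insert"]
    stable_index = 1  # assumption: only the first insert is in the stable checkpoint

    for _ in oplog:
        store.change_num_records(1)
    store.flush()  # clean shutdown persists a count of 3

    # Restart: data reverts to the stable checkpoint, the count does not.
    restarted = RecordStore("coll-1", storer)
    for _ in oplog[stable_index:]:
        restarted.change_num_records(1)  # replayed inserts are counted again

    return restarted.num_records, len(oplog)


reported, actual = simulate_double_count()
print(f"reported count: {reported}, actual documents: {actual}")
```

In this model the reported count exceeds the true document count by exactly the number of replayed inserts.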

The proposed fix: trust the WTSizeStorer to have the proper counts for collections after recovery is played. Specifically:

1. Introduce state representing that the server (or operation context) is in "recovery mode".
   - Have `_changeNumRecords` ignore updates when in recovery mode.
   - It may also be a good idea to do the same for `_increaseDataSize`.
2. Step 1, however, breaks a special case: when the creation of a collection wasn't included in a stable timestamp, the collection gets recreated during recovery with an `ident` that's different from the one used at shutdown.
   - Introduce more state: the set of collections created during recovery.
   - Allow updates to `_changeNumRecords`/`_increaseDataSize` if the collection being updated is in this set.
3. Another special case comes up when the collection exists in the stable checkpoint, but none of its writes made it into the stable checkpoint. When a collection is empty, as determined by a cursor "findOne", the record store setup assumes its count should be zero. This is correct in a non-RTT world, but would violate the expectation that the WTSizeStorer is the authority on counts. This code would also need to be adjusted.
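The first two steps of the proposal can be sketched as follows. This is a minimal model under stated assumptions, not the actual patch: `RecoveryState`, `CountedRecordStore`, `simulate_recovery`, and the Python method names are hypothetical; only the ideas of a recovery-mode flag and a set of collections created during recovery come from the description above.

```python
class RecoveryState:
    """Hypothetical holder for the recovery-mode flag and the set of
    idents created while recovery is replaying the oplog."""

    def __init__(self):
        self.in_recovery = False
        self.created_during_recovery = set()


class CountedRecordStore:
    """Hypothetical record store whose counters honor the recovery state."""

    def __init__(self, ident, persisted_count, persisted_size, state):
        self.ident = ident
        self.num_records = persisted_count
        self.data_size = persisted_size
        self.state = state

    def _should_skip(self):
        # Skip adjustments during recovery unless this collection was
        # (re)created during recovery and so has no trusted persisted count.
        return (self.state.in_recovery
                and self.ident not in self.state.created_during_recovery)

    def change_num_records(self, diff):
        if self._should_skip():
            return
        self.num_records += diff

    def increase_data_size(self, amount):
        if self._should_skip():
            return
        self.data_size += amount


def simulate_recovery():
    state = RecoveryState()
    state.in_recovery = True

    # Existing collection: the persisted count of 3 already includes the
    # two inserts that recovery is about to replay.
    existing = CountedRecordStore("coll-1", 3, 300, state)

    # Collection recreated during recovery under a new ident, starting at 0.
    state.created_during_recovery.add("coll-2-new")
    recreated = CountedRecordStore("coll-2-new", 0, 0, state)

    for store in (existing, recreated):
        for _ in range(2):  # replay two inserts of 100 bytes each
            store.change_num_records(1)
            store.increase_data_size(100)

    state.in_recovery = False
    return (existing.num_records, existing.data_size,
            recreated.num_records, recreated.data_size)


print(simulate_recovery())
```

The existing collection keeps its persisted count, while the recreated collection is counted from scratch. The third special case (an empty-looking collection whose count is reset to zero at record store setup) is not modeled here.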

These changes would keep WT counts accurate on clean shutdown, but not on rollback. ~~SERVER-33493~~ is tracking changes for that purpose.

related to

- SERVER-34976 clear the "needing size adjustment" set at the beginning of replication rollback (Closed)
- SERVER-33493 Have WT RTT rollback keep correct counts (Closed)
- SERVER-33525 Fix replication and sharding tests to work with RTT (Closed)

Assignee: Kyle Suarez (Inactive)
Reporter: Daniel Gottlieb (Inactive)
Participants: Daniel Gottlieb, Githook User, Judah Schvimer, Kyle Suarez
Votes: 0
Watchers: 6

Created: Feb 26 2018 04:59:31 PM UTC
Updated: Oct 29 2023 10:34:26 PM UTC
Resolved: Apr 07 2018 03:00:24 PM UTC
Confidence Status Last Update: 26/Mar/18 7:41 PM
