-
Type:
Task
-
Resolution: Unresolved
-
Priority:
Major - P3
-
None
-
Affects Version/s: None
-
Component/s: Checkpoints, Metadata
-
None
-
Storage Engines
-
None
-
None
When a follower picks up a checkpoint, it currently iterates over the entire shared metadata table and copies/updates the entries in the local metadata table accordingly. This works well for a small to moderate number of tables, but it would not scale to a large number of tables, especially given short checkpoint pick up frequency.
We should investigate the problem and implement an appropriate solution. Here are at least two ways in which we could solve it:
- Implement the metadata cursor as a union cursor over the shared and the local metadata table, similarly to layered cursors. This would allow us to skip copying updated entries from the shared metadata table, but it would be a complex solution, especially as operations such as creating tables by the follower and dropping tables by either a leader or follower would require careful thought. This would probably also require us to create ingest tables on the follower on demand (instead of while picking up a checkpoint).
- Augment the shared metadata table with a log of changes, keeping maybe several minutes (or even hours!) of changes. When picking up a checkpoint, the follower can process the log of updates if the log reaches far enough back. If not, the follower can simply fall back to the current mechanism of iterating over the whole shared metadata table. The latter should be perfectly acceptable, because a node that would have to do that would be probably very far back anyways. This would be probably an easier solution.