-
Type: Bug
-
Resolution: Unresolved
-
Priority: Major - P3
-
None
-
Affects Version/s: None
-
Component/s: Cursors, Layered Tables
-
None
-
Storage Engines, Storage Engines - Foundations
-
SE Foundations - Q3+ Backlog
-
None
Currently, state changes such as the transition from follower state to leader state (AKA step up) don't permit cursors to stay open; MongoDB enforces this requirement. It's likely that this will be relaxed in the future, perhaps letting read cursors stay open throughout the transition.
Layered cursors are already prepared to handle this: at every cursor operation, we check the current state of the world, and if the state has changed, we first change the underlying cursors to work in the new state and then perform the operation as appropriate. For step up, we close the ingest cursor and reopen the stable cursor. The latter is needed because we previously had a read-only cursor at a checkpoint, and we'll want a read/write cursor not tied to a checkpoint.
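To make the per-operation check concrete, here is a minimal standalone sketch of it. It is a toy model, not WiredTiger code: every type and function name in it (sketch_layered_cursor, sketch_cursor_sync_role, and so on) is hypothetical, and the underlying cursors are reduced to booleans. Only the step-up direction described above is modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; none of these are WiredTiger names. */
typedef enum { SKETCH_FOLLOWER, SKETCH_LEADER } sketch_role;

typedef struct {
    sketch_role seen_role;     /* role observed at the previous operation */
    bool ingest_open;          /* ingest cursor currently open */
    bool stable_at_checkpoint; /* true: read-only stable cursor pinned to a checkpoint */
    int stable_reopens;        /* number of times the stable cursor was reopened */
} sketch_layered_cursor;

/* Start in follower state: ingest open, stable cursor read-only at a checkpoint. */
static void
sketch_cursor_init(sketch_layered_cursor *c)
{
    assert(c != NULL);
    c->seen_role = SKETCH_FOLLOWER;
    c->ingest_open = true;
    c->stable_at_checkpoint = true;
    c->stable_reopens = 0;
}

/* Intended to run at the top of every cursor operation, before the operation itself. */
static void
sketch_cursor_sync_role(sketch_layered_cursor *c, sketch_role current)
{
    assert(c != NULL);
    if (c->seen_role == current)
        return; /* nothing changed since the last operation */

    if (current == SKETCH_LEADER) {
        /* Step up as described: close ingest, reopen stable as read/write. */
        c->ingest_open = false;
        c->stable_at_checkpoint = false;
        c->stable_reopens++;
    }
    /* Step down is not covered by the ticket, so it is left out of this sketch. */
    c->seen_role = current;
}
```

Note that this sketch closes the ingest cursor unconditionally on the first operation after step up.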
However, there's a flaw in this: closing the ingest table is only correct once the items in the ingest table have "drained" into the stable table. The layered cursor should not close (and stop using) the ingest table until the drain is complete. There is a connection flag, WT_CONN_RECONFIGURING_STEP_UP, that is set when the step up happens and cleared when the step up completes; the layered cursor should consult it before closing the ingest cursor.
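One possible shape for that check, as a standalone sketch rather than a patch: the types and helpers below are hypothetical, and DEMO_CONN_RECONFIGURING_STEP_UP is a stand-in for WT_CONN_RECONFIGURING_STEP_UP with an arbitrary bit value. It also assumes that the flag being cleared implies the drain has finished, which should be confirmed against the step-up code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for WT_CONN_RECONFIGURING_STEP_UP; the bit value here is arbitrary. */
#define DEMO_CONN_RECONFIGURING_STEP_UP 0x1u

/* Hypothetical, heavily reduced connection and layered cursor. */
typedef struct {
    uint32_t flags;
    bool is_leader;
} demo_conn;

typedef struct {
    bool ingest_open;
} demo_layered_cursor;

/* True only once we are leader and the step-up flag has been cleared. */
static bool
demo_step_up_finished(const demo_conn *conn)
{
    assert(conn != NULL);
    return (conn->is_leader && (conn->flags & DEMO_CONN_RECONFIGURING_STEP_UP) == 0);
}

/* Per-operation check: keep using the ingest cursor while step up is in progress. */
static void
demo_cursor_check_state(demo_layered_cursor *c, const demo_conn *conn)
{
    assert(c != NULL);
    if (c->ingest_open && demo_step_up_finished(conn))
        c->ingest_open = false; /* assumed safe: ingest has drained into stable */
}
```

Because the check runs at every operation, a cursor that stays open across the transition would keep reading from the ingest table until the first operation after the flag is cleared.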
A potential alternative would be to notice that the individual layered table has completed its ingest draining. That would allow cursors on the ingest table to be closed sooner, which may free up more resources. I don't know if there is a flag on the WT_LAYERED_TABLE that indicates that the individual layered table has completed draining; if not, we could add one.
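A rough sketch of the per-table variant, again as a toy model: TOY_LAYERED_TABLE_INGEST_DRAINED is an invented flag (it is not known to exist on WT_LAYERED_TABLE), and the toy_* types and functions are hypothetical. The sketch assumes the helper is only consulted after step up has begun.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Invented per-table flag; no such flag is known to exist in WiredTiger. */
#define TOY_LAYERED_TABLE_INGEST_DRAINED 0x1u

/* Hypothetical, reduced stand-ins for the connection and a layered table. */
typedef struct {
    bool step_up_in_progress; /* models the connection-wide step-up flag */
} toy_conn;

typedef struct {
    uint32_t flags;
} toy_layered_table;

/* Would be called by whatever drains the ingest table, once this table is done. */
static void
toy_table_mark_drained(toy_layered_table *table)
{
    assert(table != NULL);
    table->flags |= TOY_LAYERED_TABLE_INGEST_DRAINED;
}

/*
 * A cursor may close its ingest cursor as soon as its own table has drained,
 * without waiting for the connection-wide step up to finish.
 */
static bool
toy_can_close_ingest(const toy_conn *conn, const toy_layered_table *table)
{
    assert(conn != NULL && table != NULL);
    if ((table->flags & TOY_LAYERED_TABLE_INGEST_DRAINED) != 0)
        return (true);
    return (!conn->step_up_in_progress);
}
```

With two tables under the same connection, a cursor on the table that drains first could release its ingest cursor while cursors on the other table keep theirs open.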