-
Type: Task
-
Resolution: Duplicate
-
Priority: Major - P3
-
None
-
Affects Version/s: None
-
Component/s: None
-
None
-
8
-
ASeasonTooMany-2023-08-22
I found this while trying to define the mechanisms required for session initialization to be correct. To be clear I doubt this is actively an issue, as in I don't think there's code that will misbehave if it hits this, i.e. cause a correctness issue. But it could be a real bug. Essentially session initialization has a sequence of steps:
- Update the session_cnt if we are using a new slot (not super relevant but kinda related)
- Initialize the session structure
- Write barrier (This is a publish in our code)
- Set the session->active flag
Currently on the reader side we have:
- Read the conn->session_cnt
- Read barrier
- Read the session->active flag
- Read the session structure
#3 and #4 require a read barrier between them, otherwise the read of the session structure could be reordered before the read of the flag on ARM. Effectively we have a publish one side, without any ordering on the other side.
There are 12 reader loops so in theory we need to introduce 12 read barriers here. Which may impact perf on x86 which doesn't need a read barrier here. We could use the READ_BARRIER_WEAK_MEMORDER construct to avoid that.
I tried to convince myself that because we are in initialization it doesn't matter if the reader reads an older value. But its easier to reason about a correctly ordered reader than one that depends on initialization behaviour. In my opinion.
An example loop is in hazard.c:
bool WT_ORDERED_READ(session_cnt, conn->session_cnt); for (s = conn->sessions, i = max = walk_cnt = 0; i < session_cnt; ++s, ++i) { if (!s->active) continue; // NEED READ BARRIER HERE. hazard_get_reference(s, &hp, &hazard_inuse); //trimmed }
I am not sure what the path forward is here as it will mean these loops have a minimum of two read barriers? We need to find a way to make coding this loop obvious.
I wrote an example ARM64 litmus, but it seems reasonably clear cut?
AArch64 session_init_read { sess_cnt=5;sess_flags=0;sess_active=0; 0:X0=sess_flags;0:X1=sess_active; 1:X0=sess_flags;1:X1=sess_active;1:X2=sess_cnt; } P0 | P1 ; MOV X2, #5 | LDR X4, [X2] ; STR X2, [X0] | dmb ishld ; dmb ishst | LDR X5, [X1] ; MOV X3, #1 | LDR X6, [X0] ; STR X3, [X1] | ; exists (1:X4=5 /\ 1:X5=1 /\ 1:X6=0)
A dmb ishld between LDR X5 and LDR X6 resolves the "bad" state.
- is duplicated by
-
WT-11217 Review the session array maintenance and state tracking in WiredTiger
- Closed