[SERVER-37811] Replication rollback invalidates all sessions with retryable writes, not just the rolled-back ones Created: 30/Oct/18 Updated: 06/Dec/22 |
|
| Status: | Backlog |
| Project: | Core Server |
| Component/s: | Replication |
| Affects Version/s: | 4.0.3, 4.1.4 |
| Fix Version/s: | None |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Kaloian Manassiev | Assignee: | Backlog - Replication Team |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | former-quick-wins, gm-ack | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Assigned Teams: | Replication |
| Sprint: | Repl 2019-02-11 | ||||||||||||||||
| Participants: | |||||||||||||||||
| Description |
|
The rollback handling code here seems to have the exact list of sessions that were rolled back, but it still invalidates all the sessions in the catalog. |
| Comments |
| Comment by Pavithra Vetriselvan [ 13/Feb/19 ] |
|
In OpObserverImpl::onReplicationRollback, we end up calling MongoDSessionCatalog::invalidateSessions and pass in boost::none for a single session doc. This causes us to go through and invalidate all sessions. Since we already have the rollbackSessionIds, which tracks sessions where operations were rolled back, we should be able to iterate over this and call invalidateSessions with each session ID. Alternatively, we could do the iterations inside invalidateSessions if we pass in the rollbackSessionIds as a parameter. |
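The two options in this comment can be sketched roughly as follows. This is a minimal, self-contained illustration and not the server code: `SessionCatalogSketch`, `SessionState`, `onReplicationRollbackSketch`, and the `std::string` session ID are hypothetical stand-ins, and `std::optional` is used in place of `boost::optional`. The only behavior taken from the ticket is that passing no session invalidates everything.

```cpp
#include <optional>
#include <set>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a logical session ID (not the real server type).
using SessionId = std::string;

// Hypothetical per-session in-memory state.
struct SessionState {
    bool valid = true;  // false: state must be reloaded from storage before use
};

// Hypothetical, simplified catalog; the real MongoDSessionCatalog differs.
class SessionCatalogSketch {
public:
    void add(const SessionId& id) { _sessions[id] = SessionState{}; }

    bool isValid(const SessionId& id) const {
        auto it = _sessions.find(id);
        return it != _sessions.end() && it->second.valid;
    }

    // Mirrors the behavior described in the ticket: with no session given
    // (the boost::none case), every session in the catalog is invalidated.
    void invalidateSessions(const std::optional<SessionId>& id) {
        if (!id) {
            for (auto& entry : _sessions) entry.second.valid = false;
            return;
        }
        auto it = _sessions.find(*id);
        if (it != _sessions.end()) it->second.valid = false;
    }

    // Option 2 (assumed overload): take the whole set of rolled-back
    // sessions and do the iteration inside the catalog.
    void invalidateSessions(const std::set<SessionId>& ids) {
        for (const auto& id : ids) invalidateSessions(std::optional<SessionId>{id});
    }

private:
    std::unordered_map<SessionId, SessionState> _sessions;
};

// Option 1: the rollback hook iterates over rollbackSessionIds and
// invalidates one session at a time, leaving all other sessions untouched.
void onReplicationRollbackSketch(SessionCatalogSketch& catalog,
                                 const std::set<SessionId>& rollbackSessionIds) {
    for (const auto& id : rollbackSessionIds) {
        catalog.invalidateSessions(std::optional<SessionId>{id});
    }
}
```

Either variant leaves sessions that were not part of the rollback valid in memory. Doing the iteration inside the catalog would keep any locking in one place, though whether that matters depends on how the real catalog synchronizes access.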
| Comment by Gregory McKeon (Inactive) [ 05/Nov/18 ] |
|
Putting this here for now to revisit once prepare w/ rollback decides when we need to invalidate the sessions table. |
| Comment by Jack Mulrow [ 30/Oct/18 ] |
|
judah.schvimer, yeah, like Randolph said, I think we did it this way because rollback to a checkpoint didn't track rolled-back sessions at the time, and it wasn't worthwhile to add that ourselves since we didn't support retryable writes against nodes that aren't primary. It looks like we started tracking affected sessions in |
| Comment by Randolph Tan [ 30/Oct/18 ] |
|
I think this was for rollback to checkpoint. Since we can't tell what the diff is, we have to force the in-memory sessions to reload everything from storage. |
| Comment by Judah Schvimer [ 30/Oct/18 ] |
|
This was done in |