[SERVER-40013] upgrade downgrade support for config.transactions startTimestamp field Created: 07/Mar/19  Updated: 02/Oct/19  Resolved: 17/Apr/19

Status: Closed
Project: Core Server
Component/s: Replication, Upgrade/Downgrade
Affects Version/s: None
Fix Version/s: None

Type: Task Priority: Major - P3
Reporter: Judah Schvimer Assignee: Jason Chan
Resolution: Done Votes: 0
Labels: bigtxns_upgrade_downgrade, todo_in_code
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Cloners
clones SERVER-39850 Upgrade/downgrade for oldest active t... Closed
Depends
depends on SERVER-39680 Maintain the oldest active transactio... Closed
Related
related to SERVER-42540 Complete TODO listed in SERVER-40013 Closed
related to SERVER-43435 Complete TODO listed in SERVER-40013 Closed
is related to SERVER-36498 Remove 'config.transactions' entries ... Closed
Sprint: Repl 2019-04-22
Participants:

 Description   

SERVER-39680 introduced a startTimestamp field. We need to add upgrade/downgrade support similar to SERVER-36498 since lower versions won't recognize it and 4.2 won't have it on upgrade.



 Comments   
Comment by Githook User [ 02/Oct/19 ]

Author:

{'username': 'jasonjhchan', 'email': 'jason.chan@mongodb.com', 'name': 'Jason Chan'}

Message: SERVER-42540 Remove TODO from SERVER-40013
Branch: master
https://github.com/mongodb/mongo/commit/27dab1976592948aabfffdfdb98b1f8313bb8741

Comment by Jason Chan [ 17/Apr/19 ]

Closing ticket as there is no work needed after closer evaluation.

Comment by Siyuan Zhou [ 17/Apr/19 ]

SGTM. Great summary!

Comment by Judah Schvimer [ 17/Apr/19 ]

Ah that makes sense. SGTM to leave out the invariant.

Comment by Jason Chan [ 17/Apr/19 ]

judah.schvimer

I accidentally left out another potential reason why the invariant to check only the in-memory transaction state wouldn't work from my previous comment. 4.0 format transactions have an inProgress in-memory state but no on-disk state so I don't see a way to verify that there are no longer any in-progress 4.2 format transactions without directly looking at the config.transactions table.

Siyuan was comfortable with foregoing the invariant if adding one required reformatting/reworking parts of the code that might make it less intuitive to the readers. I don't see a convenient place to stick the invariant without rewriting parts of forEachSessionWithCheckout and doing a disk look-up directly. Do you have any thoughts?

Comment by Judah Schvimer [ 17/Apr/19 ]

That all makes sense!

validation function here only checks for the in-memory state

The in-memory state should reflect the on-disk state or be refreshed from disk implicitly on session checkout. I think adding such an invariant, especially on such a rare path would be valuable.

Comment by Jason Chan [ 17/Apr/19 ]

judah.schvimer siyuan.zhou

The intention of this ticket was to ensure the consistency of config.transactions after upgrade/downgrade regarding the startOpTime but after discussing with Siyuan, I believe there may actually be no work at all required for this ticket.

On upgrade: No work needed to be done as all 4.0 transactions in config.transactions would have been committed and therefore need no startOpTime on upgrade. Committed transactions do not have a startOpTime field as they are no longer active.

On downgrade: After talking to Siyuan, we convinced ourselves that it should be impossible for a 4.2 format transaction to have an inProgress state field in config.transactions when we are downgrading the transaction table. We grab and release the global lock in S mode here to ensure that we wait for all running transactions to have been committed/aborted before we downgrade the transaction table. Once the lock is released, only 4.0 format transactions are allowed as we gate the new oplog format on FCV < fullyUpgradedTo42. Therefore, I believe there is no work needed for this case either.

I originally wanted to add an invariant in downgradeTransactionTable as a sanity check to make sure no transactions are in the inProgress state, but the validation function here only checks for the in-memory state. I am willing to forego adding the invariant as to avoid the extra disk look-up in the validation step of forEachSessionWithCheckout.

Comment by Judah Schvimer [ 11/Apr/19 ]

SGTM.

Comment by Siyuan Zhou [ 11/Apr/19 ]

SGTM. When reading the code, my only question is about concurrency. We don’t check out the sessions, implying the concurrency isn’t a concern here. If that’s true, why do we need to double check its state?

Comment by Jason Chan [ 11/Apr/19 ]

siyuan.zhou judah.schvimer

I had some questions regarding the scope/work around this ticket. Judah pointed out that the problem with upgrade/downgrade with regards to the startOpTime (it's been changed from Timestamp to OpTime) is that on downgrade, lower versions won't recognize the startOpTime field and on upgrade to 4.2, the field won't exist in the table.

After investigating SERVER-36498 and SERVER-39680 I wanted to clarify some observations that I made:

1. In 4.0, only committed transactions will have records in the transaction table. With the current behavior from jesse's ticket, committed entries do not have a startOpTime field as the value is no longer necessary as the transaction is no longer active. This leads me to believe that no work may be necessary in the upgrade case.

2. Currently on downgrade to 4.0, all the transaction entries with a 'state' field will be removed from the config.transactions table. The only work here should be to simply extend the list of states here to include the inTxn state.

Does this implementation plan sound correct to you guys?

Generated at Thu Feb 08 04:53:47 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.