[SERVER-54661] Snapshotting during tenant migration block will fail all pending transactional reads Created: 19/Feb/21  Updated: 29/Oct/23  Resolved: 01/Mar/21

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 5.0.0

Type: Bug Priority: Major - P3
Reporter: Andrew Shuvalov (Inactive) Assignee: Andrew Shuvalov (Inactive)
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Related
related to SERVER-54828 Optimization: tenant migration blocke... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Participants:
Linked BF Score: 12

 Description   

This is from BF-20193. The test fails because a transactional read was blocked by the tenant migration blocker, and by the time it was unblocked WiredTiger had already moved the oldest timestamp forward, so the read had to fail after the migration was aborted. So this really happens by design.

Here the read fails:

[js_test:tenant_migration_concurrent_reads] 2021-02-13T14:11:25.265+0000 "errmsg" : "Read timestamp Timestamp(1613225479, 1) is older than the oldest available timestamp.",

Here is the evidence that WT moved the oldest timestamp:

[js_test:tenant_migration_concurrent_reads] 2021-02-13T14:11:26.358+0000 d21522| 2021-02-13T14:11:26.357+00:00 D2 RECOVERY 23988 [SignalHandler] "Shutdown timestamps.","attr":{"Stable Timestamp":{"$timestamp":{"t":1613225485,"i":2}},"Initial Data Timestamp":{"$timestamp":{"t":1613225407,"i":1}},"Oldest Timestamp":{"$timestamp":{"t":1613225480,"i":2}}}
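To make the comparison between the two log lines explicit, here is a small sketch in plain JavaScript. The helper name compareTimestamps and the {t, i} object shape are my own illustration, not server code; the values are copied from the log lines above.

```javascript
// Hypothetical helper (not server code): orders two {t, i} timestamps,
// first by seconds (t), then by increment (i). Returns -1, 0, or 1.
function compareTimestamps(a, b) {
    if (a.t !== b.t) {
        return a.t < b.t ? -1 : 1;
    }
    if (a.i !== b.i) {
        return a.i < b.i ? -1 : 1;
    }
    return 0;
}

// Read timestamp from the errmsg: Timestamp(1613225479, 1).
const readTimestamp = {t: 1613225479, i: 1};
// Oldest timestamp from the "Shutdown timestamps." log line.
const oldestTimestamp = {t: 1613225480, i: 2};

// The read is too old when its timestamp sorts before the oldest timestamp.
const readIsTooOld = compareTimestamps(readTimestamp, oldestTimestamp) < 0;
console.log("read is older than oldest timestamp:", readIsTooOld);
```

The read timestamp is one second behind the oldest timestamp, which matches the error message.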

In my opinion, we should simply fix the tenant_migration_concurrent_reads.js test to allow the TransientTransactionError with the oldest-timestamp message. After all, we will fail the read anyway if the tenant migration succeeds (the common case).

I'm leaving this as a new bug for someone to agree or disagree with my conclusion. If we keep it as a normal condition, the fix is 3 lines in the test.
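A rough sketch of the kind of check the test could make, in plain JavaScript. This is not the committed change; the function name isExpectedSnapshotFailure is hypothetical, and I am assuming the failed command response carries an errorLabels array and the errmsg shown above.

```javascript
// Substring taken from the errmsg in the BF log.
const OLDEST_TIMESTAMP_MSG = "is older than the oldest available timestamp";

// Hypothetical predicate: true when a failed response is the tolerated case,
// i.e. a TransientTransactionError whose message says the read timestamp
// fell behind the oldest available timestamp.
function isExpectedSnapshotFailure(res) {
    const labels = Array.isArray(res.errorLabels) ? res.errorLabels : [];
    const errmsg = typeof res.errmsg === "string" ? res.errmsg : "";
    return res.ok === 0 &&
        labels.includes("TransientTransactionError") &&
        errmsg.includes(OLDEST_TIMESTAMP_MSG);
}

// Example response, shaped after the failure in the log (assumed shape).
const failedRead = {
    ok: 0,
    errmsg: "Read timestamp Timestamp(1613225479, 1) is older than the oldest available timestamp.",
    errorLabels: ["TransientTransactionError"],
};
console.log("tolerated failure:", isExpectedSnapshotFailure(failedRead));
```

Any other failure, or a transient error with a different message, would still fail the test under this check.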

To some extent my question is about migration design: does it make sense to block all reads for an indeterminate time if they will likely fail anyway? I understand that the user won't have much option other than to retry. But the retry may complete faster, because if the new read has a newer timestamp it will not block.
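To illustrate the retry argument, here is a minimal sketch in plain JavaScript. Everything in it is a stand-in: readWithRetry, attemptRead and nextReadTimestamp are hypothetical names, and the fake read below only simulates a server whose oldest timestamp has already advanced.

```javascript
// Hypothetical retry loop: calls attemptRead with a fresh timestamp on each
// attempt and stops on success or on any non-transient failure.
function readWithRetry(attemptRead, nextReadTimestamp, maxAttempts = 3) {
    let lastResponse = null;
    for (let attempt = 1; attempt <= maxAttempts; attempt++) {
        lastResponse = attemptRead(nextReadTimestamp());
        const labels = lastResponse.errorLabels || [];
        if (lastResponse.ok === 1 || !labels.includes("TransientTransactionError")) {
            return {response: lastResponse, attempts: attempt};
        }
    }
    return {response: lastResponse, attempts: maxAttempts};
}

// Simulated server: reads older than the oldest timestamp (seconds only) fail.
const simulatedOldestSeconds = 1613225480;
function fakeRead(readSeconds) {
    if (readSeconds < simulatedOldestSeconds) {
        return {ok: 0, errorLabels: ["TransientTransactionError"]};
    }
    return {ok: 1};
}

// First attempt uses the stale timestamp, the retry uses a newer one.
const candidateSeconds = [1613225479, 1613225486];
let nextIndex = 0;
const outcome = readWithRetry(fakeRead, () => candidateSeconds[nextIndex++]);
console.log("attempts:", outcome.attempts, "ok:", outcome.response.ok);
```

In this simulation the first read fails as transient and the retry with the newer timestamp succeeds on the second attempt.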



 Comments   
Comment by Githook User [ 01/Mar/21 ]

Author: {'name': 'Andrew Shuvalov', 'email': 'andrew.shuvalov@mongodb.com', 'username': 'shuvalov-mdb'}

Message: SERVER-54661: prevent transactional read in test to fail on snapshotting
Branch: master
https://github.com/mongodb/mongo/commit/736d29af863ce061a5d3d2759f9d0b72075dcf09

Comment by Andrew Shuvalov (Inactive) [ 26/Feb/21 ]

Filed SERVER-54828 as suggested optimization related to this error.

Generated at Thu Feb 08 05:34:08 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.