[SERVER-54828] Optimization: tenant migration blocker should not block reads for too long Created: 26/Feb/21  Updated: 27/Oct/23  Resolved: 09/Sep/21

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Major - P3
Reporter: Andrew Shuvalov (Inactive) Assignee: A. Jesse Jiryu Davis
Resolution: Works as Designed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
is related to SERVER-54661 Snapshotting during tenant migration ... Closed
Participants:

 Description   

The current code is inefficient for both us and users. For users, a blocked transactional read will likely fail eventually anyway, because the migration is more likely to succeed than to abort. Moreover, if the user repeats the read a bit later, it may actually succeed because its cluster time will have moved forward.

For us, it is inefficient because we can block too many reads and face a thundering herd when the migration is over. We could instead try to process what we can in the meantime.

The solution is to fail the read after some fixed timeout (about 1 second) with an error code that is not automatically retried and that triggers redoing the transaction. 1 second should give enough backoff to keep retries from overloading the server, and is short enough that users will not complain.
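
A minimal sketch of the proposed behavior, for illustration only. It assumes the blocker can be modeled as a `threading.Event` that is set once reads are unblocked; `ReadBlockedTooLong` and `read_with_block_timeout` are hypothetical names, not server code or an existing error code.

```python
import threading


class ReadBlockedTooLong(Exception):
    """Hypothetical error: not retried automatically, so the caller redoes the transaction."""


def read_with_block_timeout(unblocked: threading.Event, do_read, timeout_s: float = 1.0):
    """Wait up to timeout_s for the blocker to clear, then run the read or fail."""
    # Event.wait returns True if the event was set before the timeout, else False.
    if not unblocked.wait(timeout=timeout_s):
        raise ReadBlockedTooLong(f"read blocked for more than {timeout_s}s")
    return do_read()
```

With the default of 1 second, a read that arrives while the blocker is active either proceeds as soon as the blocker clears or fails after roughly 1 second.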



 Comments   
Comment by A. Jesse Jiryu Davis [ 09/Sep/21 ]

Thanks!

Comment by Andrew Shuvalov (Inactive) [ 09/Sep/21 ]

Good point, I agree. Perhaps this ticket can be closed; I was trying to be proactive so as not to miss a possible deficiency.

Comment by A. Jesse Jiryu Davis [ 09/Sep/21 ]

"From the user experience point of view - failing the read faster is the preferred outcome here." If the read fails faster on the donor, the client will immediately retry on the donor, and fail again, in a loop until the migration commits. Then the next read will be directed to the recipient (by Atlas routing) and succeed.

In the current design (IIUC) a read on the donor blocks until the migration commits, then fails once. The client retries the read, which Atlas directs to the recipient, where it succeeds. I think this minimizes latency, because the client retries at precisely the right moment, instead of retrying periodically in a loop. It also minimizes load from useless retries during migration.

Does that sound right to you or am I missing something?
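
The comparison above can be put in rough numbers. This is a simplified model under stated assumptions, not a measurement: the migration commits `commit_after_s` seconds (greater than zero) after the first read arrives, a fail-fast donor rejects reads immediately, the client retries every `interval_s` seconds, and any attempt made at or after the commit is routed to the recipient and succeeds. Both function names are hypothetical.

```python
import math


def periodic_retry(commit_after_s: float, interval_s: float):
    """Fail-fast donor: attempts at 0, interval_s, 2*interval_s, ... until the commit."""
    # Every attempt that starts before the commit fails on the donor.
    failed_attempts = math.ceil(commit_after_s / interval_s)
    # The first attempt at or after the commit succeeds on the recipient.
    success_time_s = failed_attempts * interval_s
    return failed_attempts, success_time_s


def block_until_commit(commit_after_s: float):
    """Blocking donor: the read waits, fails once at commit, and the retry succeeds."""
    return 1, commit_after_s
```

Under this model, a migration that commits after 2.5 seconds costs three failed attempts and a success at 3 seconds with a 1-second retry interval, versus one failed attempt and a success at 2.5 seconds when the read blocks.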

Comment by Andrew Shuvalov (Inactive) [ 09/Sep/21 ]

jesse, first of all, good point: the 120-second timeout is very likely good enough, and there is nothing to do here about the thundering herd, because the pending reads will not all retry at once. And yes, your description is correct.

There is a second problem to consider: if the migration has already been running for several seconds, it is very likely to succeed, so the blocked read will fail anyway and will have to be retried. From the user experience point of view - failing the read faster is the preferred outcome here. It makes sense to wait only if there is a considerable chance that the read will succeed, and soon.

This should be considered an optimization to improve user experience; the existing code is correct.

Comment by A. Jesse Jiryu Davis [ 08/Sep/21 ]

andrew.shuvalov can you explain in some more detail what the problem is, and how your proposed solution is better? Is the problem that the thundering herd at the end of migration overwhelms the recipient? Is the intention of your solution to temporally spread out the load on the recipient? When you say, "we can actually try to process what we can in the meantime", do you mean reads with timestamps before the migration's "block timestamp"?

Comment by A. Jesse Jiryu Davis [ 29/Jul/21 ]

The non-configurable 120 seconds is standard for all drivers, see https://github.com/mongodb/specifications/blob/master/source/transactions-convenient-api/transactions-convenient-api.rst

When timeoutMS is configured it supersedes the 120 seconds: DRIVERS-555.
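
For readers unfamiliar with the 120-second limit, here is a simplified sketch of the retry loop. It is not driver code: `run_with_retry` and the `TransientTransactionError` exception class are stand-ins (real drivers attach `TransientTransactionError` as an error label and also handle commit retries separately), and the injectable `clock` parameter is an assumption added to make the sketch testable.

```python
import time

RETRY_LIMIT_S = 120.0  # fixed retry window discussed in this ticket


class TransientTransactionError(Exception):
    """Stand-in for an error carrying the TransientTransactionError label."""


def run_with_retry(callback, clock=time.monotonic, limit_s=RETRY_LIMIT_S):
    """Re-run callback on transient errors until it succeeds or limit_s has elapsed."""
    start = clock()
    while True:
        try:
            return callback()
        except TransientTransactionError:
            # Give up once the retry window is exhausted; otherwise try again.
            if clock() - start >= limit_s:
                raise
```

A read that stays blocked on the donor for less than the window fails once at commit, and the retry runs well within the limit.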

Comment by Esha Maharishi (Inactive) [ 01/Mar/21 ]

Note that drivers only retry even TransientTransactionError errors up to a certain limit, e.g. the PHP driver retries for a (non-configurable) 120 seconds.

Generated at Thu Feb 08 05:34:36 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.