[SERVER-54828] Optimization: tenant migration blocker should not block reads for too long Created: 26/Feb/21 Updated: 27/Oct/23 Resolved: 09/Sep/21 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Andrew Shuvalov (Inactive) | Assignee: | A. Jesse Jiryu Davis |
| Resolution: | Works as Designed | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Description |
|
The current code is inefficient for both us and users. For users, a blocked transactional read will likely fail eventually anyway, because the migration is more likely to succeed than to abort. Moreover, if the user repeats the read a bit later, it may actually succeed because its cluster time will have moved forward. For us, it is inefficient because we can block too many reads and face a thundering herd when the migration is over. We can actually try to process what we can in the meantime. The solution is to fail the read after some fixed timeout (about 1 second) with an error code that is not automatically retried and that triggers re-doing the transaction. 1 second should give us enough backoff to avoid overloading the server with retries, while being short enough that users do not complain. |
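The proposal above can be sketched roughly as follows. This is a toy Python model, not the server's implementation: `MigrationBlocker`, `ReadBlockedTooLong`, and `wait_for_read` are hypothetical names introduced only for illustration, and the 1-second default is the value suggested in the description.

```python
import threading


class ReadBlockedTooLong(Exception):
    """Hypothetical error: the read waited past the timeout and should be redone."""


class MigrationBlocker:
    """Toy stand-in for the donor-side tenant migration blocker (assumption)."""

    def __init__(self):
        self._finished = threading.Event()
        self.outcome = None  # "committed" or "aborted" once the migration ends

    def finish(self, outcome):
        # Record the outcome and wake every read that is currently blocked.
        self.outcome = outcome
        self._finished.set()

    def wait_for_read(self, timeout_secs=1.0):
        # Block until the migration finishes, but give up after timeout_secs
        # instead of holding the read indefinitely.
        if not self._finished.wait(timeout_secs):
            raise ReadBlockedTooLong("migration still in progress")
        return self.outcome
```

A caller would treat `ReadBlockedTooLong` as a signal to redo the whole transaction after some backoff, rather than as an error to retry automatically.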
| Comments |
| Comment by A. Jesse Jiryu Davis [ 09/Sep/21 ] |
|
Thanks! |
| Comment by Andrew Shuvalov (Inactive) [ 09/Sep/21 ] |
|
Good point, I agree. Perhaps this ticket can be closed; I was trying to be proactive so as not to miss a possible deficiency. |
| Comment by A. Jesse Jiryu Davis [ 09/Sep/21 ] |
|
"From the user experience point of view - failing the read faster is the preferred outcome here." If the read fails faster on the donor, the client will immediately retry on the donor, and fail again, in a loop until the migration commits. Then the next read will be directed to the recipient (by Atlas routing) and succeed. In the current design (IIUC) a read on the donor blocks until the migration commits, then fails once. The client retries the read, which Atlas directs to the recipient, where it succeeds. I think this minimizes latency, because the client retries at precisely the right moment, instead of retrying periodically in a loop. It also minimizes load from useless retries during migration. Does that sound right to you or am I missing something? |
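The comparison in this comment can be illustrated with a small back-of-the-envelope model. It is only a sketch under simplifying assumptions that are not in the ticket: the read arrives at time 0, the migration commits at `commit_at` > 0, network and execution time are zero, a fail-fast donor rejects the read immediately, and the client retries every `retry_interval` seconds. The function names are hypothetical.

```python
import math


def block_until_commit(commit_at):
    """Current design: one blocked read on the donor fails at commit time,
    then one retry is routed to the recipient and succeeds.
    Returns (total attempts, latency in seconds)."""
    return 2, commit_at


def fail_fast_with_retries(commit_at, retry_interval):
    """Fail-fast alternative: attempts start at 0, retry_interval, 2 * retry_interval, ...
    Every attempt that starts before commit_at fails on the donor; the first
    attempt at or after commit_at reaches the recipient and succeeds.
    Returns (total attempts, latency in seconds)."""
    donor_failures = math.ceil(commit_at / retry_interval)
    latency = donor_failures * retry_interval
    return donor_failures + 1, latency


print(block_until_commit(2.5))           # (2, 2.5)
print(fail_fast_with_retries(2.5, 1.0))  # (4, 3.0)
```

In this model the blocking design never does worse on latency or attempt count, which matches the argument that the client retries "at precisely the right moment".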
| Comment by Andrew Shuvalov (Inactive) [ 09/Sep/21 ] |
|
jesse, first of all, good point that the 120 sec timeout is very likely good enough, and there is nothing to do here from the thundering herd point of view because the pending reads will not retry all at once. And yes, your description is correct. There is a second problem to consider: if the migration has already been running for several seconds, it is very likely that it will succeed, so the blocked read will fail anyway and will have to be retried. From the user experience point of view - failing the read faster is the preferred outcome here. It makes sense to wait only if there is a considerable chance that the read will succeed, and soon. This should be considered an optimization to improve user experience; the existing code is correct. |
| Comment by A. Jesse Jiryu Davis [ 08/Sep/21 ] |
|
andrew.shuvalov can you explain in some more detail what the problem is, and how your proposed solution is better? Is the problem that the thundering herd at the end of migration is overwhelming on the recipient? Is the intention of your solution to temporally spread out the load on the recipient? When you say, "we can actually try to process what we can in the meantime", do you mean reads with timestamps before the migration's "block timestamp"? |
| Comment by A. Jesse Jiryu Davis [ 29/Jul/21 ] |
|
The non-configurable 120 seconds is standard for all drivers; see https://github.com/mongodb/specifications/blob/master/source/transactions-convenient-api/transactions-convenient-api.rst. When timeoutMS is configured, it supersedes the 120 seconds: DRIVERS-555. |
| Comment by Esha Maharishi (Inactive) [ 01/Mar/21 ] |
|
Note that drivers only retry even TransientTransactionError errors up to a certain limit, e.g. the PHP driver retries for a (non-configurable) 120 seconds. |
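The 120-second retry limit mentioned in the comments can be pictured with a simplified retry loop. This is a sketch only: it does not use any real driver, it omits the separate commit-retry logic that the convenient transactions API also defines, and `with_transaction`, `TransientTransactionError`, and the injectable `clock` parameter are stand-ins introduced here for illustration.

```python
import time

RETRY_LIMIT_SECS = 120.0  # limit described in the comments above


class TransientTransactionError(Exception):
    """Stand-in for a driver error carrying the TransientTransactionError label."""


def with_transaction(callback, clock=time.monotonic, limit=RETRY_LIMIT_SECS):
    """Run callback, retrying on transient errors until the time limit is reached."""
    start = clock()
    while True:
        try:
            return callback()
        except TransientTransactionError:
            # Retry only while less than `limit` seconds have elapsed since
            # the first attempt; otherwise surface the error to the caller.
            if clock() - start >= limit:
                raise
```

Under this model, a read that keeps failing on the donor is retried by the client for at most about 120 seconds before the application sees the error.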