[SERVER-60953] TenantMigrationDonorService::getDurableState should return a future Created: 22/Oct/21 Updated: 29/Oct/23 Resolved: 15/Nov/21 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | 5.2.0-rc0 |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Matt Broadstone | Assignee: | Didier Nadeau |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||
| Backwards Compatibility: | Fully Compatible | ||||||||
| Sprint: | Server Serverless 2021-11-15, Server Serverless 2021-11-29 | ||||||||
| Participants: | |||||||||
| Description |
|
This method has a side-effect of waiting for the fulfillment of a promise indicating the state document has been inserted and majority-committed. This pollutes the TMDS API by requiring an opCtx be passed in rather than letting the caller decide where to block on the future. We should consider returning a Future<DurableState> instead. |
| Comments |
| Comment by Githook User [ 15/Nov/21 ] | ||||||||
|
Author: {'name': 'Didier Nadeau', 'email': 'didier.nadeau@mongodb.com', 'username': 'nadeaudi'}Message: | ||||||||
| Comment by Esha Maharishi (Inactive) [ 05/Nov/21 ] | ||||||||
|
didier.nadeau, that sounds good to me. My understanding is the assert is "tenant id's are different, or durable state exists and migration is aborted and garbage collectable." (I think your comment here is missing the "durable state exists", but I saw you added this in the PR.) | ||||||||
| Comment by Didier Nadeau [ 05/Nov/21 ] | ||||||||
|
Following some reflection (and yesterday's discussion with matt.broadstone ) I believe the idea of an optional seems better :
The instance conflict logic does the following :
This will fail if :
In the case of a tenant migration that recently started (before a durable state exists), this would block until we get the first durable state (most likely kUninitialized or kAbortingIndexBuilds), and then it would fail (as the state is not kAborted). Presumably the caller would then try again later once the state is aborted. It seems we can simply cut the wait and fail right away if a durableState does not exist because, in that scenario, the instance is in kUninitialized state. In that case too the caller can try again later. Using a boost::optional seems logical as, at the beginning, it will simply be empty which allow us to fail right away. I also added a getInitialFuture that will be fulfilled at the same time that durableState gets its first value, allowing users to wait on the first durable state if needed. matt.broadstone esha.maharishi let me know if you think that wouldn't work ! | ||||||||
| Comment by Matt Broadstone [ 01/Nov/21 ] | ||||||||
|
After discussion we decided that we should move the _initialDonorStateDurablePromise waiting outside of the getDurableState method, and instead make a top-level future that callers can wait on. We'll need to make sure that we explicitly wait for the initial donor state in both the donorStartMigration and donorAbortMigration commands (at least). | ||||||||
| Comment by Esha Maharishi (Inactive) [ 26/Oct/21 ] | ||||||||
|
However, constructInstance should not wait on any promises, since it's called under the PrimaryOnlyService's mutex. This ticket could fix this by adding two variants of getDurableState: a{{ Future<DurableState> getDurableStateFuture}} and a boost::optional<DurableState> getDurableState. constructInstance could use the second and throw ConflictingOperationInProgress if it returns boost::none. |