[SERVER-53592] Investigate SERVER-52750 Created: 05/Jan/21  Updated: 25/Feb/22  Resolved: 25/Feb/22

Status: Closed
Project: Core Server
Component/s: Internal Code
Affects Version/s: None
Fix Version/s: None

Type: Task Priority: Major - P3
Reporter: Matthew Saltz (Inactive) Assignee: Tyler Seip (Inactive)
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
depends on SERVER-54932 Modify BaseCloner to accept a cancell... Backlog
depends on SERVER-54933 AbstractAsyncComponent should support... Backlog
depends on SERVER-54934 Modify ReshardingCoordinatorService t... Backlog
depends on SERVER-54943 Modify Resharding{Recipient|Donor}Ser... Backlog
depends on SERVER-54931 AsyncRequestsSender should be modifie... Closed
depends on SERVER-53389 TenantMigration{Donor, Recipient}Serv... Backlog
depends on SERVER-53931 Investigate how to cancel recipients ... Closed
Sprint: Service Arch 2021-02-22, Service Arch 2021-03-08
Participants:
Story Points: 2

 Description   

This ticket is to investigate, as a prerequisite for SERVER-52750, whether any PrimaryOnlyServices rely on ScopedTaskExecutor shutdown for cleanup rather than using CancelationTokens. If any do, create tickets to convert that cleanup to CancelationTokens so that we can do SERVER-52750.
 



 Comments   
Comment by Tyler Seip (Inactive) [ 04/Mar/21 ]

All relevant tickets have been spawned as a result of this investigation, closing.

Comment by Tyler Seip (Inactive) [ 04/Mar/21 ]

max.hirschhorn points out that I've accidentally omitted two more PrimaryOnlyServices (thanks Max!): ReshardingDonorService and ReshardingRecipientService.

Per file, the functions that need to be modified to support cancellation are:
ReshardingDonorService

// These can all be modified in place by plumbing in cancellation tokens at the call site.
_awaitAllRecipientsDoneCloningThenTransitionToDonatingOplogEntries
_awaitAllRecipientsDoneApplyingThenTransitionToPreparingToBlockWrites
_awaitCoordinatorHasDecisionPersistedThenTransitionToDropping

ReshardingRecipientService

// These can all be modified in place by plumbing in cancellation tokens at the call site.
_awaitAllDonorsPreparedToDonateThenTransitionToCreatingCollection
_awaitCoordinatorHasDecisionPersistedThenTransitionToRenaming
 
// This requires both in-place modification as above and changes to ReshardOplogApplier, which are outlined in SERVER-53931
_awaitAllDonorsBlockingWritesThenTransitionToStrictConsistency

Comment by Tyler Seip (Inactive) [ 04/Feb/21 ]

Of the three extant PrimaryOnlyServices (ReshardingCoordinatorService, TenantMigrationDonorService, and TenantMigrationRecipientService), only TenantMigrationDonorService appears to have been adapted to use its cancellation tokens, and that work does not yet appear to be complete. We have a planned mechanism for making cancellable executors in https://jira.mongodb.org/browse/SERVER-53326, and extending it to cover the additional context that a (Scoped)TaskExecutor covers should be pretty straightforward.

Per file, the functions that need to be modified to support cancellation are:
ReshardingCoordinatorService

// The following require plumbing cancellation tokens into sharding_util::sendCommandToShards and ultimately into the AsyncRequestsSender, which needs to be modified to support the cancellable future interface instead of the old callback interface.
_tellAllParticipantsToRefresh
_tellAllDonorsToRefresh
_tellAllRecipientsToRefresh
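
To illustrate the kind of change meant for the _tell* functions above, here is a rough sketch of wrapping a callback-style sender so that it returns a future and honors cancellation. Everything in it is hypothetical: ResponseCallback, sendAsFuture, and the isCanceled predicate are stand-ins built on std::promise and std::future, not the actual AsyncRequestsSender, sharding_util::sendCommandToShards, or server future types. It also only checks for cancellation before sending; a real change would presumably need to interrupt in-flight requests as well.

```cpp
#include <functional>
#include <future>
#include <memory>
#include <stdexcept>
#include <string>

// Hypothetical old-style interface: the response is delivered through a callback.
using ResponseCallback = std::function<void(const std::string&)>;

// Hypothetical adapter: wraps a callback-style sender and returns a future instead.
// `isCanceled` stands in for a cancellation token check (an assumption, not a server API).
std::future<std::string> sendAsFuture(
    const std::function<void(ResponseCallback)>& sendWithCallback,
    const std::function<bool()>& isCanceled) {
    auto promise = std::make_shared<std::promise<std::string>>();
    auto future = promise->get_future();

    // If cancellation was already requested, fail the future without sending anything.
    if (isCanceled()) {
        promise->set_exception(std::make_exception_ptr(std::runtime_error("canceled")));
        return future;
    }

    // Otherwise, fulfill the promise whenever the old-style callback fires.
    sendWithCallback([promise](const std::string& response) { promise->set_value(response); });
    return future;
}
```

With this shape, callers wait on the returned future rather than supplying their own callback, and a canceled request surfaces as an error on that future.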
 
// The following can be modified in place to take the cancel token we have access to at their call sites.
_awaitAllDonorsReadyToDonate
_awaitAllRecipientsFinishedCloning
_awaitAllRecipientsFinishedApplying
_awaitAllParticipantShardsRenamedOrDroppedOriginalCollection
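
As a minimal sketch of what "modified in place to take the cancel token" could look like for the _await* functions above: the wait step takes a token as a parameter and gives up once the token is canceled. CancelSource, CancelToken, WaitResult, and awaitUntilReady are hypothetical stand-ins written for this example, not the server's CancelationSource/CancelationToken, and the polling loop is only there to keep the sketch self-contained.

```cpp
#include <atomic>
#include <chrono>
#include <functional>
#include <memory>
#include <thread>
#include <utility>

// Hypothetical token: a read-only view of a shared cancellation flag.
class CancelToken {
public:
    explicit CancelToken(std::shared_ptr<std::atomic<bool>> flag) : _flag(std::move(flag)) {}
    bool isCanceled() const { return _flag->load(); }

private:
    std::shared_ptr<std::atomic<bool>> _flag;
};

// Hypothetical source: owns the flag and hands out tokens that observe it.
class CancelSource {
public:
    CancelSource() : _flag(std::make_shared<std::atomic<bool>>(false)) {}
    void cancel() { _flag->store(true); }
    CancelToken token() const { return CancelToken(_flag); }

private:
    std::shared_ptr<std::atomic<bool>> _flag;
};

enum class WaitResult { kDone, kCanceled };

// Hypothetical analogue of an _awaitAll...() step. The token is passed in from the
// call site, so the wait can stop without relying on executor shutdown.
WaitResult awaitUntilReady(const std::function<bool()>& ready, const CancelToken& token) {
    while (!ready()) {
        if (token.isCanceled()) {
            return WaitResult::kCanceled;
        }
        std::this_thread::sleep_for(std::chrono::milliseconds(1));
    }
    return WaitResult::kDone;
}
```

The point of the sketch is only the signature change: the caller already holds a token, so each wait step receives it as an argument instead of depending on the ScopedTaskExecutor being shut down.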

TenantMigrationDonorService

// This function takes an executor but doesn't use it, so it technically doesn't need to be modified. Why does it take an executor?
tenant_migration_util::storeExternalClusterTimeKeyDocs
// This function needs to be modified, and that work is already planned in SERVER-53389.
_waitForMajorityWriteConcern

TenantMigrationRecipientService

// The following can be modified in place to take the top level cancellation token we have access to at their call sites.
_createAndConnectClients
_initializeStateDoc
_onCloneSuccess
_getDataConsistentFuture
_markStateDocAsGarbageCollectable
_updateStateDocForMajority
 
// This function requires a modification of BaseCloner to accept a cancellation token in its runOnExecutorEvent method.
_startTenantAllDatabaseCloner
 
// The following require modifications to AbstractAsyncComponent to support taking in a cancellation token.
_startOplogFetcher
TenantOplogApplier (in run())
