  1. Core Server
  2. SERVER-74466

Attach OperationKey in async_rpc for non-hedged requests

    • Type: Improvement
    • Resolution: Fixed
    • Priority: Major - P3
    • Fix Version/s: 7.1.0-rc0
    • Affects Version/s: None
    • Component/s: Internal Code
    • Labels: None
    • Service Arch
    • Fully Compatible
    • Service Arch 2023-03-20, Service Arch 2023-04-03, Service Arch 2023-04-17, Service Arch 2023-05-01, Service Arch 2023-05-15, Service Arch 2023-05-29, Service Arch 2023-06-12

      This ticket is split from SERVER-71764 and covers cancellation of non-hedged operations: if a non-hedged operation is canceled, async_rpc should send a _killOperations command to the target. SERVER-73016 addressed this for hedged operations.

      Notes from the original ticket:

      The RPC library should append an OperationKey (GUID) to every operation it sends. When such an operation is canceled and networking has begun for it (i.e. any data may have been sent), the RPC library unconditionally sends a fire-and-forget _killOperations for that OperationKey to the same remote node.
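
      The contract above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the server's code: Command, OutgoingOp, prepare, and onCancel are hypothetical names, a string map stands in for BSON, and the field names "clientOperationKey" and "operationKeys" are assumptions made for the example.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>
#include <utility>

// Hypothetical stand-in for a command document (the server uses BSON).
using Command = std::map<std::string, std::string>;

// One outgoing operation as tracked by this sketch.
struct OutgoingOp {
    Command cmd;
    std::string target;
    std::string operationKey;
    bool networkingBegun = false;  // set once any data may have been sent
};

// Stamp the key onto the command before it is sent.
// The field name "clientOperationKey" is an assumption for illustration.
OutgoingOp prepare(Command cmd, std::string target, std::string key) {
    assert(!key.empty() && "every operation needs an OperationKey");
    cmd["clientOperationKey"] = key;
    OutgoingOp op;
    op.cmd = std::move(cmd);
    op.target = std::move(target);
    op.operationKey = std::move(key);
    return op;
}

// On cancellation, return the kill command to send to op.target,
// or nothing if networking never began for this operation.
std::optional<Command> onCancel(const OutgoingOp& op) {
    if (!op.networkingBegun) {
        return std::nullopt;
    }
    Command kill;
    kill["_killOperations"] = "1";
    kill["operationKeys"] = op.operationKey;  // assumed field name
    return kill;
}
```

      In this sketch the caller would send the returned command to op.target without waiting for a reply, which is the fire-and-forget part of the contract.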

      Because the library currently delegates networking to TaskExecutor/NetworkInterfaceTL, implementing this cancellation contract first involves canceling any network interface operations that have already begun. This can be done with the cancellation token passed to the network interface, and should happen automatically when the async_rpc API's user cancels the token they passed in. The async_rpc API should then inspect the NetworkInterface operation's response to see whether it succeeded or was successfully canceled; if it succeeded, the async_rpc layer needs to send _killOperations itself.

      If the NetworkInterface operation was canceled successfully, the network interface may or may not have sent the required _killOperations. Here we should do a short, timeboxed investigation into the best solution. One option is to fix the network interface so that it always sends _killOperations in this case, if that fix is small and simple. If it is not, the async_rpc API could always send it, even if that results in a duplicate _killOperations.

       

      Update: For non-hedged operations, async_rpc only handles the request and response and does nothing to progress the operation past this layer, so it needs no extra work to handle cancellation for them. The revised goal of this ticket is still to append an OperationKey to all operations and test that functionality in the async_rpc layer, then test that the OperationKey can be passed into the NetworkInterfaceTL layer.

      NetworkInterfaceTL already has cancellation and _killOperations functionality, and the async_rpc layer has cancellation and _killOperations functionality for hedged operations. End-to-end testing is not possible because no operations currently use async_rpc, so we only test that the OperationKey is accepted and propagated by each of the two layers.
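
      A layer-boundary test of this kind might look roughly like the sketch below. It is only an illustration under stated assumptions: RemoteRequest, RecordingNetworkLayer, and RpcLayer are hypothetical stand-ins for the real async_rpc and NetworkInterfaceTL types, and the recording layer performs no networking.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <vector>

// Hypothetical request shape handed from the RPC layer to the network layer.
struct RemoteRequest {
    std::string target;
    std::string commandName;
    std::optional<std::string> operationKey;
};

// Stand-in for the network layer: it records each request it receives
// so a test can inspect what was propagated to it.
class RecordingNetworkLayer {
public:
    void startCommand(const RemoteRequest& req) { _seen.push_back(req); }
    const std::vector<RemoteRequest>& seen() const { return _seen; }

private:
    std::vector<RemoteRequest> _seen;
};

// Stand-in for the RPC layer: it accepts an OperationKey from the caller
// and forwards it unchanged on the request it builds.
class RpcLayer {
public:
    explicit RpcLayer(RecordingNetworkLayer& net) : _net(net) {}

    void send(const std::string& target,
              const std::string& commandName,
              const std::string& operationKey) {
        assert(!operationKey.empty() && "OperationKey must be set");
        RemoteRequest req{target, commandName, operationKey};
        _net.startCommand(req);
    }

private:
    RecordingNetworkLayer& _net;
};
```

      A test would construct both objects, call send with a known key, and check that the request recorded by the lower layer carries the same key. That mirrors the limited scope described above: acceptance and propagation are verified, and end-to-end cancellation is not.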

            Assignee: Alex Li (alex.li@mongodb.com)
            Reporter: Amirsaman Memaripour (amirsaman.memaripour@mongodb.com)
            Votes: 0
            Watchers: 3
