SubBaton waitUntil not canceled by shutdown

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: None
    • Server Programmability
    • ALL
    • 200

      SubBaton::waitUntil checks whether the subbaton has been shut down before registering a timer on the underlying baton, but it does not cancel that timer when the subbaton shuts down, nor does it re-check for shutdown before executing the callbacks once the timer fires. This can easily lead to situations where callbacks attached to a subbaton run long after the subbaton itself has been destroyed, likely along with other state those callbacks interact with.

      We should update SubBaton to cancel any outstanding waitUntil futures when it is shut down. One way to achieve this is to introduce an intermediary promise/future pair and return that future instead: it would be fulfilled either by the main baton when the timer fires or by shutdown, whichever happens first. Alternatively, or in addition, we could use cancellation tokens in shutdown to interrupt the wait. With a CancellationToken alone, though, it might be tricky to ensure the cancellation has been "caught" and the timer actually interrupted, since that relies on scheduling more work on the baton.
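
      A rough, standalone sketch of the intermediary promise/future idea follows. It does not use the server's Promise/Future or Baton types. ToyBaton, ToySubBaton, OneShot, WaitResult, and demo are all hypothetical names made up for illustration, and timers are fired manually instead of by a deadline. The point is only that the waiter's callback runs exactly once, from whichever of "timer fired" or "shutdown" arrives first. Apart from OneShot, the toy classes are single-threaded for brevity.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration; not the server's real types.
enum class WaitResult { kTimerFired, kShutdown };

// Intermediary one-shot slot: delivers the first result it is given
// and ignores every later attempt to complete it.
class OneShot {
public:
    using Callback = std::function<void(WaitResult)>;

    explicit OneShot(Callback cb) : _cb(std::move(cb)) {}

    // Returns true only if this call delivered the result.
    bool complete(WaitResult r) {
        Callback cb;
        {
            std::lock_guard<std::mutex> lk(_mutex);
            if (!_cb)
                return false;  // Already completed by the other side.
            cb = std::move(_cb);
            _cb = nullptr;  // Make the emptied state explicit.
        }
        cb(r);  // Run the callback outside the lock.
        return true;
    }

private:
    std::mutex _mutex;
    Callback _cb;
};

// Toy parent baton: timers fire only when fireAllTimers() is called.
class ToyBaton {
public:
    void addTimer(std::function<void()> onFire) {
        _timers.push_back(std::move(onFire));
    }

    void fireAllTimers() {
        auto timers = std::move(_timers);
        _timers.clear();
        for (auto& t : timers)
            t();
    }

private:
    std::vector<std::function<void()>> _timers;
};

// Toy subbaton that tracks outstanding waits so shutdown can end them early.
class ToySubBaton {
public:
    explicit ToySubBaton(ToyBaton& parent) : _parent(parent) {}

    void waitUntil(OneShot::Callback cb) {
        assert(cb);
        if (_isShutdown) {
            cb(WaitResult::kShutdown);
            return;
        }
        auto slot = std::make_shared<OneShot>(std::move(cb));
        _pending.push_back(slot);
        // The parent timer completes the slot, never the caller's callback directly.
        _parent.addTimer([slot] { slot->complete(WaitResult::kTimerFired); });
    }

    void shutdown() {
        _isShutdown = true;
        auto pending = std::move(_pending);
        _pending.clear();
        // Slots whose timer already fired ignore this second completion.
        for (auto& slot : pending)
            slot->complete(WaitResult::kShutdown);
    }

private:
    ToyBaton& _parent;
    bool _isShutdown = false;
    std::vector<std::shared_ptr<OneShot>> _pending;  // Not pruned in this sketch.
};

// Usage: shutdown wins the race, so the later timer firing is a no-op.
std::vector<WaitResult> demo() {
    ToyBaton baton;
    ToySubBaton sub(baton);
    std::vector<WaitResult> seen;
    sub.waitUntil([&](WaitResult r) { seen.push_back(r); });
    sub.shutdown();         // Delivers kShutdown immediately.
    baton.fireAllTimers();  // Timer fires later; the callback is not run again.
    return seen;
}
```

      Note that in this sketch the timer stays registered on the parent baton after shutdown; it simply becomes a no-op when it fires. Actually removing the timer from the main baton would be a separate step, and this sketch does not attempt it.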

      Repro:

      TEST_F(EgressAsioNetworkingBatonTest, WaitUntilSubBaton) {
          EgressSessionWithScopedReactor es(getGlobalServiceContext());
      
          Notification<void> ready;
          auto opCtx = client().makeOperationContext();
          auto baton = opCtx->getBaton();
      
          auto subBaton = baton->makeSubBaton();
          auto fut = subBaton->waitUntil(Date_t::now() + Seconds(3), CancellationToken::uncancelable())
                         .onCompletion([&](Status s) { LOGV2(123123, "in sub baton wait"); });
      
          // Sleep for less than the timer duration.
          LOGV2(123123, "Sleeping for 1s");
          opCtx->sleepFor(Seconds(1));
      
          // *should* cancel waitUntil
          LOGV2(123123, "Shutting down subbaton");
          subBaton.shutdown();
      
          // Sleep long enough to fire timer.
          LOGV2(123123, "Sleeping for 5s");
          opCtx->sleepFor(Seconds(5));
      
          // Detach the main baton.
          LOGV2(123123, "Detaching main baton");
          opCtx.reset();
      }
       

      Output:

      {"t":{"$date":"2025-12-22T22:03:09.122+00:00"},"s":"I",  "c":"TEST",     "id":123123,  "ctx":"main","msg":"Sleeping for 1s"}
      {"t":{"$date":"2025-12-22T22:03:10.122+00:00"},"s":"I",  "c":"TEST",     "id":123123,  "ctx":"main","msg":"Shutting down subbaton"}
      {"t":{"$date":"2025-12-22T22:03:10.122+00:00"},"s":"I",  "c":"TEST",     "id":123123,  "ctx":"main","msg":"Sleeping for 5s"}
      {"t":{"$date":"2025-12-22T22:03:12.122+00:00"},"s":"I",  "c":"TEST",     "id":123123,  "ctx":"main","msg":"in sub baton wait"}
      {"t":{"$date":"2025-12-22T22:03:15.122+00:00"},"s":"I",  "c":"TEST",     "id":123123,  "ctx":"main","msg":"Detaching main baton"}
      

            Assignee:
            Unassigned
            Reporter:
            Patrick Freed