[SERVER-53538] Clarify semantics of when destructor runs for ExecutorFuture chains Created: 29/Dec/20  Updated: 01/Jun/22  Resolved: 02/May/22

Status: Closed
Project: Core Server
Component/s: Internal Code
Affects Version/s: None
Fix Version/s: None

Type: Task Priority: Major - P3
Reporter: Max Hirschhorn Assignee: Matt Diener (Inactive)
Resolution: Done Votes: 0
Labels: servicearch-q1-2022, servicearch-wfbf-day
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Depends
is depended on by SERVER-52942 Evaluate whether we could change the ... Closed
Related
related to SERVER-54119 Destroy SharedStateBase as we unfold ... Closed
is related to SERVER-53108 Fix OperationContext lifetime managem... Closed
is related to SERVER-66126 Clear Future/ExecutorFuture continuat... Open
Backport Requested:
v5.0
Sprint: Service Arch 2021-02-22, Service Arch 2021-04-05, Service Arch 2021-04-19, Service Arch 2021-06-14, Service Arch 2021-06-28, Service Arch 2021-07-12, Service Arch 2021-08-09, Service Arch 2022-03-21, Service Arch 2022-04-04, Service Arch 2022-04-18, Service Arch 2022-05-02, Service Arch 2022-05-16
Participants:
Linked BF Score: 176
Story Points: 4

 Description   

While working on SERVER-53108, I had encountered an invariant failure about ServiceContext::_clients not being empty. After digging into it—and thanks to the numactl --physcpubind=0 trick—I determined that the reference count for the executor::TaskExecutor hadn't been decremented (from 2 to 1) despite there being no variables alive in the C++ code that could refer to it.

amirsaman.memaripour and I weren't sure if this behavior was expected or not; he asked me to file this ticket.

 

Acceptance Criteria:

Conclude and document the semantics for the destructor and recommendations for fixing it. 

Steps to reproduce

numactl --physcpubind=0 build/install/bin/util_test --filter=CheckIfGetWaitsForDestructorToRun

diff --git a/src/mongo/util/future_util_test.cpp b/src/mongo/util/future_util_test.cpp
index e9878d21e8..d35a45ef72 100644
--- a/src/mongo/util/future_util_test.cpp
+++ b/src/mongo/util/future_util_test.cpp
@@ -64,7 +64,7 @@ public:
         return _network;
     }
 
-private:
+protected:
     std::shared_ptr<executor::ThreadPoolTaskExecutor> _executor;
     executor::NetworkInterfaceMock* _network;
 };
@@ -85,6 +85,26 @@ private:
 
 using AsyncTryUntilTest = FutureUtilTest;
 
+// This test case doesn't actually use AsyncTry but AsyncTryUntilTest is a test fixture that I knew
+// had a ThreadPoolTaskExecutor.
+TEST_F(AsyncTryUntilTest, CheckIfGetWaitsForDestructorToRun) {
+    auto currentUseCount = _executor.use_count();
+    auto i = 0;
+    {
+        ExecutorFuture(_executor)
+            .then([&i, executor = _executor] {
+                (void)executor;
+                ++i;
+            })
+            // Uncommenting the following line causes the test to pass.
+            // .then([] {})
+            .get();
+    }
+
+    ASSERT_EQ(i, 1);
+    ASSERT_EQ(_executor.use_count(), currentUseCount);
+}
+
 TEST_F(AsyncTryUntilTest, LoopExecutesOnceWithAlwaysTrueCondition) {
     auto i = 0;
     auto resultFut = AsyncTry([&] { ++i; })



 Comments   
Comment by Matt Diener (Inactive) [ 02/May/22 ]

We are planning on moving to better semantics with SERVER-66126.

Current destructor semantics at a high level:

  Future is already ready when continuation function is called Future is not yet ready when continuation function is called{}
Future In a chain of continuations, the first call to a continuation function will run to completion before trying to set up the next continuation function. The continuation function itself is a temporary variable in the scope of the caller to then(), hence semantics depend on whether continuation chains cascade or whether they are separated by semicolons.  The first continuation function cleans itself up completely and a ready Future is returned to chain continuations from.|The callback function cannot be called because the Future is not ready. This means a not-ready Future is returned. At the time a Promise is emplaced, there’s a high likelihood that the whole chain is created.
In this case, the entire chain runs up to the point of the first non-ready Future-returning continuation before cleaning up any captures. When a Future-returning continuation runs, it’s possible that the chain is broken because the returned Future is not yet ready.
ExecutorFuture{} The callback runs immediately but that callback wraps the user-provided continuation function with some code that schedules it on an Executor.  Ownership of that user-provided callback is moved to the Executor and the destruction semantics are deferred to that Executor’s implementation details.|The callback runs once a Promise emplaces a value. Like in the ready-Future case, the callback itself wraps the user-provided function with some code that schedules it on an Executor.
Destructor ordering semantics are also deferred to the Executor’s implementation details.

Note: these semantics were not verified with all types of continuation functions. It is possible that different semantics apply between functions like `then`, `onError`, `getAsync`, etc.. From this investigation, we concluded that our goal is to move towards guaranteed semantics.

Generated at Thu Feb 08 05:31:12 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.