-
Type: Improvement
-
Resolution: Unresolved
-
Priority: Major - P3
-
Affects Version/s: None
-
Component/s: None
-
Cluster Scalability
-
Cluster Scalability Priorities
-
2
Spans are RAII objects, each of which creates an OTEL span that lasts for the object's lifetime. In resharding, we create these in the bodies of future continuations.
This makes the spans fairly useless. For example, this span is named "ReshardingCoordinator::_awaitAllRecipientsFinishedApplying," which implies that it covers the entire time the coordinator spends waiting for all recipients to finish applying. However, the span goes out of scope as soon as we return from this lambda, which happens as soon as _awaitAllRecipientsFinishedApplying returns a future, either here or here. The span therefore does not cover the actual waiting, because that waiting happens in continuations on the future returned from the top-level lambda, and those continuations are scheduled on the executor after the span has already gone out of scope.
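The lifetime problem described above can be reproduced in a small standalone sketch. Everything here is a hypothetical stand-in rather than the server's actual code: `ToySpan` imitates the RAII span by logging begin/end events, `ToyExecutor` imitates an executor by queuing tasks until `drain()` is called, and `runWithLocalSpan` imitates a continuation body that declares a span and then schedules the real work.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the RAII span type: records begin/end events
// in a shared log instead of reporting to a tracing backend.
class ToySpan {
public:
    ToySpan(std::vector<std::string>& log, std::string name)
        : _log(log), _name(std::move(name)) {
        _log.push_back("begin " + _name);
    }
    ~ToySpan() {
        _log.push_back("end " + _name);
    }
    ToySpan(const ToySpan&) = delete;
    ToySpan& operator=(const ToySpan&) = delete;

private:
    std::vector<std::string>& _log;
    std::string _name;
};

// Hypothetical stand-in for an executor: tasks run later, when drain() is called.
class ToyExecutor {
public:
    void schedule(std::function<void()> task) {
        _tasks.push(std::move(task));
    }
    void drain() {
        while (!_tasks.empty()) {
            auto task = std::move(_tasks.front());
            _tasks.pop();
            task();
        }
    }

private:
    std::queue<std::function<void()>> _tasks;
};

// Mirrors the problematic shape: the span is a local variable in a body that
// only schedules the real work, so the span ends before that work runs.
std::vector<std::string> runWithLocalSpan() {
    std::vector<std::string> log;
    ToyExecutor executor;

    auto body = [&] {
        ToySpan span(log, "awaitAllRecipientsFinishedApplying");
        executor.schedule([&] { log.push_back("waiting happens here"); });
        // Returning from this lambda destroys `span` immediately.
    };

    body();
    executor.drain();  // The scheduled work runs only now.
    assert(log.size() == 3);
    return log;
}
```

In this sketch the log records the span's end event before the "waiting happens here" entry, which is the same ordering the ticket describes for the coordinator.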
We need to reconsider the lifetimes of these spans so they can actually cover the period of time that they claim to cover.
The most obvious way to do this is to make the current span a member variable of the coordinator instance, and either reset or replace that span when it ends and a new span begins.
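A minimal sketch of the member-variable approach, assuming a toy span type and a toy coordinator (`ToySpan`, `ToyCoordinator`, `beginPhase`, and `endPhase` are hypothetical names, not the real resharding classes). The active span is held in a `std::optional` member, so it outlives any single continuation body and ends only when it is reset or replaced.

```cpp
#include <cassert>
#include <functional>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the RAII span type: logs begin/end events.
class ToySpan {
public:
    ToySpan(std::vector<std::string>& log, std::string name)
        : _log(log), _name(std::move(name)) {
        _log.push_back("begin " + _name);
    }
    ~ToySpan() {
        _log.push_back("end " + _name);
    }
    ToySpan(const ToySpan&) = delete;
    ToySpan& operator=(const ToySpan&) = delete;

private:
    std::vector<std::string>& _log;
    std::string _name;
};

// Hypothetical coordinator that owns at most one active span at a time.
class ToyCoordinator {
public:
    explicit ToyCoordinator(std::vector<std::string>& log) : _log(log) {}

    // optional::emplace destroys the current span (ending it) before
    // constructing the new one, so spans never overlap.
    void beginPhase(std::string name) {
        _currentSpan.emplace(_log, std::move(name));
    }

    // Ends the active span without starting another.
    void endPhase() {
        _currentSpan.reset();
    }

    // Stand-in for work done inside a continuation.
    void doWork(const std::string& what) {
        _log.push_back(what);
    }

private:
    std::vector<std::string>& _log;
    std::optional<ToySpan> _currentSpan;
};

std::vector<std::string> runWithMemberSpan() {
    std::vector<std::string> log;
    ToyCoordinator coordinator(log);
    std::vector<std::function<void()>> deferred;  // stand-in for an executor queue

    coordinator.beginPhase("applying");
    deferred.push_back([&] { coordinator.doWork("waiting for recipients"); });
    deferred.push_back([&] { coordinator.beginPhase("committing"); });
    deferred.push_back([&] { coordinator.doWork("commit"); });
    deferred.push_back([&] { coordinator.endPhase(); });

    // The deferred work runs after the scheduling code has returned,
    // yet the span is still alive because the coordinator owns it.
    for (auto& task : deferred) {
        task();
    }
    assert(log.size() == 6);
    return log;
}
```

Here the "applying" span stays open while the deferred waiting runs and ends only when the next phase begins. Holding the span in a `std::optional` is one possible choice; the real span type may need a different holder depending on whether it is movable.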
The limitation of this approach is that we have no good way to represent overlapping spans (for example, a single top-level umbrella span covering child spans for its individual operations, or multiple operations running concurrently, each with its own span). For now, I think it's fine to simplify the reported spans so that only one span is active at a time (e.g. representing the coordinator's phase). We can reevaluate this once we have coroutines and can implement this in a more natural way.
- related to
- SERVER-121114 Fix span declarations in resharding recipient (Backlog)
- SERVER-121115 Fix span declarations in resharding donor (Backlog)