[JAVA-4242] Introduce Request Context support for API-agnostic context propagation Created: 19/Jul/21  Updated: 28/Oct/23  Resolved: 28/Sep/21

Status: Closed
Project: Java Driver
Component/s: Monitoring
Affects Version/s: None
Fix Version/s: 4.4.0

Type: New Feature Priority: Major - P3
Reporter: Mark Paluch Assignee: Jeffrey Yemin
Resolution: Fixed Votes: 0
Labels: external-user, rp-toSched
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Documented
Documentation Changes: Needed

 Description   

A MongoDB driver can be used from various programming models (imperative, where calls remain on the same thread; reactive, where code is thread-agnostic). At the same time, the application using the MongoDB driver may want end-to-end tracing of outgoing remote calls. To construct traces properly, each participating component must be able to contribute its individual activity in the form of a span (one or more spans make up a full trace) and attach that span to the overall trace.

In imperative programming arrangements, context propagation is typically implemented as out-of-band storage, where the context is attached to the thread that is associated with the call (the ThreadLocal pattern). Context information is stored outside the driver, and integration components can associate context data and tracing identifiers through CommandListener.
Since CommandListener accepts CommandEvent objects that are not associated with a context (or context provider), other types of programming models (asynchronous, event-loop, reactive) cannot leverage the ThreadLocal pattern.
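
To illustrate the limitation, here is a minimal sketch using only the JDK. ThreadLocalContextDemo and TRACE_ID are hypothetical names and not part of the driver. A value stored in a ThreadLocal is visible on the calling thread but not on a worker thread, which is the situation an asynchronous or reactive callback would be in.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical demo class; not part of the driver.
class ThreadLocalContextDemo {

    // Out-of-band storage: the trace id is attached to the calling thread.
    static final ThreadLocal<String> TRACE_ID = new ThreadLocal<>();

    // Imperative model: the value is read on the same thread that stored it.
    static String readOnCallingThread(String traceId) {
        TRACE_ID.set(traceId);
        try {
            return TRACE_ID.get();
        } finally {
            TRACE_ID.remove();
        }
    }

    // Asynchronous model: the value is read on a different thread.
    static String readOnWorkerThread(String traceId) {
        ExecutorService worker = Executors.newSingleThreadExecutor();
        TRACE_ID.set(traceId);
        try {
            // The worker thread has its own, empty ThreadLocal slot, so this yields null.
            return CompletableFuture.supplyAsync(TRACE_ID::get, worker).join();
        } finally {
            TRACE_ID.remove();
            worker.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("calling thread: " + readOnCallingThread("trace-1"));
        System.out.println("worker thread:  " + readOnWorkerThread("trace-1"));
    }
}
```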

The following proposal outlines a context capturing and propagation API to enable the driver to capture and propagate request contexts regardless of the underlying programming model. It consists of the following components:

  1. Generic context capturing interfaces to capture the current (inbound) context
  2. Context API
  3. Context consumption API

Context capturing

Calls to the driver can happen within a potentially existing context. Depending on the MongoDB API being called, the context is either stored in a ThreadLocal or passed as a Reactor Context. Since the driver configuration infrastructure (MongoClientSettings) is API-agnostic, context providers should not depend on dependencies used by particular driver implementations. Instead, the driver can detect internally whether a registered context provider can be used to obtain the context:

// marker interface
interface ContextProvider {
 
}
 
interface EnvironmentContextProvider extends ContextProvider {
  RequestContext get(); // typically fetches data from a ThreadLocal
}
 
interface ReactorContextProvider extends ContextProvider {
  RequestContext get(ContextView reactorContext); // extracts a context object from the Reactor ContextView
}
 
 
class MixedContextProvider implements EnvironmentContextProvider, ReactorContextProvider {

  private final Tracer tracer = …;

  public RequestContext get() {
    Span span = tracer.getCurrentSpan();
    RequestContext ctx = …;
    ctx.put("span", span);
    return ctx;
  }

  public RequestContext get(ContextView view) {
    Span span = view.get(Span.class);
    RequestContext ctx = …;
    ctx.put("span", span);
    return ctx;
  }
}
 
MongoClientSettings.builder().contextProvider(new MixedContextProvider(…)).build();
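
The detection step on the driver side could look roughly like the following sketch for the synchronous API. Everything here is an assumption made for illustration: the interfaces are simplified stand-ins for the ones proposed above, and SyncContextResolver, SimpleRequestContext and UnsupportedContextProvider are hypothetical names. The sketch uses an instanceof check and falls back to an empty context with a log message when the provider cannot be used.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.logging.Logger;

// Hypothetical, simplified stand-ins for the interfaces proposed above.
interface ContextProvider { }

interface RequestContext {
    Object get(Object key);
    Object put(Object key, Object value);
}

interface EnvironmentContextProvider extends ContextProvider {
    RequestContext get();
}

// Stand-in for any provider type the synchronous API cannot use (for example a reactive one).
interface UnsupportedContextProvider extends ContextProvider { }

// Hypothetical map-backed context, used here as the "no context" fallback.
class SimpleRequestContext implements RequestContext {
    private final Map<Object, Object> values = new HashMap<>();
    public Object get(Object key) { return values.get(key); }
    public Object put(Object key, Object value) { return values.put(key, value); }
}

// Hypothetical driver-internal helper for the synchronous API.
class SyncContextResolver {
    private static final Logger LOGGER = Logger.getLogger(SyncContextResolver.class.getName());

    static RequestContext resolve(ContextProvider provider) {
        if (provider instanceof EnvironmentContextProvider) {
            // The provider supports the synchronous model: capture the current context.
            return ((EnvironmentContextProvider) provider).get();
        }
        if (provider != null) {
            // Best effort: log and continue without context instead of failing the call.
            LOGGER.warning("Context provider " + provider.getClass().getName()
                    + " is not usable from the synchronous API; continuing without context");
        }
        return new SimpleRequestContext();
    }
}
```

A reactive counterpart would perform the same check against the reactive provider type, so neither code path needs the other API's dependencies on its method signatures.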

Context API

The context API is a mutable, Map-like data structure that allows associating key-value tuples with the current request. In a tracing setup, beginning a request would create a new span, store it in the request context, and upon success, error, retry, the span would be retrieved from the context and completed with the command outcome.

interface RequestContext /* optional: extends Map<Object, Object> */ {

  Object get(Object key);

  // optional: <T> T get(Object key, Class<T> requiredType);

  Object put(Object key, Object o);

  Object remove(Object key);

  // optional: <K, T> T computeIfAbsent(K key, Function<K, T> factory, Class<T> requiredType);

}
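
As a rough illustration of how such a context could be backed, the sketch below implements the core methods plus the optional typed get on top of a map. MapRequestContext is a hypothetical name, and the choice of ConcurrentHashMap is an assumption (events for one request may be delivered on different threads); note that it rejects null keys and values.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Mirrors the proposed interface, including the optional typed accessor.
interface RequestContext {
    Object get(Object key);
    <T> T get(Object key, Class<T> requiredType);
    Object put(Object key, Object value);
    Object remove(Object key);
}

// Hypothetical map-backed implementation; not the driver's own class.
class MapRequestContext implements RequestContext {

    // Assumption: a concurrent map, since the context may be touched from several threads.
    private final Map<Object, Object> values = new ConcurrentHashMap<>();

    public Object get(Object key) {
        return values.get(key);
    }

    public <T> T get(Object key, Class<T> requiredType) {
        // Class.cast passes null through and throws ClassCastException on a type mismatch.
        return requiredType.cast(values.get(key));
    }

    public Object put(Object key, Object value) {
        return values.put(key, value); // returns the previous value, or null
    }

    public Object remove(Object key) {
        return values.remove(key); // returns the removed value, or null
    }
}
```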

Context consumption

The context would be consumed primarily through the command listener API (CommandListener); ideally, RequestContext is provided through CommandEvent. An example implementation would be:

class TracingCommandListener implements CommandListener {

  private final Tracer tracer = …;

  public void commandStarted(CommandStartedEvent event) {
    // uses the optional typed get(key, requiredType) variant of RequestContext
    Span parent = event.getContext().get("span", Span.class);
    Span span = tracer.createSpan(event.getCommandName(), parent);
    span.start();
    event.getContext().put("span", span);
  }

  public void commandSucceeded(CommandSucceededEvent event) {
    Span span = event.getContext().get("span", Span.class);
    span.stop();
    event.getContext().remove("span");
  }
}
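
Because the listener above replaces and then removes the "span" entry, a request context that is reused across several commands (for example while iterating a cursor) would lose its parent span after the first command. One possible alternative, sketched here under assumptions, leaves the parent span in the context and tracks child spans separately, keyed by request id. SketchSpan and ChildSpanTracker are hypothetical, a plain Map stands in for the request context, and the sketch assumes each command event exposes a request id, as the driver's CommandEvent does through getRequestId().

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical minimal span type; a real tracer would supply its own.
class SketchSpan {
    final String name;
    final SketchSpan parent;
    boolean finished;

    SketchSpan(String name, SketchSpan parent) {
        this.name = name;
        this.parent = parent;
    }
}

// Hypothetical listener logic with the driver event types replaced by plain parameters.
class ChildSpanTracker {

    // In-flight child spans, keyed by the request id of the command that started them.
    private final Map<Integer, SketchSpan> inFlight = new ConcurrentHashMap<>();

    // "context" stands in for the request context attached to the command event.
    SketchSpan commandStarted(int requestId, String commandName, Map<Object, Object> context) {
        // The parent is only read; it stays in the context for later commands.
        SketchSpan parent = (SketchSpan) context.get("span");
        SketchSpan child = new SketchSpan(commandName, parent);
        inFlight.put(requestId, child);
        return child;
    }

    // Completes and forgets the child span that belongs to this request id.
    SketchSpan commandSucceeded(int requestId) {
        SketchSpan child = inFlight.remove(requestId);
        if (child != null) {
            child.finished = true;
        }
        return child;
    }
}
```

With this arrangement a find followed by a getMore on the same context would both be parented to the original span, and the context itself is never mutated by the listener.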



 Comments   
Comment by Githook User [ 28/Sep/21 ]

Author:

{'name': 'Jeff Yemin', 'email': 'jeff.yemin@mongodb.com', 'username': 'jyemin'}

Message: Use correct MongoClientSettings in ContextProviderTest

JAVA-4242
Branch: master
https://github.com/mongodb/mongo-java-driver/commit/580519d56d649bbd47a1a1385e04bac2318ed1bc

Comment by Githook User [ 28/Sep/21 ]

Author:

{'name': 'Jeff Yemin', 'email': 'jeff.yemin@mongodb.com', 'username': 'jyemin'}

Message: Introduce RequestContext support for API-agnostic context propagation (#764)

  • Add RequestContext interface
  • Add ContextProvider to MongoClientSettings. Support SynchronousContextProvider for synchronous MongoClient and ReactiveContextProvider for reactive MongoClient
  • Add RequestContext to CommandEvent

JAVA-4242
Branch: master
https://github.com/mongodb/mongo-java-driver/commit/eb5d16db039a50d3bf042eac05b154e1473e7cc2

Comment by Mark Paluch [ 26/Jul/21 ]

Thanks for raising the concern. Typically, there would be a parent span in the context that is used to create its children and children would be held separately. The code above should help to illustrate the calling side and is by no means exhaustive.

Comment by Jeffrey Yemin [ 23/Jul/21 ]

mpaluch@paluch.biz one other issue: the CommandListener implementation in the description seems problematic. In the commandSucceeded method, "span" is removed from the request context, but I think that will cause problems if the Context is used more than once for a request (say to fully iterate a cursor). It seems like it needs to be able to push the parent span back onto the request context, so that it's available for the next event. Maybe span needs to be a Stack<Span> instead?

Comment by Mark Paluch [ 23/Jul/21 ]

That makes sense. I initially had Subscriber in mind but then went ahead with ContextView. Using Subscriber is the more flexible approach, as a custom subscriber might provide custom means to transport contextual data.

Comment by Jeffrey Yemin [ 23/Jul/21 ]

mpaluch@paluch.biz although we now have a dependency on Project Reactor for our implementation of the reactive streams driver, I'd like to avoid taking a dependency on it in our public API, which would be required by:

interface ReactorContextProvider extends ContextProvider {
  RequestContext get(ContextView reactorContext); // extracts a context object from the Reactor ContextView
}

I wonder if it would work to instead define the interface like:

interface ReactorContextProvider extends ContextProvider {
  RequestContext get(Subscriber subscriber); // extracts a context object from the Subscriber
}

Then an implementation of this interface can extract the Context from the Subscriber, e.g. for Project Reactor:

Context context = s instanceof CoreSubscriber ? ((CoreSubscriber) s).currentContext() : Context.empty();
...

and return a RequestContext derived from the Context. Let me know if you think something like this would work. 

Comment by Mark Paluch [ 21/Jul/21 ]

All good, happy to elaborate.

  1. Casting/instanceof: Yes, the driver should try on a best-effort basis to determine whether the ContextProvider can be used when operating on a particular API. The line of thought is to avoid introducing static paths to method signatures that would require the presence of Project Reactor if someone is just using the synchronous driver. I'm not sure what the best approach is when the provided ContextProvider isn't usable (e.g. assume a reactive ContextProvider because the application is primarily used reactively, and later on someone decides to invoke a synchronous driver call). As a start, emitting a log event would probably be better than terminating the call with an exception.
  2. ContextView can be obtained from Reactor's CoreSubscriber, see CoreSubscriber.currentContext. The context is made available upon subscription or through e.g. Mono.deferWithContext(…).
  3. For Spring-based applications, you can anticipate a singleton for Tracer.
  4. The setup/wiring would be provided either through Spring Boot directly or through Spring Observability. Right now, the instrumentation code can be found here.

Let me know whether that helps. Happy to discuss further aspects.

Comment by Jeffrey Yemin [ 20/Jul/21 ]

Hi mpaluch@paluch.biz, thanks for opening this up.   I have a few questions about the proposed API that perhaps you or someone else at Spring could ponder:

  1. Do you anticipate that the driver does an instanceof check on the ContextProvider to determine if, for the synchronous driver, it's an instance of EnvironmentContextProvider, and for the reactive driver, an instance of ReactorContextProvider, and throw an exception otherwise?
  2. For the reactive case, from where does the driver obtain an instance of ContextView with which to call ReactorContextProvider#get?
  3. Is an instance of Tracer basically a singleton in the application?  It seems that way given that the provider/command listener stores an instance of Tracer as a field.
  4. In a Spring application, who is responsible for adding the ContextProvider and CommandListener to MongoClientSettings? Would that be the responsibility of the application, or would Spring Data MongoDB provide some sort of bytecode manipulation that automatically adds them at run time (I think OpenTelemetry and APM vendors take that approach)?

Apologies in advance for any misunderstanding I may have about this proposal.
