Type: Spec Change
Resolution: Won't Do
Priority: Unknown
Fix Version/s: None
Component/s: Load Balancer, SDAM, Serverless Testing
Labels: None

Driver Changes: Not Needed

Summary

While investigating CLOUDP-104364, it was mentioned that the Atlas serverless proxy only includes serviceId in its response to the initial handshake. The proxy identifies an initial handshake by the presence of a client field in the hello command.
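The proxy behavior described above can be modeled with a small sketch. This is a hypothetical simulation, not Atlas code: the function name, the placeholder serviceId value, and the other response fields are assumptions made for illustration.

```python
# Hypothetical model of the proxy behavior described in this ticket.
# It is not Atlas code; names and values are illustrative assumptions.
def proxy_hello_response(command: dict) -> dict:
    """Return a simplified hello response for the given hello command."""
    response = {"ok": 1, "isWritablePrimary": True}
    # The presence of a `client` field marks the command as an initial handshake.
    if "client" in command:
        # Placeholder value; a real serviceId would be assigned by the proxy.
        response["serviceId"] = "hypothetical-service-id"
    return response


handshake_hello = {"hello": 1, "loadBalanced": True, "client": {"driver": {"name": "example"}}}
plain_hello = {"hello": 1, "loadBalanced": True}

print("serviceId" in proxy_hello_response(handshake_hello))  # True
print("serviceId" in proxy_hello_response(plain_hello))      # False
```

A hello sent without client metadata therefore gets a response with no serviceId, which is the case a reconnecting driver needs to avoid.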

This uncovered a subtle bug in libmongoc (~~CDRIVER-4207~~). Errors and timeouts encountered during monitoring typically cause libmongoc to construct a handshake command when reconnecting; errors and timeouts encountered during application usage do not. In single-threaded SDAM, however, the monitoring and application sockets are one and the same, so a reconnect after an application error can send a hello without the client field.

Some suggestions for this ticket:

If not already noted in the SDAM spec, state that single-threaded implementations should treat monitoring and application errors the same for purposes of sending a handshake hello during reconnection.
Consider noting the Atlas serverless proxy behavior in a drivers specification. I realize we don't have a serverless spec (just Serverless Testing), but perhaps this warrants a note in either the SDAM or Handshake specs.
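The first suggestion can be sketched as follows. This is a minimal illustration under stated assumptions, not libmongoc's implementation: the `Node` class, its `needs_handshake` flag, and the method names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical per-server state in a single-threaded driver."""

    # A fresh connection always starts with a handshake.
    needs_handshake: bool = True

    def on_error(self, source: str) -> None:
        # `source` is "monitoring" or "application". Both use the same socket
        # in single-threaded SDAM, so either one forces a new handshake.
        self.needs_handshake = True

    def on_hello_success(self) -> None:
        self.needs_handshake = False

    def build_hello(self) -> dict:
        command = {"hello": 1}
        if self.needs_handshake:
            # Client metadata is what marks this hello as a handshake.
            command["client"] = {"driver": {"name": "example-driver"}}
        return command


node = Node()
print("client" in node.build_hello())  # True: initial handshake
node.on_hello_success()
print("client" in node.build_hello())  # False: routine monitoring check
node.on_error("application")
print("client" in node.build_hello())  # True: reconnect sends a handshake
```

The point of the sketch is that `on_error` ignores where the error came from; the bug described above corresponds to setting the flag only for monitoring errors.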

Motivation

Who is the affected end user?

Single-threaded SDAM implementations.

How does this affect the end user?

Users connected to Atlas Serverless who encounter an application error or timeout might be blocked by client-side load balancer errors, because subsequent hello responses are missing serviceId.
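The client-side error mentioned here can be illustrated with a short sketch. It assumes a driver in load balanced mode rejects any hello response that lacks serviceId; the exception class and function name are hypothetical and not taken from any driver.

```python
class LoadBalancerError(Exception):
    """Hypothetical error raised when a load balanced handshake is invalid."""


def check_hello_response(response: dict, load_balanced: bool) -> None:
    # Assumption: in load balanced mode, a response without serviceId is an error.
    if load_balanced and "serviceId" not in response:
        raise LoadBalancerError("hello response is missing serviceId")


# A response to a hello sent without client metadata, as described above.
response_without_service_id = {"ok": 1}

try:
    check_hello_response(response_without_service_id, load_balanced=True)
except LoadBalancerError as exc:
    print(f"connection rejected: {exc}")
```

If every reconnect after an application error produces such a response, the check fails on each attempt, which matches the blocked application described in this ticket.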

How likely is it that this problem or use case will occur?

This may occur after any application error/timeout in a single-threaded driver connected to Atlas Serverless.

If the problem does occur, what are the consequences and how severe are they?

The application will likely be blocked for the lifetime of the MongoClient.

Is this issue urgent?

The spec change itself is not urgent and does not block libmongoc from fixing its bug.

Is this ticket required by a downstream team?

No.

Is this ticket only for tests?

No, but it does affect serverless testing since our spec tests trigger socket errors via fail points (see: ~~PHPLIB-717~~).

is related to

CDRIVER-4207 mongoc_topology_scanner_node_t.last_failed ignores errors outside of monitoring for singled-threaded SDAM (Closed)
PHPLIB-717 Test Serverless behind a load balancer to prevent test breakage (Closed)

Assignee: Unassigned
Reporter: Jeremy Mikola
Votes: 0
Watchers: 2

Created: Oct 29 2021 05:01:41 PM UTC
Updated: Jan 08 2025 12:24:46 AM UTC
Resolved: Nov 30 2021 07:45:30 PM UTC
