[DRIVERS-2647] Drivers should unpin connections when ending a session Created: 09/Jun/23  Updated: 31/Jan/24

Status: Backlog
Project: Drivers
Component/s: Load Balancer, Transactions
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Matt Dale Assignee: Qingyang Hu
Resolution: Unresolved Votes: 1
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Issue split
split to CDRIVER-4756 Drivers should unpin connections when... Blocked
split to CSHARP-4829 Drivers should unpin connections when... Blocked
split to CXX-2779 Drivers should unpin connections when... Blocked
split to GODRIVER-3034 Drivers should unpin connections when... Blocked
split to MOTOR-1204 Drivers should unpin connections when... Blocked
split to NODE-5722 Drivers should unpin connections when... Blocked
split to PHPLIB-1299 Drivers should unpin connections when... Blocked
split to PYTHON-4020 Drivers should unpin connections when... Blocked
split to RUBY-3343 Drivers should unpin connections when... Blocked
split to RUST-1790 Drivers should unpin connections when... Blocked
split to JAVA-5228 Drivers should unpin connections when... Blocked
Related
is related to GODRIVER-2867 Connection leak caused by connection... Closed
Driver Changes: Needed
Quarter: FY25Q1
Engineering Lead: Jeffrey Yemin
Start date:
Driver Compliance:
Key             Status/Resolution   FixVersion
CDRIVER-4756    Blocked
CXX-2779        Blocked
CSHARP-4829     Blocked
GODRIVER-3034   Blocked
JAVA-5228       Blocked
NODE-5722       Blocked
MOTOR-1204      Blocked
PYTHON-4020     Blocked
PHPLIB-1299     Blocked
RUBY-3343       Blocked
RUST-1790       Blocked

 Description   

Summary

The Transactions and Load Balancer specs together describe the process for "pinning" a connection to a ClientSession when connected to a load balancer, so that all transaction operations are sent to the same mongos. The Load Balancer spec also requires that drivers not use the pinned connection for other transactions concurrently:

Drivers MUST NOT use the same connection for two concurrent transactions run under different sessions from the same client.

which drivers may implement by giving exclusive access to the pinned connection to the ClientSession.

The "When to unpin" section in the Transactions spec enumerates the conditions under which a connection should be unpinned from a ClientSession. However, the list doesn't specify unpinning when a ClientSession is explicitly ended with endSession.

According to the "When to unpin" section, the following sequence of events may never lead to a connection being unpinned from a ClientSession, effectively creating a connection "leak":

  1. Start a ClientSession.
  2. Run startTransaction in the session.
  3. Run insertMany in the transaction.
  4. Run commitTransaction.
  5. Run endSession.
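
The sequence above can be traced with a toy model. This is only a sketch, not any driver's actual code: ToyPool and ToySession are hypothetical names, and the model assumes, for illustration, a driver whose only unpin trigger is aborting a transaction, so neither a successful commit nor endSession releases the pinned connection.

```python
class ToyPool:
    """Hypothetical stand-in for a driver connection pool."""

    def __init__(self, size):
        self.available = [f"conn-{i}" for i in range(size)]

    def check_out(self):
        return self.available.pop()

    def check_in(self, conn):
        self.available.append(conn)


class ToySession:
    """Hypothetical stand-in for a ClientSession in load-balanced mode."""

    def __init__(self, pool):
        self.pool = pool
        self.pinned = None
        self.in_transaction = False

    def start_transaction(self):
        self.in_transaction = True

    def insert_many(self, docs):
        # The first operation in the transaction pins a connection.
        if self.in_transaction and self.pinned is None:
            self.pinned = self.pool.check_out()

    def commit_transaction(self):
        # Assumption: a successful commit is not an unpin trigger.
        self.in_transaction = False

    def abort_transaction(self):
        # Assumption: abort is the only unpin trigger modeled here.
        self.in_transaction = False
        if self.pinned is not None:
            self.pool.check_in(self.pinned)
            self.pinned = None

    def end_session(self):
        # Nothing in this model tells the driver to unpin here.
        pass


pool = ToyPool(size=2)
session = ToySession(pool)           # 1. start a ClientSession
session.start_transaction()          # 2. startTransaction
session.insert_many([{"x": 1}])      # 3. insertMany
session.commit_transaction()         # 4. commitTransaction
session.end_session()                # 5. endSession

leaked = 2 - len(pool.available)
print(leaked)  # prints 1: one connection never returned to the pool
```

In this model the session object is gone after step 5, but the pool still counts its connection as checked out.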

We should amend the "When to unpin" section to explicitly require drivers to unpin connections from a ClientSession when endSession is called.

Motivation

Who is the affected end user?

Users running transactions on load-balanced databases (e.g. Atlas Serverless or dedicated clusters connecting via a VPC private link).

How does this affect the end user?

Connections used to run transactions may never be returned to the connection pool, resulting in a connection leak. If the driver is configured with a max connection pool size (or uses the default), the driver may eventually run out of unpinned connections and hang.
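
As a rough illustration of how the leak turns into a hang, the sketch below models a bounded pool with Python's queue.Queue. The pool size, the helper names, and the short timeout are all assumptions made for the example; a real driver would wait on its own pool checkout timeout (or indefinitely) rather than the 0.1 seconds used here.

```python
import queue

MAX_POOL_SIZE = 3  # hypothetical small limit, chosen for the example

# Fill a bounded pool with placeholder connections.
pool = queue.Queue(maxsize=MAX_POOL_SIZE)
for i in range(MAX_POOL_SIZE):
    pool.put(f"conn-{i}")


def run_leaky_transaction(pool, timeout=0.1):
    """Check out a connection and never return it (models the leak)."""
    return pool.get(timeout=timeout)


def count_transactions_until_stall(pool, attempts):
    """Run leaky transactions until a checkout times out."""
    completed = 0
    for _ in range(attempts):
        try:
            run_leaky_transaction(pool)
        except queue.Empty:
            # No unpinned connection is left; a real driver would hang here.
            break
        completed += 1
    return completed


completed = count_transactions_until_stall(pool, attempts=5)
print(completed)  # prints 3: the fourth checkout finds the pool empty
```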

How likely is it that this problem or use case will occur?

If a driver unpins connections strictly as described in "When to unpin" in the Transactions spec, a connection leak will happen any time a customer creates a session, runs a transaction, and ends the session while connected to a load-balanced database.

Some drivers already unpin connections when ending sessions and are unaffected.

If the problem does occur, what are the consequences and how severe are they?

A driver will either open and then not use many extra connections, or it will hang and stop working.

Is this issue urgent?

The issue in the Go driver (GODRIVER-2867) is moderately urgent because it is the root cause of a mongosync bug that is blocking a customer migration (HELP-46291). Its urgency depends on how many other drivers are impacted by the problem and the rate of adoption of impacted services, like Serverless. Overall, it seems moderately urgent to figure out how many drivers are impacted.

Is this ticket required by a downstream team?

No.

Is this ticket only for tests?

No.

Acceptance Criteria

  • Amend the "When to unpin" section of the Transactions spec to explicitly require drivers to unpin connections from a ClientSession when endSession is called.
  • Add a load balancer spec test that asserts that pinned connections are returned to the connection pool when endSession is called.
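
The second criterion could be checked along the following lines. This is a sketch of the assertion only, assuming a hypothetical toy pool that counts checked-out connections and a hypothetical session whose end_session unpins; an actual spec test would be written in the spec test format and run against a real load-balanced deployment.

```python
class ToyPool:
    """Hypothetical pool that only tracks how many connections are out."""

    def __init__(self):
        self.checked_out = 0

    def check_out(self):
        self.checked_out += 1
        return object()

    def check_in(self, conn):
        self.checked_out -= 1


class ToySession:
    """Hypothetical session implementing the proposed unpin-on-end rule."""

    def __init__(self, pool):
        self.pool = pool
        self.pinned = None

    def run_transaction(self):
        # Pin a connection for the transaction; the commit leaves it pinned.
        if self.pinned is None:
            self.pinned = self.pool.check_out()

    def end_session(self):
        # Proposed behavior: ending the session always unpins.
        if self.pinned is not None:
            self.pool.check_in(self.pinned)
            self.pinned = None


def checked_out_before_and_after_end_session():
    pool = ToyPool()
    session = ToySession(pool)
    session.run_transaction()
    before = pool.checked_out
    session.end_session()
    return before, pool.checked_out


result = checked_out_before_and_after_end_session()
print(result)  # prints (1, 0): pinned during the session, returned after
```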


 Comments   
Comment by George Wangensteen [ 19/Jul/23 ]

I discussed this with max.hirschhorn@mongodb.com and we think it is important that drivers also be required to unpin after commitTransaction succeeds. If an application explicitly mints new sessions to run transactions but does not explicitly end those sessions (relying on, e.g., garbage collection), connections could also be "leaked" (not unpinned) for prolonged periods of time.

Generated at Thu Feb 08 08:26:05 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.