[DRIVERS-2503] ConnectionId returned in heartbeats may be int64 Created: 17/Nov/22  Updated: 19/Apr/23

Status: Implementing
Project: Drivers
Component/s: None
Fix Version/s: None

Type: Bug Priority: Unknown
Reporter: James Kovacs Assignee: James Kovacs
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Issue split
split to RUBY-3206 ConnectionId returned in heartbeats m... Backlog
split to CSHARP-4483 ConnectionId returned in heartbeats m... Closed
split to CXX-2638 ConnectionId returned in heartbeats m... Closed
split to GODRIVER-2737 ConnectionId returned in heartbeats m... Closed
split to MOTOR-1084 ConnectionId returned in heartbeats m... Closed
split to NODE-4971 ConnectionId returned in heartbeats m... Closed
split to PHPC-2220 ConnectionId returned in heartbeats m... Closed
split to PYTHON-3571 ConnectionId returned in heartbeats m... Closed
split to RUST-1571 ConnectionId returned in heartbeats m... Closed
split to JAVA-4846 ConnectionId returned in heartbeats m... Closed
split to CDRIVER-4557 ConnectionId returned in heartbeats m... Closed
Problem/Incident
causes CSHARP-4417 Int overflow in connectionId Closed
Related
is related to CDRIVER-4502 libmongoc expects connectionId in hel... Closed
is related to SERVER-75293 Different return types for the connec... Closed
Driver Changes: Needed
Quarter: FY24Q1
Downstream Changes Summary:

The connectionId in the hello (or legacy hello) response can be an int32, double, or int64. Many drivers assume an int32, which may result in connectionId truncation or connection failure. Drivers should ensure that the server's connectionId (and the client connectionId for consistency) is expressed as a numeric type capable of holding an int64.
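
One way a driver might normalize the three possible wire types into a single int64 is sketched below. This is a minimal illustration, not code from any driver: the function names are hypothetical, and it assumes the driver has already decoded the BSON value into a native `int32_t`, `int64_t`, or `double`.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical helpers (names are illustrative, not from any driver).
// Each accepts one of the numeric types the server may use for connectionId
// and yields an int64, or nullopt if the value cannot be represented.

constexpr std::optional<std::int64_t> connectionIdFromInt32(std::int32_t v) {
    return v;  // widening is always lossless
}

constexpr std::optional<std::int64_t> connectionIdFromInt64(std::int64_t v) {
    return v;
}

constexpr std::optional<std::int64_t> connectionIdFromDouble(double v) {
    // 2^63 is exactly representable as a double. Values outside
    // [-2^63, 2^63) cannot be converted; NaN fails both comparisons.
    constexpr double kTwoTo63 = 9223372036854775808.0;
    if (!(v >= -kTwoTo63 && v < kTwoTo63)) {
        return std::nullopt;
    }
    const auto asInt = static_cast<std::int64_t>(v);
    // Reject doubles that carry a fractional part.
    if (static_cast<double>(asInt) != v) {
        return std::nullopt;
    }
    return asInt;
}
```

Returning an empty optional for an unusable double is a design choice of this sketch; a real driver might prefer to ignore the field or report an error.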

NOTE: If the client and server connectionId fields are part of the driver's public API, you may have to add new int64 connectionId fields and deprecate the existing int32 fields. On the next major version bump, the deprecated int32 fields should be removed.
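
The deprecation path in the note could look roughly like the following. The type and member names are hypothetical and do not correspond to any driver's public API; what the legacy accessor returns for out-of-range values is a per-driver decision, and this sketch simply narrows the value.

```cpp
#include <cstdint>

// Hypothetical public type; names are illustrative only.
struct ConnectionDescription {
    std::int64_t serverConnectionId64 = 0;

    // New accessor: always returns the full 64-bit value.
    constexpr std::int64_t longServerConnectionId() const {
        return serverConnectionId64;
    }

    // Legacy accessor kept until the next major version. Assumption of this
    // sketch: it narrows the value, so large ids are not preserved.
    [[deprecated("use longServerConnectionId(); this value may be truncated")]]
    constexpr std::int32_t serverConnectionId() const {
        return static_cast<std::int32_t>(serverConnectionId64);
    }
};
```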

Case:
Program Manager: Esha Bhargava
Start date:
Driver Compliance:
Key             Status/Resolution   FixVersion
CDRIVER-4557    Fixed               1.24.0
CXX-2638        Works as Designed
CSHARP-4483     Fixed               2.20.0
GODRIVER-2737   Fixed               1.11.7
JAVA-4846       Done                5.0.0
NODE-4971       Duplicate           6.4.0
MOTOR-1084      Duplicate
PYTHON-3571     Works as Designed
PHPC-2220       Fixed               1.16.0
RUBY-3206       Backlog
RUST-1571       Fixed               2.5.0
SWIFT-1692      Won't Do

 Description   

Summary

In the hello response, the server may return connectionId as an int32, a double, or an int64. Many drivers (and our specs) assume that it is always an int32. This can result in connection failures in some drivers (e.g. .NET/C#) or truncation of the connectionId in others (e.g. Java).

Motivation

Who is the affected end user?

End users with long-running clusters and high connection churn.

How does this affect the end user?

It depends on how the driver handles the overflow. Some drivers will throw an exception. Others will truncate the connectionId. In the worst case, the user will be unable to connect to the MongoDB cluster.

Note that some drivers like Node.js and Python aren't affected by this bug because they use arbitrary-width numeric types.
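
The two failure modes described above can be illustrated with a small sketch. It is not taken from any driver: the function names are hypothetical, and the checked variant returns an empty optional where a real driver might throw an exception.

```cpp
#include <cstdint>
#include <limits>
#include <optional>

// Silent truncation: only the low 32 bits survive, so the result no longer
// matches the id the server logs.
constexpr std::int32_t truncatingNarrow(std::int64_t id) {
    return static_cast<std::int32_t>(static_cast<std::uint32_t>(id));
}

// Checked narrowing: refuses values that do not fit in an int32. In a driver
// that raises an error here, the connection attempt itself fails.
constexpr std::optional<std::int32_t> checkedNarrow(std::int64_t id) {
    if (id < std::numeric_limits<std::int32_t>::min() ||
        id > std::numeric_limits<std::int32_t>::max()) {
        return std::nullopt;
    }
    return static_cast<std::int32_t>(id);
}
```

For example, a server connectionId of 5000000000 truncates to 705032704 (5000000000 - 2^32), while the checked variant rejects it.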

How likely is it that this problem or use case will occur?

This issue is relatively infrequent. To overflow an int32 connectionId, the server would have to sustain a churn of 100 connections per second for roughly 8 months (2^31 / 100 is about 21.5 million seconds, or about 248 days). If the churn rate were higher, the time to overflow would be proportionately shorter. This is mitigated by the fact that the server's connectionId counter is reset with every server restart.
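
The estimate above can be checked with a short calculation. This sketch assumes the counter starts near zero after a restart and that the churn rate is constant; the function name is illustrative.

```cpp
#include <cstdint>
#include <limits>

constexpr double kSecondsPerDay = 86400.0;

// Days until a counter that starts at zero passes INT32_MAX, given a steady
// number of new connections per second.
constexpr double daysToOverflowInt32(double connectionsPerSecond) {
    constexpr double limit =
        static_cast<double>(std::numeric_limits<std::int32_t>::max());
    return limit / connectionsPerSecond / kSecondsPerDay;
}
```

At 100 connections per second this gives a little under 249 days; at 1000 per second it drops to under 25 days.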

If the problem does occur, what are the consequences and how severe are they?

Some drivers (like .NET/C#) won't be able to connect until the affected server is restarted. Others, like Java, will simply truncate the connectionId, making it difficult or impossible to correlate client and server logs. Others, like Python and Node.js, are unaffected by this bug.

Is this issue urgent?

Given that restarting the affected server resolves the issue for multiple months even at high connection churn rates, this issue does not appear to be urgent at this time.

Is this ticket required by a downstream team?

No.

Is this ticket only for tests?

No.
Does this ticket have any functional impact, or is it just test improvements?



 Comments   
Comment by Githook User [ 19/Apr/23 ]

Author: James Kovacs <jkovacs@post.harvard.edu> (JamesKovacs)

Message: DRIVERS-2503: Require connectionId and serverConnectionId to be int64. (#1392)
Branch: master
https://github.com/mongodb/specifications/commit/1f272b30a2e9714578b71a02b2836d5a0bd853f8

Comment by James Kovacs [ 29/Mar/23 ]

I tracked down the code for appendNumber(StringData fieldName, long long llNumber). In MongoDB 4.4 and earlier, the connectionId (a 64-bit integer) can be serialized as an int32, double, or int64 depending on its magnitude: int32 below 2^30, double below 2^40, and int64 otherwise:

    BSONObjBuilder& appendNumber(StringData fieldName, long long llNumber) {
        static const long long maxInt = (1LL << 30);
        static const long long minInt = -maxInt;
        static const long long maxDouble = (1LL << 40);
        static const long long minDouble = -maxDouble;
 
        if (minInt < llNumber && llNumber < maxInt) {
            append(fieldName, static_cast<int>(llNumber));
        } else if (minDouble < llNumber && llNumber < maxDouble) {
            append(fieldName, static_cast<double>(llNumber));
        } else {
            append(fieldName, llNumber);
        }
 
        return *this;
    }

https://github.com/mongodb/mongo/blob/v4.4/src/mongo/bson/bsonobjbuilder.h#L317-L332

In MongoDB 5.0, double was removed as a possible type:

    Derived& appendNumber(StringData fieldName, long long llNumber) {
        static const long long maxInt = std::numeric_limits<int>::max();
        static const long long minInt = std::numeric_limits<int>::min();
 
        if (minInt <= llNumber && llNumber <= maxInt) {
            append(fieldName, static_cast<int>(llNumber));
        } else {
            append(fieldName, llNumber);
        }
 
        return static_cast<Derived&>(*this);
    }

https://github.com/mongodb/mongo/blob/v5.0/src/mongo/bson/bsonobjbuilder.h#L268-L279
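
To make the thresholds in the two snippets easier to compare, here is a standalone sketch that mirrors their branch structure and reports which BSON type would be chosen. The enum and function names are hypothetical and not part of the server code.

```cpp
#include <limits>

enum class BsonNumberType { Int32, Double, Int64 };

// Mirrors the branches of the 4.4 appendNumber quoted above.
constexpr BsonNumberType typeChosenBy44(long long n) {
    constexpr long long maxInt = 1LL << 30;
    constexpr long long maxDouble = 1LL << 40;
    if (-maxInt < n && n < maxInt) {
        return BsonNumberType::Int32;
    }
    if (-maxDouble < n && n < maxDouble) {
        return BsonNumberType::Double;
    }
    return BsonNumberType::Int64;
}

// Mirrors the branches of the 5.0 appendNumber quoted above.
constexpr BsonNumberType typeChosenBy50(long long n) {
    if (std::numeric_limits<int>::min() <= n &&
        n <= std::numeric_limits<int>::max()) {
        return BsonNumberType::Int32;
    }
    return BsonNumberType::Int64;
}
```

Under the 4.4 logic the switch to double happens at 2^30 (1073741824), which is before an int32 would actually overflow, so a driver that only accepts int32 could already be affected at that point.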

Given that current drivers support MongoDB 3.6 and later, I will clarify that the connectionId returned from the server might be an int32, double, or int64.

Comment by Kevin Albertson [ 27/Mar/23 ]

SERVER-75293 reports a case where connectionId is returned as Double

Comment by James Kovacs [ 17/Nov/22 ]

The CMAP spec specifies the connectionId as Number without any indication of bitness. Command Logging and Monitoring does indicate that the serverConnectionId is an Int32. This should be corrected to Int64. Other specs should be audited to ensure that they correctly indicate that connectionId is an Int64.

Generated at Thu Feb 08 08:25:45 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.