[SERVER-75293] Different return types for the connectionId Created: 26/Mar/23  Updated: 21/Apr/23  Resolved: 21/Apr/23

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: 4.4.18
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Vinicius Grippa Assignee: Dianna Hohensee (Inactive)
Resolution: Won't Do Votes: 1
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Problem/Incident
causes CDRIVER-4593 C Driver fails to validate double typ... Closed
Related
related to DRIVERS-2503 ConnectionId returned in heartbeats m... Implementing
related to SERVER-43762 tighten the overload set for BSONObjB... Closed
Assigned Teams:
Storage Execution
Operating System: ALL
Sprint: Execution Team 2023-05-01
Participants:

 Description   

This issue began affecting several drivers once connectionId was added to the response of the hello command.

The umbrella ticket for the driver issue is:
https://jira.mongodb.org/browse/DRIVERS-2503

The problem with connectionId is that its returned type can change, which causes exceptions on the driver side. One example is the following ticket:
https://jira.mongodb.org/browse/CDRIVER-4593
MongoDB can return connectionId as an int, int64, or double. In the ticket above, a tcpdump capture shows the double type.
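
To illustrate what tolerating all three types could look like on the driver side, here is a minimal C++17 sketch. It is not taken from any driver: BsonNumber and normalizeConnectionId are hypothetical names, and the std::variant merely stands in for whatever decoded BSON value type a real driver uses.

```cpp
#include <cmath>
#include <cstdint>
#include <optional>
#include <variant>

// Hypothetical stand-in for a decoded BSON numeric value (int32, int64, or double).
using BsonNumber = std::variant<std::int32_t, std::int64_t, double>;

// Hypothetical helper: normalizes connectionId to a 64-bit integer regardless
// of which numeric type the server chose. Returns std::nullopt for a double
// that is not a finite whole number within the exactly representable range.
std::optional<std::int64_t> normalizeConnectionId(const BsonNumber& value) {
    if (const auto* asInt32 = std::get_if<std::int32_t>(&value)) {
        return *asInt32;
    }
    if (const auto* asInt64 = std::get_if<std::int64_t>(&value)) {
        return *asInt64;
    }
    const double asDouble = std::get<double>(value);
    constexpr double kMaxExactInteger = 9007199254740992.0;  // 2^53
    if (!std::isfinite(asDouble) || asDouble != std::trunc(asDouble) ||
        std::fabs(asDouble) > kMaxExactInteger) {
        return std::nullopt;
    }
    return static_cast<std::int64_t>(asDouble);
}
```

With this approach, a connectionId that arrives as the double 1073741824.0 is treated the same as the integer 1073741824, while a fractional value such as 1.5 is rejected.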

Looking at the server code, we can see that connectionId is appended with the appendNumber function:
https://github.com/mongodb/mongo/blob/r4.4.14/src/mongo/s/commands/cluster_is_master_cmd.cpp#L189

        result.appendNumber("maxBsonObjectSize", BSONObjMaxUserSize);
        result.appendNumber("maxMessageSizeBytes", MaxMessageSizeBytes);
        result.appendNumber("maxWriteBatchSize", write_ops::kMaxWriteBatchSize);
        result.appendDate("localTime", jsTime());
        result.append("logicalSessionTimeoutMinutes", localLogicalSessionTimeoutMinutes);
        result.appendNumber("connectionId", opCtx->getClient()->getConnectionId());

If we look at the definition of appendNumber:
https://github.com/mongodb/mongo/blob/86b88644407bc94cf4358434d221703d234d75c7/db/jsobj.h

We can see:

        /**
         * appendNumber is a series of method for appending the smallest sensible type
         * mostly for JS
         */
        void appendNumber( const string& fieldName , int n ){
            append( fieldName.c_str() , n );
        }
 
        void appendNumber( const string& fieldName , double d ){
            append( fieldName.c_str() , d );
        }
 
        void appendNumber( const string& fieldName , long long l ){
            static long long maxInt = (int)pow( 2.0 , 30.0 );
            static long long maxDouble = (long long)pow( 2.0 , 40.0 );
 
            if ( l < maxInt )
                append( fieldName.c_str() , (int)l );
            else if ( l < maxDouble )
                append( fieldName.c_str() , (double)l );
            else
                append( fieldName.c_str() , l );
        }
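
To make the thresholds concrete, the following sketch mirrors only the branch logic of the long long overload quoted above. NumberType and legacyAppendNumberType are hypothetical names introduced for illustration; they are not part of the server code, and the sketch says nothing about other server versions.

```cpp
#include <cstdint>

// Hypothetical enum naming the BSON numeric type that ends up on the wire.
enum class NumberType { Int32, Double, Int64 };

// Hypothetical helper: reports which type the quoted
// appendNumber(const string&, long long) overload would append for a value.
NumberType legacyAppendNumberType(long long value) {
    constexpr long long kMaxInt = 1LL << 30;     // same value as (int)pow(2.0, 30.0)
    constexpr long long kMaxDouble = 1LL << 40;  // same value as (long long)pow(2.0, 40.0)

    if (value < kMaxInt) {
        return NumberType::Int32;   // appended as int
    }
    if (value < kMaxDouble) {
        return NumberType::Double;  // appended as double
    }
    return NumberType::Int64;       // appended as long long
}
```

Under these assumptions, a connectionId below 1073741824 (2^30) is sent as an int, one from 2^30 up to but not including 2^40 is sent as a double, and anything larger is sent as a long long.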
        

So MongoDB needs to decide where this issue should be fixed. If this is expected behavior, then the following driver fix needs to be accepted:
https://jira.mongodb.org/browse/CDRIVER-4593

Otherwise, if it is a server issue, the server code needs to return connectionId as a single type to avoid these driver exceptions.

From what I see, this bug gives the impression that no one expected Mongo to run long enough without a restart for connectionId to reach 2^30, the point at which the returned type switches from int to double.



 Comments   
Comment by Dianna Hohensee (Inactive) [ 21/Apr/23 ]

In master, cluster_hello_cmd.cpp uses appendNumber, which, as Chris Kelly noted, can only append an int or long long as of SERVER-43762 in v5.0; a double is no longer possible in v5.0+.

It looks like the Driver issue in DRIVERS-2503 is being fixed independently of any server changes, since Drivers must support all versions. It won't be helpful to Drivers to backport SERVER-43762 to v4.4, and the problem is fixed going forward. Closing this ticket as Won't Do.

Comment by Chris Kelly [ 29/Mar/23 ]

Thanks for your report!

I think this may have been fixed in SERVER-43762, since it appears the double was removed there. However, that applies only to 5.0+. I'll double check with one of our teams that may be familiar with this part of the code.
