"hello" failpoint in CSOT command-execution UST is premature

XMLWordPrintableJSON

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Unknown
    • None
    • Affects Version/s: None
    • Component/s: None
    • None
    • Go Drivers
    • Hide

      1. What would you like to communicate to the user about this feature?
      2. Would you like the user to see examples of the syntax and/or executable code and its output?
      3. Which versions of the driver/connector does this apply to?

      Show
      1. What would you like to communicate to the user about this feature? 2. Would you like the user to see examples of the syntax and/or executable code and its output? 3. Which versions of the driver/connector does this apply to?
    • None
    • None
    • None
    • None
    • None
    • None

      Detailed steps to reproduce the problem?

      Running the following unified spec test in the Go Driver will eventually result in an error where the maxTimeMS on a command is 1 millisecond higher than expected:

      go test -run TestUnifiedSpec/client-side-operations-timeout/command-execution.json/maxTimeMS_value_in_the_command_is_less_than_timeoutMS -v -count=100 -failfast
      

      error:

          unified_spec_runner.go:146: events verification failed at index 0: event comparison failed at index 1: error comparing command documents: comparison error at key "maxTimeMS": error doing special matching assertion: expected numeric value 451 to be less than or equal 450; all events found for client:
      
              Started Events
      
              [localhost:27017[-74]] {"insert": "timeoutColl","ordered": true,"writeConcern": {"w": {"$numberInt":"1"}},"lsid": {"id": {"$binary":{"base64":"Gtya3oUDTuKmRdyc0bU+vQ==","subType":"04"}}},"txnNumber": {"$numberLong":"1"},"$clusterTime": {"clusterTime": {"$timestamp":{"t":1706248519,"i":6}},"signature": {"hash": {"$binary":{"base64":"AAAAAAAAAAAAAAAAAAAAAAAAAAA=","subType":"00"}},"keyId": {"$numberLong":"0"}}},"maxTimeMS": {"$numberLong":"99842"},"$db": "test","documents": [{"_id": {"$numberInt":"1"}}]}
              [localhost:27017[-74]] {"insert": "timeoutColl","ordered": true,"writeConcern": {"w": {"$numberInt":"1"}},"lsid": {"id": {"$binary":{"base64":"Gtya3oUDTuKmRdyc0bU+vQ==","subType":"04"}}},"txnNumber": {"$numberLong":"2"},"$clusterTime": {"clusterTime": {"$timestamp":{"t":1706248521,"i":1}},"signature": {"hash": {"$binary":{"base64":"AAAAAAAAAAAAAAAAAAAAAAAAAAA=","subType":"00"}},"keyId": {"$numberLong":"0"}}},"maxTimeMS": {"$numberLong":"451"},"$db": "test","documents": [{"_id": {"$numberInt":"2"}}]}
      
              Succeeded Events
      
      
              Failed Events
      
      
              Pool Events
      
      
      --- FAIL: TestUnifiedSpec (2.36s)
          --- FAIL: TestUnifiedSpec/client-side-operations-timeout (2.35s)
              --- FAIL: TestUnifiedSpec/client-side-operations-timeout/command-execution.json (2.35s)
                  --- FAIL: TestUnifiedSpec/client-side-operations-timeout/command-execution.json/maxTimeMS_value_in_the_command_is_less_than_timeoutMS (2.35s)
      FAIL
      exit status 1
      FAIL    go.mongodb.org/mongo-driver/internal/integration/unified        10.369s
      

      This occurs because the way we calculate the maxTimeMS in the Go Driver (as per the specifications) is dependent on the min RTT sample:

      int64((remainingTimeout - rttMin + time.Millisecond - 1) / time.Millisecond)
      

      And for some reason, this value (rttMin) is less than 50 despite the fact that the test is specifically designed to block for 50 milliseconds: https://github.com/mongodb/specifications/blob/master/source/client-side-operations-timeout/tests/command-execution.yml#L45

      The value for the hello RTT sample is calculated here. If you slightly alter this code to the following and run the above test, you will see that occasionally the sample is less than 50:

      c.helloRTT = time.Since(handshakeStartTime)
      
      if c.helloRTT >= 49*time.Millisecond && c.helloRTT < 50*time.Millisecond {
          fmt.Println("rtt: ", c.helloRTT)
      }
      

      This is pretty inexplicable. It is not likely that there is a server bug. You can test this in mongosh using the following:

      // Set up the failpoint configuration
      db.adminCommand({
          configureFailPoint: 'failCommand',
          mode: 'alwaysOn',
          data: {
              failCommands: ['hello', 'isMaster'],
              blockConnection: true,
              blockTimeMS: 50 // Reduced to 50 milliseconds
          }
      });
      
      // Function to measure the time for db.runCommand({ hello: 1 })
      function measureTimeForHello() {
          // Record the start time
          const startTime = new Date();
      
          // Execute db.runCommand
          db.runCommand({ hello: 1 });
      
          // Record the end time
          const endTime = new Date();
      
          // Calculate the time difference
          const timeTaken = endTime - startTime;
      
          // Return the time in milliseconds
          return timeTaken;
      }
      
      // Example: Iterate over the function and print only if timeResult is < 55 ms (10,000 iterations)
      for (let i = 0; i < 10000; i++) {
          const timeResult = measureTimeForHello();
      
          if (timeResult < 50) {
              print(`Iteration ${i + 1}: Time taken: ${timeResult} milliseconds`);
          }
      }
      

      Furthermore you can test that the Go duration logic is sound in this playground: https://go.dev/play/p/NN3LuHPVeSR

      Definition of done: what must be done to consider the task complete?

      Ensure that the UST in question passes.

      The exact Go version used, with patch level:

      go version go1.21.4 darwin/arm64

      The exact version of the Go driver used:

      tested and confirmed on 5.0.23 & latest

      Describe how MongoDB is set up. Local vs Hosted, version, topology, load balanced, etc.

      Using mongo orchestration:

      export MONGO_ORCHESTRATION_HOME=/Users/preston.vasquez/Developer/mongo-orchestration-home 
      export STPATH=/Users/preston.vasquez/Developer/mongo-go-driver
      
      mongo-orchestration stop --pidfile=/Users/preston.vasquez/Developer/mongo-orchestration-home/server.pid
      
      ${DRIVERS_TOOLS}/.evergreen/start-orchestration.sh /Users/preston.vasquez/Developer/mongo-orchestration-home
      
      echo "sending security token configuraiton to the server..."
      
      curl -X POST http://localhost:8889/v1/replica_sets \
      	-H 'Content-Type: application/json' \
      	-d @${DRIVERS_TOOLS}/.evergreen/orchestration/configs/replica_sets/basic.json
      
      {"releases":{"default":"/Users/preston.vasquez/.local/m/versions/5.0.23/bin"}}
      

      The operating system and version (e.g. Windows 7, OSX 10.8, ...)

      macOS 14.2.1 (23C71)

      Security Vulnerabilities

      NA

            Assignee:
            Unassigned
            Reporter:
            Preston Vasquez
            Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

              Created:
              Updated: