DRIVERS-2798

Gossiping the cluster time from monitoring connections can result in loss of availability

      Key            Status/Resolution   FixVersion
      CDRIVER-5643   Blocked
      CXX-3079       Blocked
      CSHARP-5204    Blocked
      GODRIVER-3288  Blocked
      JAVA-5546      Blocked
      NODE-6293      Blocked
      MOTOR-1347     Blocked
      PYTHON-4579    Blocked
      PHPLIB-1495    Blocked
      RUBY-3523      Blocked
      RUST-2005      Blocked

      Summary

      In unusual situations, gossiping the cluster time received on monitoring connections results in complete loss of availability and requires an application restart. The problem was traced to a temporary state during which the driver attempts to connect to a member of the wrong replica set running on the same pod. Because cluster times from different deployments are not compatible, all operations then fail until the application is restarted.
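
      The failure mode described above can be illustrated with a toy model. This is only a sketch under stated assumptions: the names `ClusterTime`, `Client`, `Server` and `simulate` are hypothetical, cluster times are plain integers, and the `deployment` tag is a stand-in for whatever makes cluster times from different deployments incompatible. It is not driver or server code.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ClusterTime:
    deployment: str  # hypothetical tag: which deployment issued this time
    timestamp: int   # simplified logical time


class Client:
    """Toy driver state: remembers the highest cluster time it has seen."""

    def __init__(self) -> None:
        self.cluster_time: Optional[ClusterTime] = None

    def gossip(self, received: ClusterTime) -> None:
        # Only a strictly greater timestamp replaces the stored value.
        if self.cluster_time is None or received.timestamp > self.cluster_time.timestamp:
            self.cluster_time = received


class Server:
    def __init__(self, deployment: str, timestamp: int) -> None:
        self.deployment = deployment
        self.timestamp = timestamp

    def hello(self) -> ClusterTime:
        # Reply seen on a monitoring connection.
        return ClusterTime(self.deployment, self.timestamp)

    def run_command(self, gossiped: Optional[ClusterTime]) -> bool:
        # Assumption: a cluster time from another deployment is rejected.
        return gossiped is None or gossiped.deployment == self.deployment


def simulate() -> List[bool]:
    intended = Server("rs-A", timestamp=100)
    wrong = Server("rs-B", timestamp=500)
    results = []

    client = Client()
    client.gossip(intended.hello())
    results.append(intended.run_command(client.cluster_time))

    # Temporary state: a monitor reaches a member of the wrong replica set.
    client.gossip(wrong.hello())
    # Monitoring recovers, but the lower timestamp never replaces the stored one.
    client.gossip(intended.hello())
    results.append(intended.run_command(client.cluster_time))

    # A restart starts from an empty cluster time.
    restarted = Client()
    restarted.gossip(intended.hello())
    results.append(intended.run_command(restarted.cluster_time))
    return results
```

      In this model `simulate()` returns `[True, False, True]`: the operation succeeds before the bad gossip, fails after it even though monitoring has recovered, and succeeds again only with fresh client state.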

      Motivation

      Who is the affected end user?

      We have only one report of this, in JAVA-5256. Please see that ticket for details, as they are quite involved.

      How does this affect the end user?

      Availability is completely compromised and an application restart is required.

      How likely is it that this problem or use case will occur?

      It's certainly unusual, as we have not heard other reports of this from people using our Kubernetes operator. On the other hand, the fix is likely simple for most drivers, though testing is an issue (there are probably no tests of the existing behavior).

      If the problem does occur, what are the consequences and how severe are they?

      Complete loss of availability to the desired cluster.

      Is this issue urgent?

      The user has no simple workaround, although a workaround is possible.

      Is this ticket required by a downstream team?

      No

      Is this ticket only for tests?

      No

      Acceptance Criteria

      The requirement is for a clarification to the sessions specification, stating that cluster time gossiping should be limited to pooled connections and should not include monitoring connections. It is unclear, though, how a test could be written. In a POC in the Java driver, this was achieved by a simple design change that made it impossible to gossip the cluster time for monitoring connections, but a future design change could reverse that and re-introduce the issue.
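
      One way to make gossiping from monitoring connections structurally impossible is to give only pooled connections a reference to the shared clock. The sketch below illustrates that idea and is not the Java POC. All names (`ClusterClock`, `MonitoringConnection`, `PooledConnection`, `fake_server`) are hypothetical, the transport is a plain callable, and `clusterTime` is an integer rather than a real timestamp.

```python
from typing import Any, Callable, Dict, Optional

Document = Dict[str, Any]
Transport = Callable[[Document], Document]


class ClusterClock:
    """Shared holder for the highest $clusterTime seen so far."""

    def __init__(self) -> None:
        self._current: Optional[Document] = None

    @property
    def current(self) -> Optional[Document]:
        return self._current

    def advance(self, reply: Document) -> None:
        received = reply.get("$clusterTime")
        if received is None:
            return
        if self._current is None or received["clusterTime"] > self._current["clusterTime"]:
            self._current = received


class MonitoringConnection:
    """Holds no reference to the clock, so it cannot gossip in either direction."""

    def __init__(self, send: Transport) -> None:
        self._send = send

    def hello(self) -> Document:
        return self._send({"hello": 1})


class PooledConnection:
    """Sends the current cluster time and advances the clock from each reply."""

    def __init__(self, send: Transport, clock: ClusterClock) -> None:
        self._send = send
        self._clock = clock

    def command(self, cmd: Document) -> Document:
        if self._clock.current is not None:
            cmd = {**cmd, "$clusterTime": self._clock.current}
        reply = self._send(cmd)
        self._clock.advance(reply)
        return reply


def fake_server(timestamp: int) -> Transport:
    """Hypothetical transport that always replies with the given cluster time."""

    def send(cmd: Document) -> Document:
        return {"ok": 1, "$clusterTime": {"clusterTime": timestamp}}

    return send
```

      With this shape, a `hello` reply on a `MonitoringConnection` cannot change the clock, whatever cluster time it carries, while a command on a `PooledConnection` still advances it. A regression would require deliberately passing the clock to the monitoring class, which is easier to catch in review than a behavioral change.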

      Additional Notes

      Gossiping of cluster time has been a bit of a mystery to many driver engineers, as the specification contains no rationale for it. Recent discussions with server engineers revealed the following justification:

      • In a sharded cluster, each shard has an independent, monotonically increasing logical clock.
      • Every write on the shard includes the current logical clock time.
      • The gossiping pushes the logical clock forward to just past the gossiped time.
      • This means that when a client thread writes to shard A and then writes to shard B, the second write has a later time than the first write.
      • This in turn means that the first write will precede the second write in operations that create a total ordering of writes. A change stream is the primary example.
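
      The steps above can be sketched with a toy logical clock. This is a simplified model for illustration only: `ShardClock` and `write_a_then_b` are hypothetical names, times are integers, and the real server clock is more involved.

```python
from typing import Optional, Tuple


class ShardClock:
    """Toy per-shard logical clock that only moves forward."""

    def __init__(self, start: int = 0) -> None:
        self.time = start

    def write(self, gossiped: Optional[int] = None) -> int:
        # Gossip pushes the clock forward to at least the gossiped time.
        if gossiped is not None and gossiped > self.time:
            self.time = gossiped
        # The write is then stamped just past the current time.
        self.time += 1
        return self.time


def write_a_then_b(gossip: bool) -> Tuple[int, int]:
    shard_a = ShardClock(start=100)
    shard_b = ShardClock(start=10)  # shard B's clock lags behind shard A's
    first = shard_a.write()
    second = shard_b.write(gossiped=first if gossip else None)
    return first, second
```

      With gossiping, `write_a_then_b(True)` returns `(101, 102)`, so the write to shard B is ordered after the write to shard A. Without it, `write_a_then_b(False)` returns `(101, 11)`, and the second write would sort before the first.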

      Since monitoring connections are never used for writes, there is no benefit to gossiping cluster times from those connections.

            Assignee: Shane Harvey (shane.harvey@mongodb.com)
            Reporter: Jeffrey Yemin (jeff.yemin@mongodb.com)
            KeAna Moutra