[SERVER-81068] Support feature discovery in drivers when connected to router endpoint
Created: 14/Sep/23
Updated: 06/Feb/24

Status: Open
Project: Core Server
Component/s: Sharding
Affects Version/s: None
Fix Version/s: None

Type: New Feature
Priority: Major - P3
Reporter: Max Hirschhorn
Assignee: Antonio Fuschetto
Resolution: Unresolved
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
is related to SERVER-80771 Implement cluster-wide maxWireVersion... Closed
Assigned Teams:
Sharding EMEA
Sprint: Sharding EMEA 2023-10-02, Sharding EMEA 2023-10-16, Sharding EMEA 2023-10-30, CAR Team 2023-11-13, CAR Team 2023-11-27, CAR Team 2023-12-11, CAR Team 2023-12-25, CAR Team 2024-01-08, CAR Team 2024-01-22, CAR Team 2024-02-05, CAR Team 2024-02-19
Participants:

 Description   
Motivation

Drivers have relied on checking the maxWireVersion reported in the server's hello response to determine whether a particular feature may be used. In most situations, new features are tied to new application code changes. However, some automatic driver behaviors have relied on server behavior without any application code changes being made:

  • Attaching logical session IDs (lsids) to every request. (MongoDB 3.4)
  • Retryable writes enabled by default in logical sessions. (MongoDB 4.2)
  • $out and $merge can target a secondary. (MongoDB 4.4)

The upgrade procedure upgrades the router mongos processes only after all of the shard and config server mongod processes have been upgraded. The setFeatureCompatibilityVersion command is run only after all the binaries have been upgraded. However, mongos always reports maxWireVersion == LATEST_WIRE_VERSION, independently of the cluster's feature compatibility version. As a result, when a driver uses maxWireVersion to decide whether a new feature or behavior can be used, the server may still return an error because the feature compatibility version has not been fully upgraded yet.
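The mismatch described above can be illustrated with a minimal sketch. It is not driver code: the function name, the feature threshold, and the wire version numbers are hypothetical and chosen only for illustration.

```python
# Hypothetical minimum wire version required by some automatic driver behavior.
FEATURE_MIN_WIRE_VERSION = 21


def driver_enables_feature(hello_response: dict, min_wire_version: int) -> bool:
    """Driver-side gate: trust the maxWireVersion from the hello response."""
    return hello_response.get("maxWireVersion", 0) >= min_wire_version


# An upgraded mongos reports the latest wire version its binary supports
# (illustrative value), regardless of the cluster's FCV.
mongos_hello = {"msg": "isdbgrid", "maxWireVersion": 21}

# Illustrative wire version corresponding to the cluster's not-yet-upgraded FCV.
cluster_fcv_wire_version = 17

# The driver turns the behavior on based on the hello response alone...
enabled = driver_enables_feature(mongos_hello, FEATURE_MIN_WIRE_VERSION)
# ...but the cluster, still on the older FCV, may reject the request.
server_accepts = cluster_fcv_wire_version >= FEATURE_MIN_WIRE_VERSION

print(enabled, server_accepts)
```

In this sketch the driver enables the behavior while the cluster would not accept it, which is the kind of error the ticket is concerned with.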

Historically, clusters were assumed to spend only a short period of time in the binVersion upgraded and FCV last-stable state, so any application errors which occur against sharded clusters would be transient. With the new downgrade policy, this assumption is no longer valid, and we can reasonably expect clusters to spend 1-2 weeks in that state.

Edit: Note that the same issue, maxWireVersion not depending even in part on the cluster's feature compatibility version, also affects replica sets. However, given the direction for all clusters to be behind the router layer, supporting feature discovery in drivers for current replica sets isn't needed.

Proposal

We should introduce a mechanism for router processes (whether dedicated mongos or embedded within mongod) to report a maxWireVersion which reflects the cluster's feature compatibility version.
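The ticket does not specify a mechanism. One possible shape, sketched here under stated assumptions, is for the router to cap the reported value at the wire version that corresponds to the cluster's feature compatibility version. The mapping table, the constant value, and the function name are hypothetical and shown only for illustration.

```python
# Illustrative value for the latest wire version the router binary supports.
LATEST_WIRE_VERSION = 21

# Hypothetical mapping from an FCV string to the wire version it corresponds to.
FCV_TO_WIRE_VERSION = {"6.0": 17, "7.0": 21}


def reported_max_wire_version(cluster_fcv: str) -> int:
    """Return the maxWireVersion a router could report for a given cluster FCV.

    The value never exceeds what the binary supports and never exceeds
    what the cluster's FCV allows.
    """
    fcv_wire_version = FCV_TO_WIRE_VERSION[cluster_fcv]
    return min(LATEST_WIRE_VERSION, fcv_wire_version)


# Binaries upgraded, FCV not yet upgraded: report the older wire version.
print(reported_max_wire_version("6.0"))
# After setFeatureCompatibilityVersion completes: report the latest.
print(reported_max_wire_version("7.0"))
```

With a rule like this, a driver that gates a behavior on maxWireVersion would not enable it until the feature compatibility version has been upgraded.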

