[SERVER-84439] appendMajorityWriteConcern should only override wtimeout if it is non-zero Created: 28/Dec/23 Updated: 05/Jan/24 |
|
| Status: | Open |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Randolph Tan | Assignee: | Backlog - Service Architecture |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||||||||||
| Assigned Teams: |
Service Arch
|
||||||||||||||||
| Operating System: | ALL | ||||||||||||||||
| Sprint: | Service Arch Prioritized List | ||||||||||||||||
| Participants: | |||||||||||||||||
| Description |
|
wtimeout of zero has a special meaning that it's no timeout. So, logically it has a greater value than anything, but we use numerical comparisons in the places below: https://github.com/10gen/mongo/blob/231d6367ea3c7670a458ce00252d970979d7f5be/src/mongo/executor/async_rpc_util.h#L74 So, it ends up being treated as a smaller number. As a result, a no timeout setting can be overwritten with a smaller timeout. |
| Comments |
| Comment by Alex Li [ 05/Jan/24 ] |
|
To summarize context: The recent changes of the appendMajorityWriteConcern method itself preserved the logic that it had previously. The sendCommandToShards refactor changed using write concern without a timeout field ({ w: "majority" }) to using the file local kMajorityWriteConcern ({ w: "majority", wtimeout: 0 }). But because the appendMajorityWriteConcern method is used, the resharding local kMajorityWriteConcern is used, but the timeout field is rewritten to a default minimum ({ w: "majority", wtimeout: 60000 }). The appendMajorityWriteConcern method was not used before the refactor. To restate fixes: 1) Add to the minimum timeout if condition that if the timeout is 0, do not overwrite. 2) It was discussed that creating a new Timeout type may not fit other areas and could be overkill, but considering the wTimeout has both kNoTimeout{0} and kNoWaiting{-1}, it may be worth creating a WriteConcernOptions nested type to make no timeout and no waiting more obvious. 3) May be more work but IMO it could be worth changing this function's logic to be more simple (refactor other current uses as well). |
| Comment by Randolph Tan [ 29/Dec/23 ] |
|
What if we create a new type Timeout<Duration> that has the method hasTimeout and overloads the comparison operators so we won't make mistakes like this in the future? |