[SERVER-57772] Failpoints on mongos rewrite state change error codes in writeConcernError Created: 17/Jun/21 Updated: 29/Oct/23 Resolved: 29/Jun/21 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | 5.0.6, 5.1.0-rc0 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Kevin Albertson | Assignee: | Billy Donahue |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Backwards Compatibility: | Fully Compatible |
| Operating System: | ALL |
| Backport Requested: | v5.0, v4.4, v4.2, v4.0 |
| Steps To Reproduce: | Start a sharded cluster running version 5.0.0-alpha0-856-gf4e7955. Using the shell, configure a failpoint on the "insert" command with an error code that represents a server state change.
Results in:
The error code 91 was rewritten to 6 (HostUnreachable). |
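The shell commands from the original report are not preserved in this export. As a rough sketch only, the reproduction might look like the following, assuming the usual `failCommand` failpoint document shape. The helper names `buildFailCommand` and `codePreserved`, the collection name `coll`, and the `errmsg` text are hypothetical and chosen for illustration.

```javascript
// Hypothetical helper: builds a failCommand failpoint document that makes
// the named command return a writeConcernError with the given code once.
function buildFailCommand(commandName, code) {
  return {
    configureFailPoint: "failCommand",
    mode: { times: 1 },
    data: {
      failCommands: [commandName],
      writeConcernError: { code: code, errmsg: "injected by failpoint" },
    },
  };
}

// Hypothetical helper: true only when the reply carries a writeConcernError
// whose code is exactly the one that was configured.
function codePreserved(reply, expectedCode) {
  return (
    reply.writeConcernError !== undefined &&
    reply.writeConcernError.code === expectedCode
  );
}

// In a shell connected to mongos, one would then run something like:
//   db.adminCommand(buildFailCommand("insert", 91));
//   const reply = db.runCommand({ insert: "coll", documents: [{ x: 1 }] });
//   codePreserved(reply, 91);
```

With the behavior described in this ticket, the reply would carry code 6 instead of 91, so the check would return false.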
| Sprint: | Service Arch 2021-06-28, Service Arch 2021-07-12 |
| Participants: | |
| Description |
|
When a failpoint on mongos is configured to return a writeConcernError containing a state change error code, the code is rewritten to HostUnreachable (6). I think this is caused by the changes of [...].
State change errors indicate that the server has changed state (e.g. 91 = ShutdownInProgress or 10107 = NotWritablePrimary). Drivers document the state change errors they check for in the Server Discovery and Monitoring specification.
Drivers expect failpoints on mongos to return errors exactly as they are configured. This enables the test scenario of a mongos returning a state change error itself (instead of rewriting one from a backing mongod).
The writeConcernError case in particular affects only one test in the C driver and is easy to work around. This is not currently blocking driver tests. |
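To illustrate why the rewrite matters to drivers, here is a minimal sketch of a state change check on a reply's writeConcernError. The code list is an assumption based on the error codes named in the Server Discovery and Monitoring specification and should be verified against it. The function names are hypothetical and do not come from any driver.

```javascript
// Assumed list of state change error codes (verify against the SDAM spec).
const STATE_CHANGE_CODES = new Set([
  91,    // ShutdownInProgress
  189,   // PrimarySteppedDown
  10058, // LegacyNotPrimary
  10107, // NotWritablePrimary
  11600, // InterruptedAtShutdown
  11602, // InterruptedDueToReplStateChange
  13435, // NotPrimaryNoSecondaryOk
  13436, // NotPrimaryOrSecondary
]);

// Hypothetical helper: true when the code is in the assumed list above.
function isStateChangeError(code) {
  return STATE_CHANGE_CODES.has(code);
}

// Hypothetical helper: true when the reply has a writeConcernError whose
// code is a state change error.
function writeConcernErrorIsStateChange(reply) {
  const wce = reply.writeConcernError;
  return wce !== undefined && isStateChangeError(wce.code);
}
```

Under these assumptions, a reply carrying code 91 is classified as a state change error, while the same reply rewritten to code 6 is not, so a driver test that expects state change handling would not see it.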
| Comments |
| Comment by Githook User [ 08/Dec/21 ] |
|
Author: {'name': 'Billy Donahue', 'email': 'billy.donahue@mongodb.com', 'username': 'BillyDonahue'}
Message: (cherry picked from commit 7396af4803b0b9b729c457f54defca0c4c51b61f) |
| Comment by Vivian Ge (Inactive) [ 06/Oct/21 ] |
|
Updating the fixversion since branching activities occurred yesterday. This ticket will be in rc0 when it’s been triggered. For more active release information, please keep an eye on #server-release. Thank you! |
| Comment by Githook User [ 17/Sep/21 ] |
|
Author: {'name': 'Luis Osta', 'email': 'luis.osta@mongodb.com', 'username': 'LuisOsta'}
Message: |
| Comment by Githook User [ 29/Jun/21 ] |
|
Author: {'name': 'Billy Donahue', 'email': 'billy.donahue@mongodb.com', 'username': 'BillyDonahue'}
Message: |
| Comment by Billy Donahue [ 24/Jun/21 ] |
|
Code Review: https://mongodbcr.appspot.com/794680001/ |
| Comment by Billy Donahue [ 24/Jun/21 ] |
|
Oh there's an extra failCommand evaluation site I missed in the rebase to master! The fix is small. |
| Comment by Billy Donahue [ 23/Jun/21 ] |
|
This does seem to be unexpected behavior. I am wondering if the failCommand FailPoint was set on the mongos or on a mongod connected to it. |