[SERVER-36868] Update error code in the stepdown hook. Created: 24/Aug/18  Updated: 27/Oct/23  Resolved: 22/Dec/18

Status: Closed
Project: Core Server
Component/s: Testing Infrastructure
Affects Version/s: 4.0.1
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Robert Guo (Inactive) Assignee: Backlog - Server Tooling and Methods (STM) (Inactive)
Resolution: Gone away Votes: 0
Labels: stm
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Related
is related to SERVER-36817 replSetFreeze command run by stepdown... Closed
is related to SERVER-35031 ExceededTimeLimit (50) is reported in... Closed
Assigned Teams:
Server Tooling & Methods
Operating System: ALL
Participants:
Linked BF Score: 22

 Description   

The error code for exceeding timeouts was determined to be ambiguous and was changed in SERVER-35031. We should update places that expect these errors in resmoke to handle the new error code. Right now the only place is in the stepdown hook.

Note that there has been no change in pymongo throughout this process; it doesn't know about the new error code for ExceededTimeLimit and treats it as a generic OperationFailure. So the error handling code in stepdown.py should be changed to handle OperationFailure as well.

We should also check that the code for the OperationFailure is indeed for ExceededTimeLimit(262) and still bubble up other errors.
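
As an illustration of the check described above, here is a minimal sketch. It is not the actual code in stepdown.py: the helper name run_tolerating_time_limit and the constant EXCEEDED_TIME_LIMIT are hypothetical, and the fallback class exists only so the sketch runs where PyMongo is not installed. It relies on pymongo.errors.OperationFailure exposing the server error code through its code attribute.

```python
# Server error code for ExceededTimeLimit after SERVER-35031 (was 50 before).
EXCEEDED_TIME_LIMIT = 262

try:
    from pymongo.errors import OperationFailure
except ImportError:
    # Stand-in so this sketch runs without PyMongo; it mirrors only the
    # `code` attribute of the real exception class.
    class OperationFailure(Exception):
        def __init__(self, error, code=None):
            super().__init__(error)
            self.code = code


def run_tolerating_time_limit(command):
    """Call command() and report whether it completed.

    Returns True on success and False if the server answered with
    ExceededTimeLimit (262). Any other OperationFailure is re-raised
    so that unrelated errors still bubble up.
    """
    try:
        command()
    except OperationFailure as err:
        if err.code != EXCEEDED_TIME_LIMIT:
            raise
        return False
    return True
```

A caller in the hook could wrap the stepdown-related command in a zero-argument callable and pass it to this helper, treating a False result as a timeout that can be retried or ignored.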

Affects master and 4.0 branches for now, but based on comments in SERVER-35031, it will be backported to earlier branches later.



 Comments   
Comment by Jonathan Abrahams [ 28/Aug/18 ]

This is no longer required, based on the work done in SERVER-36817.

Comment by Robert Guo (Inactive) [ 25/Aug/18 ]

max.hirschhorn Definitely not. The reason the code for ExceededTimeLimit was changed in the first place was to avoid confusion with maxTimeMS errors. Exceeding maxTimeMS is the only legitimate reason for ExecutionTimeout errors; ExceededTimeLimit/262 errors, on the other hand, should not map to ExecutionTimeout.

Comment by Max Hirschhorn [ 24/Aug/18 ]

Max Hirschhorn We shouldn't need to. There has been no change to PyMongo in this process. The error code change is only on the server side.

Right, but couldn't there be a future version of PyMongo that maps error code 262 to the ExecutionTimeout exception?

Comment by Robert Guo (Inactive) [ 24/Aug/18 ]

max.hirschhorn We shouldn't need to. There has been no change to PyMongo in this process. The error code change is only on the server side. The server used to return error code 50, but now returns error code 262 after SERVER-35031.

So as long as the engineer is running a recent version of the server, we will only see 262, regardless of the version of pymongo.

Comment by Max Hirschhorn [ 24/Aug/18 ]

robert.guo, should we still handle the ExecutionTimeout exception in the stepdown thread? I wasn't sure what approach would support the possible versions of PyMongo that Server engineers might be running locally.

Generated at Thu Feb 08 04:44:19 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.