[SERVER-55761] Once "pthread_create failed" error is detected, mongos should be shut down gracefully Created: 02/Apr/21  Updated: 06/Dec/22

Status: Backlog
Project: Core Server
Component/s: None
Affects Version/s: 4.0.23
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Andrew Shuvalov (Inactive) Assignee: Backlog - Service Architecture
Resolution: Unresolved Votes: 0
Labels: sa-remove-fv-backlog-22
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
Assigned Teams:
Service Arch
Operating System: ALL
Participants:
Story Points: 4

 Description   

While reproducing the production incident, it was determined, and repeatedly confirmed, that once mongos gets its first "pthread_create failed" error, it never recovers to a working state. In some experiments, mongos survived in this dysfunctional state for over 20 minutes.

The proper behavior is to start a graceful shutdown: drop active sessions, refuse new connections, and exit once done or after 1 minute, whichever comes first. Please remember that starting a dedicated shutdown thread is impossible in this state, so the termination procedure must complete on an existing thread.
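
To make the requested behavior concrete, here is a minimal sketch of the control flow in standard C++. It is not mongos code: `ServerState`, `shutdownInline`, `spawnOrShutdown`, and the `dropOneSession` callback are hypothetical names introduced for illustration, and the thread-creation failure is modeled as a `std::system_error` thrown by the spawn callback. The point it illustrates is that the shutdown steps run on the calling thread and are bounded by a deadline.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <system_error>

// Hypothetical stand-in for server state; not a real mongos type.
struct ServerState {
    bool acceptingConnections = true;
    int activeSessions = 0;
    bool shutDown = false;
};

// Runs the shutdown steps on the calling thread, since creating a helper
// thread is exactly what just failed. Returns true if every session was
// dropped before the deadline, false if the deadline forced the exit.
inline bool shutdownInline(ServerState& s,
                           const std::function<void(ServerState&)>& dropOneSession,
                           std::chrono::milliseconds deadline) {
    assert(deadline.count() >= 0);
    using Clock = std::chrono::steady_clock;
    const auto stopAt = Clock::now() + deadline;

    s.acceptingConnections = false;  // refuse new connections first
    while (s.activeSessions > 0 && Clock::now() < stopAt) {
        dropOneSession(s);           // drop sessions until drained or timed out
    }
    s.shutDown = true;               // exit regardless of what remains
    return s.activeSessions == 0;
}

// Attempts to start a worker via `spawn`. If that fails with
// std::system_error (our model of "pthread_create failed"), falls back to
// the inline shutdown. Returns true if the worker started.
inline bool spawnOrShutdown(ServerState& s,
                            const std::function<void()>& spawn,
                            const std::function<void(ServerState&)>& dropOneSession,
                            std::chrono::milliseconds deadline = std::chrono::minutes(1)) {
    try {
        spawn();
        return true;
    } catch (const std::system_error&) {
        shutdownInline(s, dropOneSession, deadline);
        return false;
    }
}
```

In this sketch the 1 minute limit is the default for the `deadline` parameter; a caller would invoke `spawnOrShutdown` wherever it would otherwise create a thread directly. How session teardown and process exit map onto the real server's facilities is left open here.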

I apologize for not knowing whether the scope of this fix is too large for one ticket; I would like to know more about our termination facilities in mongos.

Mongod can be a separate ticket if needed.


Generated at Thu Feb 08 05:37:23 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.