- Type: Bug
- Resolution: Unresolved
- Priority: Major - P3
- None
- Affects Version/s: 4.0.23
- Component/s: None
- Labels:
- Service Arch
- ALL
- 4
While reproducing the production incident, we repeatedly observed that once mongos hits the first "pthread_create failed" error, it never recovers to a working state. In some experiments, mongos survived in this dysfunctional state for over 20 minutes.
The proper behavior is to start a graceful shutdown: drop active sessions, refuse new connections, and exit when done or after 1 minute, whichever comes first. Note that starting a dedicated shutdown thread is impossible at this point, since thread creation is what failed, so the termination procedure must complete in an existing thread.
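To make the requested behavior concrete, here is a minimal sketch of the idea, not the actual mongos code. `Session`, `Server`, `shutdownInline`, and `startWorkerOrShutdown` are hypothetical names invented for illustration. The sketch assumes thread-creation failure surfaces as `std::system_error`, which is what `std::thread` throws when the underlying thread cannot be started. The whole shutdown sequence runs on the calling thread and is bounded by a deadline.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <system_error>
#include <vector>

// Hypothetical stand-in for a client session; not a mongos type.
struct Session {
    bool open = true;
    void close() { open = false; }
};

// Hypothetical stand-in for the listener plus its session table.
struct Server {
    bool accepting = true;
    std::vector<Session> sessions;
};

enum class ShutdownResult { kDrained, kTimedOut };

// Runs the shutdown sequence on the calling thread: refuse new connections,
// then close sessions one by one until done or until the deadline passes.
ShutdownResult shutdownInline(Server& server, std::chrono::milliseconds limit) {
    assert(limit.count() >= 0);
    const auto deadline = std::chrono::steady_clock::now() + limit;
    server.accepting = false;  // stop taking new connections first
    for (auto& session : server.sessions) {
        if (std::chrono::steady_clock::now() >= deadline) {
            return ShutdownResult::kTimedOut;
        }
        session.close();
    }
    return ShutdownResult::kDrained;
}

// Calls spawnWorker, which is assumed to throw std::system_error when a
// thread cannot be created. On failure, shuts down in the current thread
// instead of starting a new one. Returns true if the worker was started.
bool startWorkerOrShutdown(Server& server,
                           const std::function<void()>& spawnWorker,
                           std::chrono::milliseconds limit) {
    try {
        spawnWorker();
        return true;
    } catch (const std::system_error&) {
        shutdownInline(server, limit);
        // A real process would exit here; the sketch returns so the
        // resulting state can be inspected.
        return false;
    }
}
```

In this sketch the 1 minute limit from the description would be passed as `std::chrono::minutes(1)`. The sketch only shows the control flow; how sessions are actually drained and how the process exits depends on the termination facilities mongos already has.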
I am not sure whether the scope of this fix is too large for one ticket; I would also like to learn more about the termination facilities in mongos.
Mongod can be a separate ticket if needed.