[SERVER-50228] Convert ThreadPool to use predicated waits Created: 10/Aug/20  Updated: 29/Oct/23  Resolved: 10/Sep/20

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 4.8.0

Type: Improvement
Priority: Major - P3
Reporter: Benjamin Caimano (Inactive)
Assignee: Billy Donahue
Resolution: Fixed
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
is related to SERVER-47554 Replica Set member suddenly stopped r... Closed
Backwards Compatibility: Fully Compatible
Sprint: Service arch 2020-09-21
Participants:
Linked BF Score: 114

 Description   

ThreadPool uses non-predicated cond_var waits here and here. In both these cases, we can convert existing predicates to be used with the waits. This will make it substantially harder to miss notifications.
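
For illustration, here is a minimal sketch of the difference between a bare wait and a predicated wait. It uses the standard library's std::condition_variable rather than the server's own wrappers, and TaskQueue and runDemo are hypothetical names invented for this example; none of this is the actual ThreadPool code or the committed change.

```cpp
#include <cassert>
#include <condition_variable>
#include <deque>
#include <mutex>
#include <thread>

// Hypothetical stand-in for a pool's pending-task queue (not MongoDB's ThreadPool).
class TaskQueue {
public:
    void push(int task) {
        {
            std::lock_guard<std::mutex> lk(mutex_);
            tasks_.push_back(task);
        }
        cv_.notify_one();
    }

    // Blocks until a task is available, then removes and returns it.
    int pop() {
        std::unique_lock<std::mutex> lk(mutex_);
        // A bare `cv_.wait(lk);` would return on any wakeup, including a
        // spurious one, and would block even if the task had been pushed
        // before the wait began. The predicated overload behaves like
        // `while (!pred()) cv_.wait(lk);`, so the condition is checked
        // before blocking and rechecked after every wakeup.
        cv_.wait(lk, [this] { return !tasks_.empty(); });
        assert(!tasks_.empty());
        int task = tasks_.front();
        tasks_.pop_front();
        return task;
    }

private:
    std::mutex mutex_;
    std::condition_variable cv_;
    std::deque<int> tasks_;
};

// Pushes 1..count from the calling thread while one consumer thread pops
// them; returns the sum of the values the consumer received.
int runDemo(int count) {
    TaskQueue q;
    int sum = 0;
    std::thread consumer([&] {
        for (int i = 0; i < count; ++i) {
            sum += q.pop();
        }
    });
    for (int i = 1; i <= count; ++i) {
        q.push(i);
    }
    consumer.join();  // join orders the consumer's writes before the read below
    return sum;
}
```

Because the predicate is evaluated under the lock before the thread blocks, a push that happens before pop is called is still observed, which is the "harder to miss notifications" property the description refers to.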



 Comments   
Comment by Anton Neznaienko [ 06/Apr/21 ]

I don't think that this issue fixes the replication hangs in https://jira.mongodb.org/browse/SERVER-47554

I've added a comment about an upstream glibc bug in 2.27+ that has not been fixed yet and seems to cause the same missed pthread_cond_wait notifications in other programs.

I'm not sure anything can be done on the MongoDB side unless glibc is fixed.

The only "easy" solution would be to use an OS that ships with glibc < 2.27, such as Debian 9 or CentOS 7.

Comment by Billy Donahue [ 05/Apr/21 ]

You called this a fix, so I want to confirm with you that this isn't actually a fix of anything. I mean there's no identified fault that this corrects. It's just a simplification made in hopes of eliminating hiding places for bugs.

Comment by venkataramans rama [ 05/Apr/21 ]

Hi Billy, 

We are running into an issue very similar to SERVER-47554 in 3.6 and 4.0, so we want to back-port this fix to 4.0 and 4.2.

Comment by Billy Donahue [ 05/Apr/21 ]

I don't think some of these changes will go in cleanly.
What would be the motivation for the backport?

Comment by venkataramans rama [ 05/Apr/21 ]

Can this be back-ported to 4.0 and 4.2? 

Comment by Githook User [ 11/Sep/20 ]

Author:

{'name': 'Billy Donahue', 'email': 'billy.donahue@mongodb.com', 'username': 'BillyDonahue'}

Message: SERVER-50228 ThreadPool predicate condvar wait

Comment by Githook User [ 10/Sep/20 ]

Author:

{'name': 'Billy Donahue', 'email': 'billy.donahue@mongodb.com', 'username': 'BillyDonahue'}

Message: SERVER-50228 ThreadPool predicate condvar wait
