[SERVER-56784] The replication thread of secondary hang up Created: 10/May/21 Updated: 27/Oct/23 Resolved: 16/May/21 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | 4.0.9, 4.0.19 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | FirstName lipengchong | Assignee: | Dmitry Agranat |
| Resolution: | Community Answered | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Operating System: | ALL | ||||||||
| Description |
|
Recently, we encountered a strange phenomenon |
| Comments |
| Comment by FirstName lipengchong [ 11/May/21 ] | ||||||||||||||||||||
|
wow, thanks very much. @Dima | ||||||||||||||||||||
| Comment by Dmitry Agranat [ 10/May/21 ] | ||||||||||||||||||||
|
Thanks lpc for proactively collecting stack traces and providing the rest of the information. Based on this information, we suspect this issue is related to the glibc bug (which is not related to MongoDB). This behavior has only manifested on systems whose glibc version is susceptible to this glibc pthread condition variable bug. In other words, this bug impacts glibc versions >= 2.27, and since your version is 2.28, you are impacted by this issue. Even though this bug is not related to MongoDB, we have created
Dima | ||||||||||||||||||||
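The version comparison described in the comment above can be checked on a host with a short script. This is only a sketch: the `>= 2.27` threshold is copied from that comment and is not independently verified here, and the helper names `parse_glibc_version` and `is_in_reported_range` are hypothetical. `platform.libc_ver()` is the standard-library call used to read the C library version.

```python
import platform

# Threshold taken from the comment above (">= 2.27"); treat it as an assumption.
AFFECTED_FROM = (2, 27)


def parse_glibc_version(text):
    """Turn a version string such as '2.28' into a comparable tuple (2, 28)."""
    major, minor = text.split(".")[:2]
    return int(major), int(minor)


def is_in_reported_range(version_text, threshold=AFFECTED_FROM):
    """Return True if the version is at or above the reported threshold."""
    return parse_glibc_version(version_text) >= threshold


if __name__ == "__main__":
    # libc_ver() inspects the running interpreter's C library.
    lib, version = platform.libc_ver()
    if lib == "glibc" and version:
        print(version, is_in_reported_range(version))
    else:
        print("glibc not detected on this host")
```

On the Debian 10.3 host mentioned in this ticket (glibc 2.28), `is_in_reported_range("2.28")` evaluates to true under the stated threshold.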
| Comment by FirstName lipengchong [ 10/May/21 ] | ||||||||||||||||||||
|
It's Debian 10.3; the glibc version is 2.28.
| ||||||||||||||||||||
| Comment by Dmitry Agranat [ 10/May/21 ] | ||||||||||||||||||||
|
Hi lpc, could you please provide the exact OS version as well as the glibc version for the MongoDB server in question? Thanks, | ||||||||||||||||||||
| Comment by FirstName lipengchong [ 10/May/21 ] | ||||||||||||||||||||
|
I am sorry that the formatting above is out of order.
From the above we can see that 16 replWriterThread threads are waiting for tasks, meaning they are idle,
but the batcher thread is blocked in waitForIdle, waiting for the repl writer threads.
So I guess there is a bug here, but I have not found its root cause.
|
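The handshake described in the last comment (workers parked waiting for tasks while another thread waits for the pool to become idle) can be illustrated with a toy pool. This is not MongoDB's implementation; the class `TinyPool` and its methods are hypothetical and written only to show the pattern. Both sides rely on condition-variable wakeups, so if a wakeup were lost at a lower layer, the `wait_for_idle` caller could stay blocked even though every worker is idle, which matches the symptom reported in this ticket.

```python
import threading
from collections import deque


class TinyPool:
    """Toy worker pool: workers wait for tasks, callers can wait for idleness."""

    def __init__(self, size):
        self._cv = threading.Condition()
        self._tasks = deque()
        self._active = 0  # number of tasks currently running
        self._threads = [
            threading.Thread(target=self._worker, daemon=True)
            for _ in range(size)
        ]
        for t in self._threads:
            t.start()

    def _worker(self):
        while True:
            with self._cv:
                # Idle state: parked here until a task is scheduled.
                while not self._tasks:
                    self._cv.wait()
                task = self._tasks.popleft()
                self._active += 1
            try:
                task()
            finally:
                with self._cv:
                    self._active -= 1
                    # Wake any thread blocked in wait_for_idle().
                    self._cv.notify_all()

    def schedule(self, task):
        with self._cv:
            self._tasks.append(task)
            self._cv.notify_all()

    def wait_for_idle(self, timeout=None):
        """Block until no task is queued or running; False on timeout."""
        with self._cv:
            return self._cv.wait_for(
                lambda: not self._tasks and self._active == 0, timeout
            )


def run_demo():
    pool = TinyPool(4)
    results = []
    for i in range(10):
        pool.schedule(lambda i=i: results.append(i))
    finished = pool.wait_for_idle(timeout=5)
    return finished, sorted(results)
```

Passing a timeout to `wait_for_idle` is a deliberate choice in this sketch: it turns a silent hang into a detectable `False` return.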