[SERVER-47775] LOGV2_FATAL failed to print stack trace Created: 24/Apr/20 Updated: 29/Oct/23 Resolved: 25/Jun/20 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | 4.4.0-rc0 |
| Fix Version/s: | 4.4.1, 4.7.0 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Siyuan Zhou | Assignee: | Amirsaman Memaripour |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
| Backwards Compatibility: | Fully Compatible |
| Operating System: | ALL |
| Backport Requested: | v4.4 |
| Sprint: | Service arch 2020-05-18, Service arch 2020-06-01, Service arch 2020-06-15, Service arch 2020-06-29 |
| Participants: |
| Linked BF Score: | 16 |
| Description |
|
In my patch build, I ran into a LOGV2_FATAL and expected it to print the stack trace, but it didn’t. Here's the log message:
|
| Comments |
| Comment by Githook User [ 18/Aug/20 ] | ||||||||||
|
Author: {'name': 'Amirsaman Memaripour', 'email': 'amirsaman.memaripour@mongodb.com', 'username': 'samanca'} Message: (cherry picked from commit 5b877cc1bc76fffc70261808c13807bb46ddc05b) | ||||||||||
| Comment by Githook User [ 18/Aug/20 ] | ||||||||||
|
Author: {'name': 'Amirsaman Memaripour', 'email': 'amirsaman.memaripour@mongodb.com', 'username': 'samanca'} Message: (cherry picked from commit 5b877cc1bc76fffc70261808c13807bb46ddc05b) | ||||||||||
| Comment by Amirsaman Memaripour [ 25/Jun/20 ] | ||||||||||
|
Closing the ticket as Fixed and moving follow-on work to | ||||||||||
| Comment by Amirsaman Memaripour [ 25/Jun/20 ] | ||||||||||
|
milkie, I created bruce.lucas, based on the information in | ||||||||||
| Comment by Eric Milkie [ 25/Jun/20 ] | ||||||||||
|
I don't think so. I managed to reproduce this reliably and was able to step through libunwind in the debugger. It seems to have trouble finding certain frames in the ELF information for certain libraries. My sense is that there is a conflict between the way some compilers write out the DWARF info for some code segments and the way libunwind parses that info. | ||||||||||
| Comment by Bruce Lucas (Inactive) [ 25/Jun/20 ] | ||||||||||
|
Any chance this is related to | ||||||||||
| Comment by Eric Milkie [ 25/Jun/20 ] | ||||||||||
|
Also, what is this ticket blocked on? It already has a commit, so it needs a real fixVersion and should be closed. | ||||||||||
| Comment by Eric Milkie [ 25/Jun/20 ] | ||||||||||
|
Can we backport the above commit into 4.4? | ||||||||||
| Comment by Githook User [ 05/Jun/20 ] | ||||||||||
|
Author: {'name': 'Amirsaman Memaripour', 'email': 'amirsaman.memaripour@mongodb.com', 'username': 'samanca'} Message: | ||||||||||
| Comment by Eric Milkie [ 03/Jun/20 ] | ||||||||||
|
I discovered that symbols can sometimes be missing for certain frames, depending on the ELF format. However, our current code gives up printing anything if even one frame is missing. Luckily, there is a one-line code fix to avoid this, and thus to still print out the portion of the stack trace that was collected, even if resolving some frames produced errors:
| ||||||||||
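The behavior described in the comment above (keep printing the frames that were collected even when some of them cannot be resolved) can be illustrated with a minimal sketch. This is not the actual server change and does not call libunwind. The names `format_stack_trace`, `demo_resolver`, and `UNRESOLVED`, along with the addresses and symbol names, are hypothetical and exist only for illustration.

```python
from typing import Callable, List, Optional

UNRESOLVED = "???"  # hypothetical placeholder for a frame with no symbol


def format_stack_trace(
    addresses: List[int],
    resolve: Callable[[int], Optional[str]],
) -> List[str]:
    """Return one line per frame, keeping frames whose lookup failed."""
    lines = []
    for index, address in enumerate(addresses):
        try:
            symbol = resolve(address)
        except LookupError:
            # A failed lookup is treated like a missing symbol, so the
            # frames collected so far are not thrown away.
            symbol = None
        lines.append(f"#{index} 0x{address:x} {symbol or UNRESOLVED}")
    return lines


def demo_resolver(address: int) -> Optional[str]:
    """Hypothetical resolver: knows two addresses and fails on the rest."""
    table = {0x1000: "main", 0x3000: "handler"}
    if address not in table:
        raise LookupError(f"no symbol info for 0x{address:x}")
    return table[address]


if __name__ == "__main__":
    for line in format_stack_trace([0x3000, 0x2000, 0x1000], demo_resolver):
        print(line)
```

With this approach the middle frame is reported with a placeholder while the frames around it are still printed, which is usually enough to locate the failing call site.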
| Comment by Eric Milkie [ 03/Jun/20 ] | ||||||||||
|
I just encountered this same problem in a local build. I suspect what happened is that my system ran out of free file handles, and the libunwind stack tracer needs to reopen files in order to search for symbols. Unfortunately, libunwind is written in a C style with only a limited number of error codes to return, so there are lots of places in its code that return this particular error status (UNW_ENOINFO). | ||||||||||
| Comment by Githook User [ 05/May/20 ] | ||||||||||
|
Author: {'name': 'Amirsaman Memaripour', 'email': 'amirsaman.memaripour@mongodb.com', 'username': 'samanca'} Message: |