[SERVER-82659] GDB crashes during debug. Created: 01/Nov/23  Updated: 10/Jan/24  Resolved: 02/Jan/24

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Peter Volk Assignee: William Qian
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Fix
Related
related to SERVER-84735 Add GDB regression test to prevent fr... Open
related to SERVER-83733 CQF in GDB causes secondary crashes d... In Progress
Assigned Teams:
Query Optimization
Backwards Compatibility: Fully Compatible
Operating System: ALL
Steps To Reproduce:

Reproducible at the following git hash: 7f29a2c9da5

1) Build mongod with the following configuration:

./buildscripts/scons.py --variables-files=etc/scons/mongodbtoolchain_stable_clang.vars --link-model=static --ninja ICECC=icecc CCACHE=ccache 

2) Run mongod with the following configuration:

"/home/ubuntu/mongo/build/install/bin/mongod --setParameter enableTestCommands=1 --setParameter 'logComponentVerbosity={'"'"'replication'"'"': {'"'"'rollback'"'"': 2}, '"'"'sharding'"'"': {'"'"'migration'"'"': 2, '"'"'rangeDeleter'"'"': 2}, '"'"'transaction'"'"': 4, '"'"'tenantMigration'"'"': 4}' --setParameter disableLogicalSessionCacheRefresh=true --setParameter coordinateCommitReturnImmediatelyAfterPersistingDecision=false --setParameter transactionLifetimeLimitSeconds=86400 --setParameter maxIndexBuildDrainBatchSize=10 --setParameter testingDiagnosticsEnabled=true --setParameter disableTransitionFromLatestToLastContinuous=false --dbpath=/data/db/job0/resmoke --port=27017 --enableMajorityReadConcern=True --storageEngine=wiredTiger --setParameter featureFlagTimeSeriesInSbe=true --setParameter featureFlagTimeseriesAlwaysUseCompressedBuckets=true --setParameter internalQueryFrameworkControl=forceBonsai --setParameter featureFlagCommonQueryFramework=true"

3) Attach gdb to the process

4) Set a breakpoint at path_utils.cpp:86, then have gdb continue the process (c)

5) Connect to the local mongod instance with the mongo shell and run the following command:

db.peter.find()
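
Steps 2 to 4 can be sketched as Python argument lists, which avoids the nested shell quoting around logComponentVerbosity. This is only a sketch under assumptions: the helper names (mongod_argv, gdb_argv) are hypothetical, the data directory is passed with mongod's --dbpath option, and the --setParameter options are grouped first, which is assumed not to change behavior. The gdb options used (-p to attach to a process, -ex to run a command) are standard gdb command-line options.

```python
import shlex

MONGOD = "/home/ubuntu/mongo/build/install/bin/mongod"

# Server parameters copied from step 2; each one is passed as
# "--setParameter name=value". No shell escaping is needed in a list.
SET_PARAMETERS = [
    "enableTestCommands=1",
    "logComponentVerbosity={'replication': {'rollback': 2}, "
    "'sharding': {'migration': 2, 'rangeDeleter': 2}, "
    "'transaction': 4, 'tenantMigration': 4}",
    "disableLogicalSessionCacheRefresh=true",
    "coordinateCommitReturnImmediatelyAfterPersistingDecision=false",
    "transactionLifetimeLimitSeconds=86400",
    "maxIndexBuildDrainBatchSize=10",
    "testingDiagnosticsEnabled=true",
    "disableTransitionFromLatestToLastContinuous=false",
    "featureFlagTimeSeriesInSbe=true",
    "featureFlagTimeseriesAlwaysUseCompressedBuckets=true",
    "internalQueryFrameworkControl=forceBonsai",
    "featureFlagCommonQueryFramework=true",
]


def mongod_argv(dbpath="/data/db/job0/resmoke", port=27017):
    """Build the mongod command line from step 2 as a list of arguments."""
    argv = [MONGOD]
    for param in SET_PARAMETERS:
        argv += ["--setParameter", param]
    argv += [
        f"--dbpath={dbpath}",
        f"--port={port}",
        "--enableMajorityReadConcern=True",
        "--storageEngine=wiredTiger",
    ]
    return argv


def gdb_argv(pid, location="path_utils.cpp:86"):
    """Attach gdb to a running process, set the breakpoint, and continue."""
    return ["gdb", "-p", str(pid), "-ex", f"break {location}", "-ex", "continue"]


if __name__ == "__main__":
    # shlex.join renders each list as a correctly quoted shell command.
    print(shlex.join(mongod_argv()))
    print(shlex.join(gdb_argv(12345)))  # 12345 is a placeholder PID
```

The lists can be passed directly to subprocess.Popen, or printed with shlex.join and pasted into a shell.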

 

Sprint: QO 2023-12-25, QO 2024-01-08
Participants:

 Description   

When I build the server as a static binary, attach gdb, and set a breakpoint with

b path_utils.cpp:86

and then run the statement

db.peter.find()

gdb crashes with the following error:

(gdb) c
Continuing.
[Switching to Thread 0xffff7a94ec00 (LWP 867266)]

Thread 33 "conn1" hit Breakpoint 1, _ZN5mongo9optimizer22decomposeToFilterNodesERKNS0_7algebra9PolyValueIJNS0_9BlackholeENS0_8ConstantENS0_8VariableENS0_7UnaryOpENS0_8BinaryOpENS0_2IfENS0_3LetENS0_17LambdaAbstractionENS0_17LambdaApplicationENS0_12FunctionCallENS0_8EvalPathENS0_10EvalFilterENS0_6SourceENS0_12PathConstantENS0_10PathLambdaENS0_12PathIdentityENS0_11PathDefaultENS0_11PathCompareENS0_8PathDropENS0_8PathKeepENS0_7PathObjENS0_7PathArrENS0_12PathTraverseENS0_9PathFieldENS0_7PathGetENS0_12PathComposeMENS0_12PathComposeAENS0_8ScanNodeENS0_16PhysicalScanNodeENS0_13ValueScanNodeENS0_10CoScanNodeENS0_13IndexScanNodeENS0_8SeekNodeENS0_24MemoLogicalDelegatorNodeENS0_25MemoPhysicalDelegatorNodeENS0_10FilterNodeENS0_14EvaluationNodeENS0_12SargableNodeENS0_16RIDIntersectNodeENS0_12RIDUnionNodeENS0_14BinaryJoinNodeENS0_12HashJoinNodeENS0_13MergeJoinNodeENS0_15SortedMergeNodeENS0_18NestedLoopJoinNodeENS0_9UnionNodeENS0_11GroupByNodeENS0_10UnwindNodeENS0_10UniqueNodeENS0_17SpoolProducerNodeENS0_17SpoolConsumerNodeENS0_13CollationNodeENS0_13LimitSkipNodeENS0_12ExchangeNodeENS0_8RootNodeENS0_10ReferencesENS0_16ExpressionBinderEEEES1Q_S1Q_mm (input=Scan["peter_4f5792fa-258e-4954-a02f-a60965984ae3", "p0"], path=PathConstantConstant["true"]
Fatal signal: Segmentation fault
----- Backtrace -----
0x622cab ???
0x709c47 ???
0x709d9b ???
0xffffaaa0e8db ???
0x878b08 ???
0x8e623f ???
0x725b5b ???
0x726713 ???
0x722653 ???
0x725307 ???
0x72552f ???
0x6c3173 ???
0x904b63 ???
0x9052df ???
0x9065cb ???
0x906bfb ???
0x76bf63 ???
0x97e8df ???
0x76db8f ???
0x77882f ???
0xb84743 ???
0xb84b3b ???
0x7b4a13 ???
0x7b61e3 ???
0x55be43 ???
0xffffaa2773fb __libc_start_call_main
        ../sysdeps/nptl/libc_start_call_main.h:58
0xffffaa2774cb __libc_start_main_impl
        ../csu/libc-start.c:392
0x56942f ???
0xffffffffffffffff ???
---------------------
A fatal error internal to GDB has been detected, further
debugging is not possible.  GDB will now terminate.

This is a bug, please report it.  For instructions, see:
<https://www.gnu.org/software/gdb/bugs/>.

Segmentation fault (core dumped)
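
The function name in the breakpoint message is printed in mangled form. As a reading aid, the leading namespace and function name can be pulled out with a few lines of Python. This is a minimal sketch, assuming the symbol follows the Itanium C++ ABI nested-name layout (_ZN followed by length-prefixed identifiers); the helper name leading_qualified_name is hypothetical, and it ignores template arguments, parameters, and substitutions. A full demangler such as c++filt is the better tool when the complete signature is needed.

```python
def leading_qualified_name(mangled: str) -> str:
    """Return the nested name that starts an Itanium-ABI mangled symbol.

    Only the leading run of length-prefixed identifiers after "_ZN" is
    decoded; everything after it (parameter types, templates) is ignored.
    """
    if not mangled.startswith("_ZN"):
        raise ValueError("expected a symbol starting with '_ZN'")
    pos = 3
    parts = []
    while pos < len(mangled) and mangled[pos].isdigit():
        # Read the decimal length prefix, then that many characters.
        end = pos
        while end < len(mangled) and mangled[end].isdigit():
            end += 1
        length = int(mangled[pos:end])
        parts.append(mangled[end:end + length])
        pos = end + length
    return "::".join(parts)


if __name__ == "__main__":
    # Prefix of the symbol from the gdb output above, shortened for display.
    symbol = "_ZN5mongo9optimizer22decomposeToFilterNodesERKNS0_7algebraE"
    print(leading_qualified_name(symbol))
```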
 

 

 

The generated core dump is available via a Google Drive link (size is ~1 GB).



 Comments   
Comment by William Qian [ 02/Jan/24 ]

The recommendation is to use a newer version of gdb.

Comment by William Qian [ 12/Dec/23 ]

Upon investigation, this is most likely a stack overflow within gdb itself. The difference between the $sp register values at the lowest and highest frames is around 8 MiB, which is the default stack size limit on Linux. The overflow appears to be caused by infinite recursion during unwinding, which is an unusual problem.
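
The stack-pointer comparison described in this comment can be sketched in Python. The addresses below are hypothetical placeholders, not values taken from the core dump; the 8 MiB limit is the default mentioned above, and the 64 KiB slack and the helper names are assumptions made for illustration.

```python
# Sketch: decide whether the span between two stack pointers suggests an
# exhausted stack. All addresses used here are hypothetical.

DEFAULT_STACK_LIMIT = 8 * 1024 * 1024  # 8 MiB, the default cited in the comment


def stack_span(sp_innermost: int, sp_outermost: int) -> int:
    """Bytes of stack between the innermost and outermost frames."""
    return abs(sp_outermost - sp_innermost)


def looks_like_stack_overflow(
    sp_innermost: int,
    sp_outermost: int,
    limit: int = DEFAULT_STACK_LIMIT,
    slack: int = 64 * 1024,
) -> bool:
    """True when the stack in use is within `slack` bytes of `limit`, or beyond it."""
    return stack_span(sp_innermost, sp_outermost) >= limit - slack


if __name__ == "__main__":
    outermost = 0xFFFFF0000000  # hypothetical $sp at the outermost frame
    innermost = outermost - DEFAULT_STACK_LIMIT + 4096  # hypothetical $sp at the crash
    print(stack_span(innermost, outermost) / (1024 * 1024), "MiB in use")
    print(looks_like_stack_overflow(innermost, outermost))
```

The two inputs could be read from a gdb session on the core dump by selecting each frame (frame N) and printing $sp.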

Generated at Thu Feb 08 06:49:54 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.