[SERVER-17281] Segfault during id_hack FSM test
Created: 13/Feb/15  Updated: 18/Sep/15  Resolved: 17/Feb/15

Status: Closed
Project: Core Server
Component/s: Concurrency, Querying, Replication
Affects Version/s: 3.1.0
Fix Version/s: 3.0.0-rc9, 3.1.0

Type: Bug
Priority: Critical - P2
Reporter: Charlie Swanson
Assignee: David Storch
Resolution: Done
Votes: 0
Labels: 28qa
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Completed:
Steps To Reproduce:

Reproduces occasionally when run locally. The easiest way to run the same test is to modify fsm_all_replication.js so that it includes only fsm_workloads/yield_id_hack.js.

Participants:

 Description   

The FSM test fails in a replicated environment with the segmentation fault shown below. The test interleaves updates, deletes, inserts, and queries across multiple threads. The queries are specifically on _id so that they exercise the ID_HACK query stage.

m31000| 2015-02-12T22:17:57.703+0000 F -        [conn730] Invalid access at address: 0
 m31000| 2015-02-12T22:17:57.711+0000 F -        [conn730] Got signal: 11 (Segmentation fault).
 m31000| 
 m31000|  0xf5af59 0xf5a5d2 0xf5a92e 0x2b7a07481ca0 0x91406b 0xa46fd4 0x91c1c2 0x9149d2 0xa0a130 0xbe14c4 0xbe1944 0xbe24cd 0xabe3c1 0xac28ab 0x80d4b2 0xf1975b 0x2b7a0747983d 0x2b7a08303fcd
 m31000| ----- BEGIN BACKTRACE -----
 m31000| {"backtrace":[{"b":"400000","o":"B5AF59"},{"b":"400000","o":"B5A5D2"},{"b":"400000","o":"B5A92E"},{"b":"2B7A07473000","o":"ECA0"},{"b":"400000","o":"51406B"},{"b":"400000","o":"646FD4"},{"b":"400000","o":"51C1C2"},{"b":"400000","o":"5149D2"},{"b":"400000","o":"60A130"},{"b":"400000","o":"7E14C4"},{"b":"400000","o":"7E1944"},{"b":"400000","o":"7E24CD"},{"b":"400000","o":"6BE3C1"},{"b":"400000","o":"6C28AB"},{"b":"400000","o":"40D4B2"},{"b":"400000","o":"B1975B"},{"b":"2B7A07473000","o":"683D"},{"b":"2B7A0822F000","o":"D4FCD"}],"processInfo":{ "mongodbVersion" : "3.1.0-pre-", "gitVersion" : "6ca5a81e340f96502bc5f530a8a6fa0d44fea052", "uname" : { "sysname" : "Linux", "release" : "2.6.18-194.el5xen", "version" : "#1 SMP Tue Mar 16 22:01:26 EDT 2010", "machine" : "x86_64" }, "somap" : [ { "elfType" : 2, "b" : "400000" }, { "b" : "2B7A07473000", "path" : "/lib64/libpthread.so.0", "elfType" : 3 }, { "b" : "2B7A0768F000", "path" : "/lib64/librt.so.1", "elfType" : 3 }, { "b" : "2B7A07898000", "path" : "/lib64/libdl.so.2", "elfType" : 3 }, { "b" : "2B7A07A9D000", "path" : "/usr/lib64/libstdc++.so.6", "elfType" : 3 }, { "b" : "2B7A07D9D000", "path" : "/lib64/libm.so.6", "elfType" : 3 }, { "b" : "2B7A08020000", "path" : "/lib64/libgcc_s.so.1", "elfType" : 3 }, { "b" : "2B7A0822F000", "path" : "/lib64/libc.so.6", "elfType" : 3 }, { "b" : "2B7A07255000", "path" : "/lib64/ld-linux-x86-64.so.2", "elfType" : 3 } ] }}
 m31000|  mongod(_ZN5mongo15printStackTraceERSo+0x29) [0xf5af59]
 m31000|  mongod(+0xB5A5D2) [0xf5a5d2]
 m31000|  mongod(+0xB5A92E) [0xf5a92e]
 m31000|  libpthread.so.0(+0xECA0) [0x2b7a07481ca0]
 m31000|  mongod(_ZNK5mongo10Collection6docForEPNS_16OperationContextERKNS_8RecordIdE+0x3B) [0x91406b]
 m31000|  mongod(_ZN5mongo16WorkingSetCommon21fetchAndInvalidateLocEPNS_16OperationContextEPNS_16WorkingSetMemberEPKNS_10CollectionE+0x54) [0xa46fd4]
 m31000|  mongod(_ZN5mongo13CursorManager18invalidateDocumentEPNS_16OperationContextERKNS_8RecordIdENS_16InvalidationTypeE+0x62) [0x91c1c2]
 m31000|  mongod(_ZN5mongo10Collection14deleteDocumentEPNS_16OperationContextERKNS_8RecordIdEbbPNS_7BSONObjE+0x92) [0x9149d2]
 m31000|  mongod(_ZN5mongo11DeleteStage4workEPm+0x2E0) [0xa0a130]
 m31000|  mongod(_ZN5mongo12PlanExecutor18getNextSnapshottedEPNS_11SnapshottedINS_7BSONObjEEEPNS_8RecordIdE+0xB4) [0xbe14c4]
 m31000|  mongod(_ZN5mongo12PlanExecutor7getNextEPNS_7BSONObjEPNS_8RecordIdE+0x34) [0xbe1944]
 m31000|  mongod(_ZN5mongo12PlanExecutor11executePlanEv+0x3D) [0xbe24cd]
 m31000|  mongod(_ZN5mongo14receivedDeleteEPNS_16OperationContextERNS_7MessageERNS_5CurOpE+0x3F1) [0xabe3c1]
 m31000|  mongod(_ZN5mongo16assembleResponseEPNS_16OperationContextERNS_7MessageERNS_10DbResponseERKNS_11HostAndPortEb+0x51B) [0xac28ab]
 m31000|  mongod(_ZN5mongo16MyMessageHandler7processERNS_7MessageEPNS_21AbstractMessagingPortEPNS_9LastErrorE+0xE2) [0x80d4b2]
 m31000|  mongod(_ZN5mongo17PortMessageServer17handleIncomingMsgEPv+0x32B) [0xf1975b]
 m31000|  libpthread.so.0(+0x683D) [0x2b7a0747983d]
 m31000|  libc.so.6(clone+0x6D) [0x2b7a08303fcd]
 m31000| -----  END BACKTRACE  -----



 Comments   
Comment by Githook User [ 17/Feb/15 ]

Author: Jason Rassi (jrassi) <rassi@10gen.com>

Message: SERVER-17281 IDHackStage::invalidate() dereference correct OpCtx ptr
Branch: master
https://github.com/mongodb/mongo/commit/48e7b856f2b336537ca560ae9ab1740a293b53b9

Comment by Githook User [ 17/Feb/15 ]

Author: Jason Rassi (jrassi) <rassi@10gen.com>

Message: SERVER-17281 IDHackStage::invalidate() dereference correct OpCtx ptr

(cherry picked from commit 48e7b856f2b336537ca560ae9ab1740a293b53b9)
Branch: v3.0
https://github.com/mongodb/mongo/commit/5363aaf371338880c1f9518a6cc86817cc92d3b9

Comment by David Storch [ 17/Feb/15 ]

This issue affects MMAP v1 only. IDHackStage::invalidate() uses the wrong OperationContext pointer when it force-fetches invalidated documents. It attempts to use the stage's own OperationContext pointer, but that pointer was set to NULL on saveState. Instead, it should use the pointer passed as an argument to invalidate().

I audited the query execution stages to see if any other stage suffers from the same bug, but I did not find any problems.
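
For readers unfamiliar with the pattern, the pointer mix-up described above can be sketched roughly as follows. This is a simplified illustration, not the actual server code: SketchOpCtx, SketchIdHackStage, and both ctxUsedBy... methods are hypothetical names, and restoreState() taking a context argument is an assumption made for the example. Only the choice of pointer is modeled; the methods return the pointer that invalidate() would dereference rather than dereferencing it.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for an operation context (not mongo::OperationContext).
struct SketchOpCtx {
    std::string label;
};

// Hypothetical, heavily simplified stage; only the pointer handling is modeled.
class SketchIdHackStage {
public:
    explicit SketchIdHackStage(SketchOpCtx* opCtx) : _opCtx(opCtx) {}

    // On saveState the stage drops its own context pointer, as the comment describes.
    void saveState() { _opCtx = nullptr; }

    // Assumption for this sketch: the stage is handed a context again on restore.
    void restoreState(SketchOpCtx* opCtx) { _opCtx = opCtx; }

    // Buggy pattern: ignores the argument and picks the stage's own pointer,
    // which is null while the stage is in the saved state.
    SketchOpCtx* ctxUsedByBuggyInvalidate(SketchOpCtx* /*callerOpCtx*/) const {
        return _opCtx;
    }

    // Fixed pattern: picks the context passed in by the caller of invalidate().
    SketchOpCtx* ctxUsedByFixedInvalidate(SketchOpCtx* callerOpCtx) const {
        assert(callerOpCtx != nullptr);
        return callerOpCtx;
    }

private:
    SketchOpCtx* _opCtx;  // null between saveState() and restoreState()
};
```

In this sketch, if a stage has saved its state and another operation then calls invalidate() with its own context, the buggy variant selects a null pointer (dereferencing it would match the "Invalid access at address: 0" in the log), while the fixed variant selects the caller's valid context.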

Generated at Thu Feb 08 03:43:53 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.