[SERVER-17435] Invariant failure when killing user ops during stepdown Created: 02/Mar/15  Updated: 19/May/17  Resolved: 06/Mar/15

Status: Closed
Project: Core Server
Component/s: Concurrency, Replication
Affects Version/s: 3.0.0-rc11
Fix Version/s: 3.0.16, 3.1.0

Type: Bug Priority: Major - P3
Reporter: Kamran K. Assignee: Eric Milkie
Resolution: Done Votes: 0
Labels: 28qa, UT
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Related
is related to SERVER-15310 kill all operations before attempting... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Completed:
Participants:
Linked BF Score: 0

 Description   

During stepdown, user operations are killed (SERVER-15310). Killing these ops can occasionally trigger an invariant failure when a client's op is not found.

This bug may be related to SERVER-16506.

Invariant failure found src/mongo/db/global_environment_d.cpp 252
 
#0  0x00007f6315a4720b in raise (sig=5) at ../nptl/sysdeps/unix/sysv/linux/pt-raise.c:37
#1  0x00000000018ce2d2 in mongo::breakpoint () at src/mongo/util/debugger.cpp:63
#2  0x00000000018c3166 in mongo::invariantFailed (expr=0x1fc12ba "found", file=0x1fc1178 "src/mongo/db/global_environment_d.cpp", line=252) at src/mongo/util/assert_util.cpp:147
#3  0x000000000141a6da in mongo::GlobalEnvironmentMongoD::killAllUserOperations (this=0x3b37c60, txn=0x7f62eddcf7d0) at src/mongo/db/global_environment_d.cpp:252
#4  0x000000000160be23 in mongo::repl::ReplicationCoordinatorExternalStateImpl::killAllUserOperations (this=0x3e65900, txn=0x7f62eddcf7d0) at src/mongo/db/repl/replication_coordinator_external_state_impl.cpp:236
#5  0x0000000001612e5d in mongo::repl::ReplicationCoordinatorImpl::stepDown (this=0x3b6cd00, txn=0x7f62eddcf7d0, force=true, waitTime=..., stepdownTime=...) at src/mongo/db/repl/replication_coordinator_impl.cpp:1029
#6  0x0000000001651ed9 in mongo::repl::CmdReplSetStepDown::run (this=0x2882b40 <mongo::repl::cmdReplSetStepDown>, txn=0x7f62eddcf7d0, cmdObj=..., errmsg=..., result=..., fromRepl=false) at src/mongo/db/repl/replset_commands.cpp:428
#7  0x0000000001339fe9 in mongo::_execCommand (txn=0x7f62eddcf7d0, c=0x2882b40 <mongo::repl::cmdReplSetStepDown>, dbname=..., cmdObj=..., queryOptions=0, errmsg=..., result=..., fromRepl=false) at src/mongo/db/dbcommands.cpp:1317
#8  0x000000000133af66 in mongo::Command::execCommand (txn=0x7f62eddcf7d0, c=0x2882b40 <mongo::repl::cmdReplSetStepDown>, queryOptions=0, cmdns=0x786a814 "admin.$cmd", cmdObj=..., result=..., fromRepl=false) at src/mongo/db/dbcommands.cpp:1533
#9  0x000000000133b845 in mongo::_runCommands (txn=0x7f62eddcf7d0, ns=0x786a814 "admin.$cmd", _cmdobj=..., b=..., anObjBuilder=..., fromRepl=false, queryOptions=0) at src/mongo/db/dbcommands.cpp:1605
#10 0x000000000153f784 in mongo::runCommands (txn=0x7f62eddcf7d0, ns=0x786a814 "admin.$cmd", jsobj=..., curop=..., b=..., anObjBuilder=..., fromRepl=false, queryOptions=0) at src/mongo/db/query/find.cpp:137
#11 0x00000000015417ac in mongo::runQuery (txn=0x7f62eddcf7d0, m=..., q=..., nss=..., curop=..., result=..., fromDBDirectClient=false) at src/mongo/db/query/find.cpp:606
#12 0x00000000014475be in mongo::receivedQuery (txn=0x7f62eddcf7d0, c=..., dbresponse=..., m=..., fromDBDirectClient=false) at src/mongo/db/instance.cpp:220
#13 0x000000000144876e in mongo::assembleResponse (txn=0x7f62eddcf7d0, m=..., dbresponse=..., remote=..., fromDBDirectClient=false) at src/mongo/db/instance.cpp:403
#14 0x0000000001142ef8 in mongo::MyMessageHandler::process (this=0x3b04130, m=..., port=0x3e43700, le=0x3e41db0) at src/mongo/db/db.cpp:206
#15 0x00000000018ec4c6 in mongo::PortMessageServer::handleIncomingMsg (arg=0x3e43700) at src/mongo/util/net/message_server_port.cpp:229
#16 0x00007f6315a3f182 in start_thread (arg=0x7f62eddd0700) at pthread_create.c:312
#17 0x00007f6314b4047d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
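
The backtrace shows invariant(found) failing inside killAllUserOperations(), and the fix commit recorded in the comments is titled "do not abort if opid changes while killing all ops". The following is a minimal sketch of that idea under stated assumptions, not the actual server code: OpRegistry, Op, snapshot(), killOp() and killAllTolerant() are hypothetical names invented for illustration. It models a kill loop that treats "op not found under the expected opid" as a benign miss instead of a fatal condition.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a registered operation; not MongoDB's real type.
struct Op {
    std::uint64_t opId;
    bool killed;
};

// Hypothetical registry mapping a client name to its current operation.
class OpRegistry {
public:
    // Starting a new op replaces the client's previous op (the opid changes).
    void startOp(const std::string& client, std::uint64_t opId) {
        _ops[client] = Op{opId, false};
    }

    // Record the (client, opid) pairs that a kill-all pass intends to target.
    std::vector<std::pair<std::string, std::uint64_t>> snapshot() const {
        std::vector<std::pair<std::string, std::uint64_t>> out;
        for (const auto& entry : _ops) {
            out.emplace_back(entry.first, entry.second.opId);
        }
        return out;
    }

    // Kill the client's op only if it still carries the expected opid.
    // Returns false when the client is gone or has moved on to a new op.
    bool killOp(const std::string& client, std::uint64_t expectedOpId) {
        auto it = _ops.find(client);
        if (it == _ops.end() || it->second.opId != expectedOpId) {
            return false;
        }
        it->second.killed = true;
        return true;
    }

    bool isKilled(const std::string& client) const {
        auto it = _ops.find(client);
        return it != _ops.end() && it->second.killed;
    }

private:
    std::map<std::string, Op> _ops;
};

// Tolerant kill-all: a miss is counted rather than asserted on.
// Returns the number of targets that were no longer found.
int killAllTolerant(OpRegistry& reg,
                    const std::vector<std::pair<std::string, std::uint64_t>>& targets) {
    int missed = 0;
    for (const auto& target : targets) {
        if (!reg.killOp(target.first, target.second)) {
            ++missed;  // the opid changed after the snapshot; not an error
        }
    }
    return missed;
}
```

In this sketch, a client that starts a new op between snapshot() and killAllTolerant() simply shows up in the returned miss count; an assertion on the return value of killOp() at that point would reproduce the kind of abort reported here.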



 Comments   
Comment by Githook User [ 19/May/17 ]

Author: Eric Milkie (milkie) <milkie@10gen.com>

Message: SERVER-17435 do not abort if opid changes while killing all ops in killAllUserOperations()

(cherry picked from commit eb9785c12b4b88d76e00321440c8d635f296448a)
Branch: v3.0
https://github.com/mongodb/mongo/commit/98fdade3b05d4221d74cc07381dccace732abf48

Comment by Githook User [ 06/Mar/15 ]

Author: Eric Milkie (milkie) <milkie@10gen.com>

Message: SERVER-17435 do not abort if opid changes while killing all ops in killAllUserOperations()
Branch: master
https://github.com/mongodb/mongo/commit/eb9785c12b4b88d76e00321440c8d635f296448a
