[SERVER-17403] logOp fassert during stagedebug delete on stepped down primary Created: 27/Feb/15  Updated: 19/Sep/15  Resolved: 27/Feb/15

Status: Closed
Project: Core Server
Component/s: Write Ops
Affects Version/s: 3.0.0-rc11
Fix Version/s: 3.0.0, 3.1.0

Type: Bug Priority: Major - P3
Reporter: Robert Guo (Inactive) Assignee: J Rassi
Resolution: Done Votes: 0
Labels: 28qa
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Windows Server 2012 R2


Issue Links:
Related
is related to SERVER-17413 Various improvements to stageDebug co... Open
Backwards Compatibility: Fully Compatible
Operating System: Windows
Sprint: Quint 3.1.0
Participants:

 Description   

2015-02-26T15:05:50.892-0800 F REPL     [conn96] replSet error : logOp() but can't accept write to collection test.stages_delete
2015-02-26T15:05:50.892-0800 I -        [conn96] Fatal Assertion 17405
2015-02-26T15:05:51.304-0800 I CONTROL  [conn96] mongod.exe    ...\src\mongo\util\stacktrace_win.cpp(175)                       mongo::printStackTrace+0x43
2015-02-26T15:05:51.304-0800 I CONTROL  [conn96] mongod.exe    ...\src\mongo\util\log.cpp(135)                                  mongo::logContext+0x8f
2015-02-26T15:05:51.304-0800 I CONTROL  [conn96] mongod.exe    ...\src\mongo\util\assert_util.cpp(165)                          mongo::fassertFailed+0x77
2015-02-26T15:05:51.304-0800 I CONTROL  [conn96] mongod.exe    ...\src\mongo\db\repl\oplog.cpp(238)                             mongo::repl::`anonymous namespace'::_logOpRS+0x2b5
2015-02-26T15:05:51.304-0800 I CONTROL  [conn96] mongod.exe    ...\src\mongo\db\repl\oplog.cpp(380)                             mongo::repl::logOp+0x8f
2015-02-26T15:05:51.304-0800 I CONTROL  [conn96] mongod.exe    ...\src\mongo\db\exec\delete.cpp(158)                            mongo::DeleteStage::work+0x31e
2015-02-26T15:05:51.304-0800 I CONTROL  [conn96] mongod.exe    ...\src\mongo\db\exec\fetch.cpp(94)                              mongo::FetchStage::work+0xea
2015-02-26T15:05:51.304-0800 I CONTROL  [conn96] mongod.exe    ...\src\mongo\db\query\plan_executor.cpp(328)                    mongo::PlanExecutor::getNextSnapshotted+0xdb
2015-02-26T15:05:51.304-0800 I CONTROL  [conn96] mongod.exe    ...\src\mongo\db\query\plan_executor.cpp(286)                    mongo::PlanExecutor::getNext+0x41
2015-02-26T15:05:51.304-0800 I CONTROL  [conn96] mongod.exe    ...\src\mongo\db\exec\stagedebug_cmd.cpp(163)                    mongo::StageDebugCmd::run+0x42d
2015-02-26T15:05:51.304-0800 I CONTROL  [conn96] mongod.exe    ...\src\mongo\db\dbcommands.cpp(1317)                            mongo::_execCommand+0x72
2015-02-26T15:05:51.304-0800 I CONTROL  [conn96] mongod.exe    ...\src\mongo\db\dbcommands.cpp(1535)                            mongo::Command::execCommand+0xd7c
2015-02-26T15:05:51.304-0800 I CONTROL  [conn96] mongod.exe    ...\src\mongo\db\dbcommands.cpp(1607)                            mongo::_runCommands+0x434
2015-02-26T15:05:51.304-0800 I CONTROL  [conn96] mongod.exe    ...\src\mongo\db\query\find.cpp(137)                             mongo::runCommands+0x43
2015-02-26T15:05:51.304-0800 I CONTROL  [conn96] mongod.exe    ...\src\mongo\db\query\find.cpp(606)                             mongo::runQuery+0x3b3
2015-02-26T15:05:51.304-0800 I CONTROL  [conn96] mongod.exe    ...\src\mongo\db\instance.cpp(220)                               mongo::receivedQuery+0x378
2015-02-26T15:05:51.304-0800 I CONTROL  [conn96] mongod.exe    ...\src\mongo\db\instance.cpp(403)                               mongo::assembleResponse+0x350
2015-02-26T15:05:51.304-0800 I CONTROL  [conn96] mongod.exe    ...\src\mongo\db\db.cpp(206)                                     mongo::MyMessageHandler::process+0xbd
2015-02-26T15:05:51.304-0800 I CONTROL  [conn96] mongod.exe    ...\src\mongo\util\net\message_server_port.cpp(231)              mongo::PortMessageServer::handleIncomingMsg+0x553
2015-02-26T15:05:51.304-0800 I CONTROL  [conn96] mongod.exe    ...\src\third_party\boost\libs\thread\src\win32\thread.cpp(185)  boost::`anonymous namespace'::thread_start_function+0x21
2015-02-26T15:05:51.304-0800 I CONTROL  [conn96] mongod.exe    f:\dd\vctools\crt\crtw32\startup\threadex.c(376)                 _callthreadstartex+0x17
2015-02-26T15:05:51.304-0800 I CONTROL  [conn96] mongod.exe    f:\dd\vctools\crt\crtw32\startup\threadex.c(354)                 _threadstartex+0x102
2015-02-26T15:05:51.304-0800 I CONTROL  [conn96] KERNEL32.DLL                                                                   BaseThreadInitThunk+0xd
2015-02-26T15:05:51.304-0800 I CONTROL  [conn96]
2015-02-26T15:05:51.304-0800 I -        [conn96]
 
***aborting after fassert() failure

hash: 7007e0e571b859f3aed79a61c6821cc552fb2685



 Comments   
Comment by Githook User [ 27/Feb/15 ]

Author:

{u'username': u'jrassi', u'name': u'Jason Rassi', u'email': u'rassi@10gen.com'}

Message: SERVER-17403 Fix StageDebugCmd command registration

(cherry picked from commit 7d588bd39adc253e0b32df0049569888d392159b)
Branch: v3.0
https://github.com/mongodb/mongo/commit/d849d158c195d9f3e43090111306968cc3f435d8

Comment by Githook User [ 27/Feb/15 ]

Author:

{u'username': u'jrassi', u'name': u'Jason Rassi', u'email': u'rassi@10gen.com'}

Message: SERVER-17403 Fix StageDebugCmd command registration
Branch: master
https://github.com/mongodb/mongo/commit/7d588bd39adc253e0b32df0049569888d392159b

Comment by J Rassi [ 27/Feb/15 ]

The cause of this issue is that StageDebugCmd::slaveOk() should return false, but it returns true. This allows the user to run a write operation (via manual configuration of a DeleteStage with stageDebug) on a replica set member in state SECONDARY, which triggers the fassert (so, to be clear: this is not a concurrency-related issue). And, Andy: the delete stage was added during the 2.7 dev cycle, so this issue doesn't affect 2.6.

"stageDebug" is a testing-only command, and it's an oversight that it's not actually registered as a test command. I can't even think of a good reason for this command to exist: test coverage for stages lives in dbtests/query_stage_*.cpp, and the the tests in jstests/core/stages_*.js appears to largely duplicate this coverage. After fixing the immediate issue with slaveOk(), I propose we create a separate ticket to merge stages_*.js into query_stage_*.cpp (assuming the latter is missing any coverage that the former has), and then remove the "stageDebug" command outright. david.storch, do you agree with this proposal?

Generated at Thu Feb 08 03:44:19 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.