[SERVER-16599] copydb/clone commands can crash the server if a primary steps down
Created: 18/Dec/14
Updated: 25/Feb/15
Resolved: 11/Feb/15

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: 2.6.6
Fix Version/s: 2.6.8

Type: Bug
Priority: Major - P3
Reporter: Eric Milkie
Assignee: Benety Goh
Resolution: Done
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
is related to SERVER-17129 logOp fassert when creating namespace... Closed
Tested
Backwards Compatibility: Fully Compatible
Operating System: ALL
Participants:

 Description   

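The copydb and clone commands can yield, but they do not re-check whether the node is still primary after returning from the yield. If the primary steps down while the command is yielded, the command continues and calls logOp() on a node that is no longer primary, which triggers a fatal assertion and aborts the server.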
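
The missing check can be illustrated with a minimal sketch. This is not the actual server code or the committed fix: ReplState, CopyResult, copyWithPrimaryCheck and the integer "documents" are hypothetical stand-ins, assumed here only to show the pattern of re-checking primary state after every yield before anything is written to the oplog.

```cpp
#include <atomic>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for the node's replication state.
struct ReplState {
    std::atomic<bool> isPrimary{true};
};

// Hypothetical result type: the command fails cleanly instead of asserting.
struct CopyResult {
    bool ok;
    std::string reason;
    int docsCopied;
};

// Copies documents batch by batch. `yield` models the point where the
// command gives up its lock; a stepdown may happen while it is yielded.
CopyResult copyWithPrimaryCheck(const std::vector<std::vector<int>>& batches,
                                ReplState& repl,
                                const std::function<void()>& yield,
                                std::vector<int>& oplog) {
    int copied = 0;
    for (const auto& batch : batches) {
        yield();  // replication state may change here

        // Re-check primary state after returning from the yield, and stop
        // with an error rather than logging an op as a non-primary.
        if (!repl.isPrimary.load()) {
            return {false, "not primary after yield", copied};
        }

        for (int doc : batch) {
            oplog.push_back(doc);  // stands in for the insert plus logOp()
            ++copied;
        }
    }
    return {true, "", copied};
}
```

In this sketch a stepdown during a yield makes the command return an error to the client with the work done so far left in place, instead of reaching the logging step on a secondary.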



 Comments   
Comment by Benety Goh [ 11/Feb/15 ]

The fix in master was completed as part of SERVER-17179:

https://github.com/mongodb/mongo/commit/d4ab26fcfb6d4dc64d7067fd0b5ba08090cf5c59#diff-034210b898e7af60d56a402d87e9559b

Comment by Githook User [ 11/Feb/15 ]

Author: Benety Goh (benety) <benety@mongodb.com>

Message: SERVER-16599 do not proceed with copydb/clone if primary steps down
Branch: v2.6
https://github.com/mongodb/mongo/commit/4d9728816460fcd7579071c0f64d0c97032a246d

Comment by Benety Goh [ 11/Feb/15 ]

This is the current fix in master:

https://github.com/mongodb/mongo/commit/d4ab26fcfb6d4dc64d7067fd0b5ba08090cf5c59#diff-034210b898e7af60d56a402d87e9559bR129

The change to be applied to 2.6 will look quite different.

Comment by J Rassi [ 18/Dec/14 ]

Example log output:

2014-12-18T15:09:39.391-0500 [rsHealthPoll] warning: Failed to connect to 127.0.1.1:27018, reason: errno:111 Connection refused
2014-12-18T15:09:39.391-0500 [rsHealthPoll] replset info rassi:27018 heartbeat failed, retrying
2014-12-18T15:09:39.395-0500 [rsHealthPoll] warning: Failed to connect to 127.0.1.1:27018, reason: errno:111 Connection refused
2014-12-18T15:09:39.396-0500 [rsHealthPoll] replSet info rassi:27018 is down (or slow to respond):
2014-12-18T15:09:39.396-0500 [rsHealthPoll] replSet member rassi:27018 is now in state DOWN
2014-12-18T15:09:39.396-0500 [rsMgr] can't see a majority of the set, relinquishing primary
2014-12-18T15:09:40.312-0500 [rsMgr] replSet relinquishing primary state
2014-12-18T15:09:40.313-0500 [rsMgr] replSet SECONDARY
2014-12-18T15:09:40.313-0500 [rsMgr] replSet closing client sockets after relinquishing primary
2014-12-18T15:09:40.313-0500 [conn8] replSet error : logOp() but not primary
2014-12-18T15:09:40.313-0500 [conn8] test.foo Fatal Assertion 17405
2014-12-18T15:09:40.315-0500 [initandlisten] connection accepted from 127.0.0.1:38535 #12 (4 connections now open)
2014-12-18T15:09:40.324-0500 [conn8] test.foo 0x11e9b11 0x118b849 0x116e37d 0xe5b1aa 0xe53a0e 0x912498 0x7c5936 0x908683 0x90f197 0x9168f3 0xa2939a 0xa2a7e2 0xa2c9a6 0xd5f83a 0xba1052 0xba2630 0x770d5f 0x119f93e 0x7f7a72aeb182 0x7f7a71df030d
 /home/rassi/mongodb-linux-x86_64-2.6.5/bin/mongod(_ZN5mongo15printStackTraceERSo+0x21) [0x11e9b11]
 /home/rassi/mongodb-linux-x86_64-2.6.5/bin/mongod(_ZN5mongo10logContextEPKc+0x159) [0x118b849]
 /home/rassi/mongodb-linux-x86_64-2.6.5/bin/mongod(_ZN5mongo13fassertFailedEi+0xcd) [0x116e37d]
 /home/rassi/mongodb-linux-x86_64-2.6.5/bin/mongod() [0xe5b1aa]
 /home/rassi/mongodb-linux-x86_64-2.6.5/bin/mongod(_ZN5mongo5logOpEPKcS1_RKNS_7BSONObjEPS2_PbbPS3_+0xee) [0xe53a0e]
 /home/rassi/mongodb-linux-x86_64-2.6.5/bin/mongod(_ZN5mongo6Cloner3FunclERNS_27DBClientCursorBatchIteratorE+0x3e8) [0x912498]
 /home/rassi/mongodb-linux-x86_64-2.6.5/bin/mongod(_ZN5mongo18DBClientConnection5queryEN5boost8functionIFvRNS_27DBClientCursorBatchIteratorEEEERKSsNS_5QueryEPKNS_7BSONObjEi+0x2b6) [0x7c5936]
 /home/rassi/mongodb-linux-x86_64-2.6.5/bin/mongod(_ZN5mongo6Cloner4copyERNS_6Client7ContextEPKcS5_bbbbbbNS_5QueryE+0x3a3) [0x908683]
 /home/rassi/mongodb-linux-x86_64-2.6.5/bin/mongod(_ZN5mongo6Cloner14copyCollectionERKSsRKNS_7BSONObjERSsbbbb+0x9f7) [0x90f197]
 /home/rassi/mongodb-linux-x86_64-2.6.5/bin/mongod(_ZN5mongo18CmdCloneCollection3runERKSsRNS_7BSONObjEiRSsRNS_14BSONObjBuilderEb+0x953) [0x9168f3]
 /home/rassi/mongodb-linux-x86_64-2.6.5/bin/mongod(_ZN5mongo12_execCommandEPNS_7CommandERKSsRNS_7BSONObjEiRSsRNS_14BSONObjBuilderEb+0x3a) [0xa2939a]
 /home/rassi/mongodb-linux-x86_64-2.6.5/bin/mongod(_ZN5mongo7Command11execCommandEPS0_RNS_6ClientEiPKcRNS_7BSONObjERNS_14BSONObjBuilderEb+0x1042) [0xa2a7e2]
 /home/rassi/mongodb-linux-x86_64-2.6.5/bin/mongod(_ZN5mongo12_runCommandsEPKcRNS_7BSONObjERNS_11_BufBuilderINS_16TrivialAllocatorEEERNS_14BSONObjBuilderEbi+0x6c6) [0xa2c9a6]
 /home/rassi/mongodb-linux-x86_64-2.6.5/bin/mongod(_ZN5mongo11newRunQueryERNS_7MessageERNS_12QueryMessageERNS_5CurOpES1_+0x230a) [0xd5f83a]
 /home/rassi/mongodb-linux-x86_64-2.6.5/bin/mongod() [0xba1052]
 /home/rassi/mongodb-linux-x86_64-2.6.5/bin/mongod(_ZN5mongo16assembleResponseERNS_7MessageERNS_10DbResponseERKNS_11HostAndPortE+0x580) [0xba2630]
 /home/rassi/mongodb-linux-x86_64-2.6.5/bin/mongod(_ZN5mongo16MyMessageHandler7processERNS_7MessageEPNS_21AbstractMessagingPortEPNS_9LastErrorE+0x9f) [0x770d5f]
 /home/rassi/mongodb-linux-x86_64-2.6.5/bin/mongod(_ZN5mongo17PortMessageServer17handleIncomingMsgEPv+0x4ee) [0x119f93e]
 /lib/x86_64-linux-gnu/libpthread.so.0(+0x8182) [0x7f7a72aeb182]
 /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d) [0x7f7a71df030d]
2014-12-18T15:09:40.324-0500 [conn8]
 
***aborting after fassert() failure
 
 
2014-12-18T15:09:40.332-0500 [conn8] SEVERE: Got signal: 6 (Aborted).
Backtrace:0x11e9b11 0x11e8eee 0x7f7a71d2bff0 0x7f7a71d2bf79 0x7f7a71d2f388 0x116e3ea 0xe5b1aa 0xe53a0e 0x912498 0x7c5936 0x908683 0x90f197 0x9168f3 0xa2939a 0xa2a7e2 0xa2c9a6 0xd5f83a 0xba1052 0xba2630 0x770d5f
 /home/rassi/mongodb-linux-x86_64-2.6.5/bin/mongod(_ZN5mongo15printStackTraceERSo+0x21) [0x11e9b11]
 /home/rassi/mongodb-linux-x86_64-2.6.5/bin/mongod() [0x11e8eee]
 /lib/x86_64-linux-gnu/libc.so.6(+0x36ff0) [0x7f7a71d2bff0]
 /lib/x86_64-linux-gnu/libc.so.6(gsignal+0x39) [0x7f7a71d2bf79]
 /lib/x86_64-linux-gnu/libc.so.6(abort+0x148) [0x7f7a71d2f388]
 /home/rassi/mongodb-linux-x86_64-2.6.5/bin/mongod(_ZN5mongo13fassertFailedEi+0x13a) [0x116e3ea]
 /home/rassi/mongodb-linux-x86_64-2.6.5/bin/mongod() [0xe5b1aa]
 /home/rassi/mongodb-linux-x86_64-2.6.5/bin/mongod(_ZN5mongo5logOpEPKcS1_RKNS_7BSONObjEPS2_PbbPS3_+0xee) [0xe53a0e]
 /home/rassi/mongodb-linux-x86_64-2.6.5/bin/mongod(_ZN5mongo6Cloner3FunclERNS_27DBClientCursorBatchIteratorE+0x3e8) [0x912498]
 /home/rassi/mongodb-linux-x86_64-2.6.5/bin/mongod(_ZN5mongo18DBClientConnection5queryEN5boost8functionIFvRNS_27DBClientCursorBatchIteratorEEEERKSsNS_5QueryEPKNS_7BSONObjEi+0x2b6) [0x7c5936]
 /home/rassi/mongodb-linux-x86_64-2.6.5/bin/mongod(_ZN5mongo6Cloner4copyERNS_6Client7ContextEPKcS5_bbbbbbNS_5QueryE+0x3a3) [0x908683]
 /home/rassi/mongodb-linux-x86_64-2.6.5/bin/mongod(_ZN5mongo6Cloner14copyCollectionERKSsRKNS_7BSONObjERSsbbbb+0x9f7) [0x90f197]
 /home/rassi/mongodb-linux-x86_64-2.6.5/bin/mongod(_ZN5mongo18CmdCloneCollection3runERKSsRNS_7BSONObjEiRSsRNS_14BSONObjBuilderEb+0x953) [0x9168f3]
 /home/rassi/mongodb-linux-x86_64-2.6.5/bin/mongod(_ZN5mongo12_execCommandEPNS_7CommandERKSsRNS_7BSONObjEiRSsRNS_14BSONObjBuilderEb+0x3a) [0xa2939a]
 /home/rassi/mongodb-linux-x86_64-2.6.5/bin/mongod(_ZN5mongo7Command11execCommandEPS0_RNS_6ClientEiPKcRNS_7BSONObjERNS_14BSONObjBuilderEb+0x1042) [0xa2a7e2]
 /home/rassi/mongodb-linux-x86_64-2.6.5/bin/mongod(_ZN5mongo12_runCommandsEPKcRNS_7BSONObjERNS_11_BufBuilderINS_16TrivialAllocatorEEERNS_14BSONObjBuilderEbi+0x6c6) [0xa2c9a6]
 /home/rassi/mongodb-linux-x86_64-2.6.5/bin/mongod(_ZN5mongo11newRunQueryERNS_7MessageERNS_12QueryMessageERNS_5CurOpES1_+0x230a) [0xd5f83a]
 /home/rassi/mongodb-linux-x86_64-2.6.5/bin/mongod() [0xba1052]
 /home/rassi/mongodb-linux-x86_64-2.6.5/bin/mongod(_ZN5mongo16assembleResponseERNS_7MessageERNS_10DbResponseERKNS_11HostAndPortE+0x580) [0xba2630]
 /home/rassi/mongodb-linux-x86_64-2.6.5/bin/mongod(_ZN5mongo16MyMessageHandler7processERNS_7MessageEPNS_21AbstractMessagingPortEPNS_9LastErrorE+0x9f) [0x770d5f]
