[SERVER-14809] crash in replicaset when running resync while there's no primary
Created: 06/Aug/14
Updated: 19/Nov/14
Resolved: 04/Nov/14

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: 2.6.3
Fix Version/s: 2.8.0-rc0

Type: Bug
Priority: Major - P3
Reporter: Ramon Fernandez Marina
Assignee: Spencer Brody (Inactive)
Resolution: Done
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
related to SERVER-15931 Repeated "[ReplicationExecutor] could... Closed
is related to SERVER-6552 make resync command work with replica... Closed
Tested
Backwards Compatibility: Fully Compatible
Operating System: ALL
Steps To Reproduce:

Initialize a replica set, then run the following on the primary:

db.getSiblingDB("admin").runCommand( { replSetStepDown: 120 , force: true })
db.getSiblingDB("admin").runCommand({resync:1})
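
The two commands above can be wrapped in a small helper so the sequence is easy to rerun. This is only a sketch: the function name reproduceResyncWithoutPrimary and the shape of its return value are hypothetical, `db` is assumed to be a mongo shell database handle connected to the current primary, and the try/catch assumes the shell raises an error when the stepdown drops the connection. The isMaster check between the two steps is an addition, not part of the original report; it only records whether the node still sees a primary before resync is issued.

```javascript
// Hypothetical helper (not from the ticket): forces a stepdown, records
// whether a primary is still visible, then issues resync on the same node.
function reproduceResyncWithoutPrimary(db, stepDownSecs) {
  var admin = db.getSiblingDB("admin");
  var result = { stepDownThrew: false, primary: null, resync: null };

  try {
    // Same command as in the report; `force: true` steps down even if no
    // secondary is caught up.
    admin.runCommand({ replSetStepDown: stepDownSecs, force: true });
  } catch (e) {
    // Assumption: the stepdown closes client connections, so the shell may
    // surface an error here; the next command is expected to reconnect.
    result.stepDownThrew = true;
  }

  // Record which primary (if any) the node reports. The `primary` field is
  // absent from the isMaster reply when no primary is known.
  var im = admin.runCommand({ isMaster: 1 });
  result.primary = (im && im.primary) ? im.primary : null;

  // Same resync command as in the report, issued right after the stepdown.
  result.resync = admin.runCommand({ resync: 1 });
  return result;
}
```

In the shell this would be called as reproduceResyncWithoutPrimary(db, 120). A null `primary` in the returned object indicates that resync was issued while the node saw no primary, which is the condition named in the title.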
 


 Description   

In a replica set, if I force a stepdown of the primary and immediately run resync on that same node, the node segfaults:

2014-08-06T16:03:49.877-0400 [rsSync] oplog sync 1 of 3
2014-08-06T16:03:49.877-0400 [rsSync] SEVERE: Invalid access at address: 0x2900676ac6
2014-08-06T16:03:49.881-0400 [rsSync] SEVERE: Got signal: 11 (Segmentation fault: 11).
Backtrace:0x1006c012b 0x1006bfe8f 0x1006bff92 0x7fff8e8cd5aa 0x1333c42a0 0x100476d8c 0x1004aaf78 0x10049a9d3 0x10049b86f 0x10049c931 0x1004af95e 0x1004afa6e 0x1004afdb8 0x1006f4575 0x7fff88328899 0x7fff8832872a 0x7fff8832cfc9 
 0   mongod                              0x00000001006c012b _ZN5mongo15printStackTraceERSo + 43
 1   mongod                              0x00000001006bfe8f _ZN5mongo12_GLOBAL__N_110abruptQuitEi + 191
 2   mongod                              0x00000001006bff92 _ZN5mongo12_GLOBAL__N_124abruptQuitWithAddrSignalEiP9__siginfoPv + 210
 3   libsystem_platform.dylib            0x00007fff8e8cd5aa _sigtramp + 26
 4   ???                                 0x00000001333c42a0 0x0 + 5154554528
 5   mongod                              0x0000000100476d8c _ZN5mongo11_logOpObjRSERKNS_7BSONObjE + 620
 6   mongod                              0x00000001004aaf78 _ZN5mongo7replset11InitialSync16oplogApplicationERKNS_7BSONObjES4_ + 242
 7   mongod                              0x000000010049a9d3 _ZN5mongo11ReplSetImpl30_syncDoInitialSync_applyToHeadERNS_7replset8SyncTailEPNS_11OplogReaderEPKNS_6MemberERKNS_7BSONObjERS9_ + 1425
 8   mongod                              0x000000010049b86f _ZN5mongo11ReplSetImpl18_syncDoInitialSyncEv + 1891
 9   mongod                              0x000000010049c931 _ZN5mongo11ReplSetImpl17syncDoInitialSyncEv + 703
 10  mongod                              0x00000001004af95e _ZN5mongo11ReplSetImpl11_syncThreadEv + 292
 11  mongod                              0x00000001004afa6e _ZN5mongo11ReplSetImpl10syncThreadEv + 218
 12  mongod                              0x00000001004afdb8 _ZN5mongo15startSyncThreadEv + 168
 13  mongod                              0x00000001006f4575 thread_proxy + 229
 14  libsystem_pthread.dylib             0x00007fff88328899 _pthread_body + 138
 15  libsystem_pthread.dylib             0x00007fff8832872a _pthread_struct_init + 0
 16  libsystem_pthread.dylib             0x00007fff8832cfc9 thread_start + 13



 Comments   
Comment by Spencer Brody (Inactive) [ 31/Oct/14 ]

I can't repro this crash on 2.6.5, so I can't verify if 2.7.9 has fixed it...
