[SERVER-13718] Secondary crashes on replSetUpdatePosition command Created: 24/Apr/14  Updated: 10/Dec/14  Resolved: 24/Apr/14

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: 2.6.0
Fix Version/s: None

Type: Bug
Priority: Major - P3
Reporter: Keith Grennan
Assignee: Matt Dannenberg
Resolution: Duplicate
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
related to SERVER-13500 Changing replica set configuration ca... Closed
Operating System: ALL
Steps To Reproduce:

Recently updated to 2.6.0; running a 3-node replica set on Amazon Linux 2014.03 in EC2.
Cannot reproduce; the crash has been seen only once so far.

Participants:

 Description   

2014-04-24T07:39:28.154+0000 [clientcursormon] mem (MB) res:2586 virt:25934
2014-04-24T07:39:28.154+0000 [clientcursormon] mapped (incl journal view):24790
2014-04-24T07:39:28.154+0000 [clientcursormon] connections:12
2014-04-24T07:39:28.154+0000 [clientcursormon] replication threads:32
2014-04-24T07:39:41.401+0000 [conn11813] end connection 172.31.27.158:36369 (11 connections now open)
2014-04-24T07:39:42.401+0000 [initandlisten] connection accepted from 172.31.27.158:36372 #11815 (12 connections now open)
2014-04-24T07:39:43.171+0000 [conn11814] end connection 172.31.25.124:51934 (11 connections now open)
2014-04-24T07:39:43.173+0000 [initandlisten] connection accepted from 172.31.25.124:51936 #11816 (12 connections now open)
2014-04-24T07:40:13.232+0000 [conn11816] end connection 172.31.25.124:51936 (11 connections now open)
2014-04-24T07:40:13.233+0000 [initandlisten] connection accepted from 172.31.25.124:51941 #11817 (12 connections now open)
2014-04-24T07:40:14.721+0000 [conn11815] end connection 172.31.27.158:36372 (11 connections now open)
2014-04-24T07:40:14.766+0000 [initandlisten] connection accepted from 172.31.27.158:36373 #11818 (12 connections now open)
2014-04-24T07:40:17.737+0000 [initandlisten] connection accepted from 172.31.27.158:36374 #11819 (13 connections now open)
2014-04-24T07:40:17.740+0000 [initandlisten] connection accepted from 172.31.27.158:36375 #11820 (14 connections now open)
2014-04-24T07:40:17.961+0000 [conn11820] command admin.$cmd command: replSetUpdatePosition { replSetUpdatePosition: 1, handshake: { handshake: ObjectId('5356154fbb51f8260687495c'), member: 7, config: { _id: 7, host: "ip-172-31-27-158.us-west-1.compute.internal:27017" } } } ntoreturn:1 keyUpdates:0 numYields:0 reslen:37 220ms
2014-04-24T07:40:17.962+0000 [conn11820] replset couldn't find a slave with id 0, not tracking 5327dc6fb079ebf3069b005a
2014-04-24T07:40:18.006+0000 [SyncSourceFeedbackThread] SEVERE: Invalid access at address: 0xa8
2014-04-24T07:40:18.087+0000 [SyncSourceFeedbackThread] SEVERE: Got signal: 11 (Segmentation fault).
Backtrace:0x11bd301 0x11bc6de 0x11bc7cf 0x7eff2605b5b0 0xeacaf6 0xeb19e8 0x1145332 0x1201c99 0x7eff26053f18 0x7eff25369e0d
/usr/bin/mongod(_ZN5mongo15printStackTraceERSo+0x21) [0x11bd301]
/usr/bin/mongod() [0x11bc6de]
/usr/bin/mongod() [0x11bc7cf]
/lib64/libpthread.so.0(+0xf5b0) [0x7eff2605b5b0]
/usr/bin/mongod(_ZN5mongo18SyncSourceFeedback13replHandshakeEv+0xb86) [0xeacaf6]
/usr/bin/mongod(_ZN5mongo18SyncSourceFeedback3runEv+0x9b8) [0xeb19e8]
/usr/bin/mongod(_ZN5mongo13BackgroundJob7jobBodyEv+0xd2) [0x1145332]
/usr/bin/mongod() [0x1201c99]
/lib64/libpthread.so.0(+0x7f18) [0x7eff26053f18]
/lib64/libc.so.6(clone+0x6d) [0x7eff25369e0d]
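
The mangled frame names above can be made readable without extra tooling. The following is a minimal sketch, not part of the original report: the names `FRAME_RE`, `parse_frames`, and `demangle_nested_name` are hypothetical, and the decoder handles only the simple `_ZN<len><name>...E` nested-name form that appears in this trace, ignoring the parameter encoding after the `E`. For anything more general, a real demangler such as `c++filt` is the better choice.

```python
import re

# Frame lines copied from the backtrace in this report.
FRAMES = """\
/usr/bin/mongod(_ZN5mongo15printStackTraceERSo+0x21) [0x11bd301]
/usr/bin/mongod() [0x11bc6de]
/lib64/libpthread.so.0(+0xf5b0) [0x7eff2605b5b0]
/usr/bin/mongod(_ZN5mongo18SyncSourceFeedback13replHandshakeEv+0xb86) [0xeacaf6]
/usr/bin/mongod(_ZN5mongo18SyncSourceFeedback3runEv+0x9b8) [0xeb19e8]
/usr/bin/mongod(_ZN5mongo13BackgroundJob7jobBodyEv+0xd2) [0x1145332]
/lib64/libc.so.6(clone+0x6d) [0x7eff25369e0d]
"""

# Hypothetical pattern for lines shaped like: module(symbol+offset) [address]
FRAME_RE = re.compile(
    r"^(?P<module>\S+?)\((?P<symbol>[^+)]*)(?:\+(?P<offset>0x[0-9a-f]+))?\)"
    r" \[(?P<addr>0x[0-9a-f]+)\]$"
)


def demangle_nested_name(symbol):
    """Decode only the _ZN<len><name>...E prefix of an Itanium-mangled symbol.

    Parameter types after the closing E are ignored. Any symbol that does
    not have this simple shape is returned unchanged.
    """
    if not symbol.startswith("_ZN"):
        return symbol
    pos = 3
    parts = []
    while pos < len(symbol) and symbol[pos].isdigit():
        end = pos
        while end < len(symbol) and symbol[end].isdigit():
            end += 1
        length = int(symbol[pos:end])
        parts.append(symbol[end:end + length])
        pos = end + length
    if not parts or pos >= len(symbol) or symbol[pos] != "E":
        return symbol
    return "::".join(parts)


def parse_frames(text):
    """Return one dict per recognizable frame line, with a readable name."""
    frames = []
    for line in text.splitlines():
        match = FRAME_RE.match(line.strip())
        if match is None:
            continue
        frame = match.groupdict()
        frame["name"] = demangle_nested_name(frame["symbol"])
        frames.append(frame)
    return frames


if __name__ == "__main__":
    for frame in parse_frames(FRAMES):
        print(frame["addr"], frame["module"], frame["name"] or "<no symbol>")
```

Read this way, the first named frame below the libpthread signal frame is `mongo::SyncSourceFeedback::replHandshake`, called from `mongo::SyncSourceFeedback::run`, which matches the `[SyncSourceFeedbackThread]` tag on the SEVERE log lines.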



 Comments   
Comment by Keith Grennan [ 24/Apr/14 ]

OK, I'll wait for 2.6.1. Thanks!

Comment by Matt Dannenberg [ 24/Apr/14 ]

I've confirmed that this is a duplicate of SERVER-13500, which has been fixed and is in 2.6.1-rc0. Take a look at SERVER-13500 for more details about the problem, what impact it may have, and how it's been fixed.

Thanks,
Matt
