[SERVER-12357] duplicate key error -- crashes whole replicaset Created: 14/Jan/14  Updated: 10/Dec/14  Resolved: 02/Jun/14

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: 2.4.8
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: stiwan chinazki Assignee: Ramon Fernandez Marina
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

MongoDB 2.4.8
Ubuntu 12.04 LTS, 64bit
32GB RAM
3 identical machines on a LAN running a replica set


Issue Links:
Duplicate
is duplicated by SERVER-13219 Fatal Assertion 16360 Closed
is duplicated by SERVER-11638 Primary crashes because of duplicate ... Closed
Operating System: Linux
Participants:

 Description   

I got this error in my log, and not just one node crashed but all nodes in the replica set.
At the time I was writing a large number of documents (up to 1 million).

Tue Jan 14 09:40:30.942 [repl writer worker 3] ERROR: writer worker caught exception: E11000 duplicate key error index: competition.competition.$listingid_1  dup key: { : "2kKN9CzZ" } on: { ts: Timestamp 1389688830000|338, h: -3327769536912367897, v: 2, op: "i", ns: "competition.competition", o: { _id: ObjectId('52d4f7fe810b54b66ced39a2'), listingid: "2kKN9CzZ", results: [ { somefields: "foobar", bla": "bla" } ], lastchanged: 1389688830 } }
Tue Jan 14 09:40:30.942 [repl writer worker 3]   Fatal Assertion 16360
0xde05e1 0xda03d3 0xc28f3c 0xdadf21 0xe28e69 0x7f8d283cfe9a 0x7f8d276e23fd
 /usr/bin/mongod(_ZN5mongo15printStackTraceERSo+0x21) [0xde05e1]
 /usr/bin/mongod(_ZN5mongo13fassertFailedEi+0xa3) [0xda03d3]
 /usr/bin/mongod(_ZN5mongo7replset14multiSyncApplyERKSt6vectorINS_7BSONObjESaIS2_EEPNS0_8SyncTailE+0x12c) [0xc28f3c]
 /usr/bin/mongod(_ZN5mongo10threadpool6Worker4loopEv+0x281) [0xdadf21]
 /usr/bin/mongod() [0xe28e69]
 /lib/x86_64-linux-gnu/libpthread.so.0(+0x7e9a) [0x7f8d283cfe9a]
 /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d) [0x7f8d276e23fd]
Tue Jan 14 09:40:30.990 [repl writer worker 3]
 
***aborting after fassert() failure
 
 
Tue Jan 14 09:40:30.994 Got signal: 6 (Aborted).
 
Tue Jan 14 09:40:30.996 Backtrace:
0xde05e1 0x6d0559 0x7f8d276244a0 0x7f8d27624425 0x7f8d27627b8b 0xda040e 0xc28f3c 0xdadf21 0xe28e69 0x7f8d283cfe9a 0x7f8d276e23fd
 /usr/bin/mongod(_ZN5mongo15printStackTraceERSo+0x21) [0xde05e1]
 /usr/bin/mongod(_ZN5mongo10abruptQuitEi+0x399) [0x6d0559]
 /lib/x86_64-linux-gnu/libc.so.6(+0x364a0) [0x7f8d276244a0]
 /lib/x86_64-linux-gnu/libc.so.6(gsignal+0x35) [0x7f8d27624425]
 /lib/x86_64-linux-gnu/libc.so.6(abort+0x17b) [0x7f8d27627b8b]
 /usr/bin/mongod(_ZN5mongo13fassertFailedEi+0xde) [0xda040e]
 /usr/bin/mongod(_ZN5mongo7replset14multiSyncApplyERKSt6vectorINS_7BSONObjESaIS2_EEPNS0_8SyncTailE+0x12c) [0xc28f3c]
 /usr/bin/mongod(_ZN5mongo10threadpool6Worker4loopEv+0x281) [0xdadf21]
 /usr/bin/mongod() [0xe28e69]
 /lib/x86_64-linux-gnu/libpthread.so.0(+0x7e9a) [0x7f8d283cfe9a]
 /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d) [0x7f8d276e23fd]
 
 
 
***** SERVER RESTARTED *****
 
 
Tue Jan 14 09:43:57.506 [initandlisten] MongoDB starting : pid=17961 port=27017 dbpath=/db/mongodb/ 64-bit host=REMOVED
Tue Jan 14 09:43:57.506 [initandlisten] db version v2.4.8
Tue Jan 14 09:43:57.506 [initandlisten] git version: a350fc38922fbda2cec8d5dd842237b904eafc14
Tue Jan 14 09:43:57.506 [initandlisten] build info: Linux ip-10-2-29-40 2.6.21.7-2.ec2.v1.2.fc8xen #1 SMP Fri Nov 20 17:48:28 EST 2009 x86_64 BOOST_LIB_VERSION=1_49
Tue Jan 14 09:43:57.506 [initandlisten] allocator: tcmalloc
Tue Jan 14 09:43:57.506 [initandlisten] options: { bind_ip: "REMOVED", config: "/etc/mongodb.conf", dbpath: "/db/mongodb/", fork: "true", logappend: "true", logpath: "/var/log/mongodb/mongodb.log", nojournal: "true", port: 27017, replSet: "rs0" }
**************
Unclean shutdown detected.
Please visit http://dochub.mongodb.org/core/repair for recovery instructions.
*************
Tue Jan 14 09:43:57.518 [initandlisten] exception in initAndListen: 12596 old lock file, terminating
Tue Jan 14 09:43:57.518 dbexit:
Tue Jan 14 09:43:57.518 [initandlisten] shutdown: going to close listening sockets...
Tue Jan 14 09:43:57.518 [initandlisten] shutdown: going to flush diaglog...
Tue Jan 14 09:43:57.518 [initandlisten] shutdown: going to close sockets...
Tue Jan 14 09:43:57.518 [initandlisten] shutdown: waiting for fs preallocator...
Tue Jan 14 09:43:57.518 [initandlisten] shutdown: closing all files...
Tue Jan 14 09:43:57.518 [initandlisten] closeAllFiles() finished
Tue Jan 14 09:43:57.518 dbexit: really exiting now
 
 
***** SERVER RESTARTED *****
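
When triaging a log like the one above, it can help to pull the namespace, index name, and offending key out of the E11000 line automatically. The following is a minimal sketch, assuming the 2.4-style message format shown in this log; the function name `parse_dup_key` and the regular expression are illustrative and not part of MongoDB, and the sample line is a shortened copy of the log entry above.

```python
import re

# Assumed format: the 2.4-style E11000 message as it appears in this ticket's log.
E11000 = re.compile(
    r"E11000 duplicate key error index: "
    r"(?P<db>[^.\s]+)\.(?P<coll>\S+)\.\$(?P<index>\S+)\s+"
    r"dup key: \{ : (?P<key>.+?) \}"
)


def parse_dup_key(line):
    """Return namespace, index name, and duplicate key from a log line, or None."""
    m = E11000.search(line)
    if m is None:
        return None
    return {
        "namespace": f"{m['db']}.{m['coll']}",
        "index": m["index"],
        # Drop the surrounding quotes that the log puts around string keys.
        "key": m["key"].strip('"'),
    }


# Shortened copy of the log line from this ticket (the oplog entry is cut off).
sample = (
    "Tue Jan 14 09:40:30.942 [repl writer worker 3] ERROR: writer worker "
    "caught exception: E11000 duplicate key error index: "
    'competition.competition.$listingid_1  dup key: { : "2kKN9CzZ" } on: { ts: ...'
)

if __name__ == "__main__":
    print(parse_dup_key(sample))
```

For the line in this ticket, the parsed fields point at the unique index `listingid_1` on `competition.competition` and the key `2kKN9CzZ`.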



 Comments   
Comment by Ramon Fernandez Marina [ 02/Jun/14 ]

stiwan, thanks for letting us know. I'm happy to hear that 2.4.10 is working well for you, and I'm going to close this ticket as "gone away". If you run into any other issues please feel free to open new tickets.

Regards,
Ramón.

Comment by stiwan chinazki [ 31/May/14 ]

Hi Ramon,

This crash occurred only once, but I thought it was a good idea to report the crash log.
I'm running 2.4.10 now and I haven't had any crashes with it.

Comment by Ramon Fernandez Marina [ 30/May/14 ]

Have you tried upgrading to a later version? There have been many fixes in the replication code since 2.4.8, so we'd recommend an upgrade to the latest stable version (2.4.10 is the latest in the 2.4 series).

Note that if the secondaries continue to fail, with or without upgrading, you may need to perform an initial sync of the failing secondaries.

Can you please let us know if this continues to be an issue, and whether the re-sync/upgrade helps?
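
Before re-syncing, it may also be worth checking whether a member really holds more than one document with the same `listingid`. The following is a minimal sketch, assuming the documents are available as plain dicts (for example, read through a driver cursor); the function name `find_duplicate_values` is hypothetical and the sample documents are made up for illustration, not taken from the reporter's data.

```python
from collections import defaultdict


def find_duplicate_values(docs, field):
    """Map each value of `field` that occurs more than once to the _ids holding it."""
    seen = defaultdict(list)
    for doc in docs:
        if field in doc:
            seen[doc[field]].append(doc.get("_id"))
    return {value: ids for value, ids in seen.items() if len(ids) > 1}


# Illustrative documents only; the _id values are placeholders.
docs = [
    {"_id": 1, "listingid": "2kKN9CzZ"},
    {"_id": 2, "listingid": "aaaa"},
    {"_id": 3, "listingid": "2kKN9CzZ"},
]

if __name__ == "__main__":
    print(find_duplicate_values(docs, "listingid"))
```

An empty result on a member would suggest that the unique index on `listingid` is consistent there.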

Comment by stiwan chinazki [ 14/Jan/14 ]

Could you please remove the IP address and hostname from the log? Thank you.
