Core Server / SERVER-10905

Duplicate key error killed all secondaries on cluster


Details

    • Type: Bug
    • Resolution: Cannot Reproduce
    • Priority: Critical - P2
    • Fix Version/s: None
    • Affects Version/s: 2.4.6
    • Component/s: Replication
    • Labels: None
    • Environment: ubuntu 12.04 on aws
    • Operating System: Linux

    Description

      We run a 3-node replica set. Two of the nodes are configured with the same priority (1), while the third is configured with priority 0 so it is never promoted (it's the node we use for backups).
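
      For reference, a configuration equivalent to the setup described above would be initiated roughly like this from the mongo shell (the set name and hostnames here are hypothetical placeholders):

      rs.initiate({
        _id: "rs0",
        members: [
          { _id: 0, host: "node1:27017", priority: 1 },
          { _id: 1, host: "node2:27017", priority: 1 },
          // priority: 0 makes this member ineligible for election;
          // this is the node used for backups
          { _id: 2, host: "backup1:27017", priority: 0 }
        ]
      })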

      As of last night, the two secondaries have been failing continuously during writes, with the following error logged on both servers:

      Wed Sep 25 05:28:16.211 [repl writer worker 2] ERROR: writer worker caught exception: E11000 duplicate key error index: marketshare.Application.$name_1 dup key: { : "TestingRDS" } on: { ts: Timestamp 1380112101000|1, h: -5880003146554201345, v: 2, op: "u", ns: "marketshare.Application", o2: { _id: ObjectId('5242be375f6ffb0c37145385') }, o: { $set: { boxes.0: { box_type_name: "OPTIMIZERMS-APPV4", updated: "2013-09-25 12:28:22.855168", <rest of object>...} }}

      Wed Sep 25 05:28:16.211 [repl writer worker 2] Fatal Assertion 16360
      0xdddd81 0xd9dc13 0xc26bfc 0xdab721 0xe26609 0x7ff9f4923e9a 0x7ff9f3c36ccd
      /usr/bin/mongod(_ZN5mongo15printStackTraceERSo+0x21) [0xdddd81]
      /usr/bin/mongod(_ZN5mongo13fassertFailedEi+0xa3) [0xd9dc13]
      /usr/bin/mongod(_ZN5mongo7replset14multiSyncApplyERKSt6vectorINS_7BSONObjESaIS2_EEPNS0_8SyncTailE+0x12c) [0xc26bfc]
      /usr/bin/mongod(_ZN5mongo10threadpool6Worker4loopEv+0x281) [0xdab721]
      /usr/bin/mongod() [0xe26609]
      /lib/x86_64-linux-gnu/libpthread.so.0(+0x7e9a) [0x7ff9f4923e9a]
      /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d) [0x7ff9f3c36ccd]
      Wed Sep 25 05:28:16.215 [repl writer worker 2]

      ***aborting after fassert() failure

      Wed Sep 25 05:28:16.215 Got signal: 6 (Aborted).

      Wed Sep 25 05:28:16.219 Backtrace:
      0xdddd81 0x6d0d29 0x7ff9f3b794a0 0x7ff9f3b79425 0x7ff9f3b7cb8b 0xd9dc4e 0xc26bfc 0xdab721 0xe26609 0x7ff9f4923e9a 0x7ff9f3c36ccd
      /usr/bin/mongod(_ZN5mongo15printStackTraceERSo+0x21) [0xdddd81]
      /usr/bin/mongod(_ZN5mongo10abruptQuitEi+0x399) [0x6d0d29]
      /lib/x86_64-linux-gnu/libc.so.6(+0x364a0) [0x7ff9f3b794a0]
      /lib/x86_64-linux-gnu/libc.so.6(gsignal+0x35) [0x7ff9f3b79425]
      /lib/x86_64-linux-gnu/libc.so.6(abort+0x17b) [0x7ff9f3b7cb8b]
      /usr/bin/mongod(_ZN5mongo13fassertFailedEi+0xde) [0xd9dc4e]
      /usr/bin/mongod(_ZN5mongo7replset14multiSyncApplyERKSt6vectorINS_7BSONObjESaIS2_EEPNS0_8SyncTailE+0x12c) [0xc26bfc]
      /usr/bin/mongod(_ZN5mongo10threadpool6Worker4loopEv+0x281) [0xdab721]
      /usr/bin/mongod() [0xe26609]
      /lib/x86_64-linux-gnu/libpthread.so.0(+0x7e9a) [0x7ff9f4923e9a]
      /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d) [0x7ff9f3c36ccd]
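
      The E11000 error above means the replicated update would have violated a unique index on name (marketshare.Application.$name_1); duplicate key errors only occur on unique indexes. Assuming that index was created in the usual way for a 2.4 deployment, its definition would be along these lines (hypothetical reconstruction from the error message):

      // Hypothetical definition of the unique index named in the error
      // (2.4-era shell syntax)
      db.Application.ensureIndex({ name: 1 }, { unique: true })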

      After rebooting both secondaries, the cluster was able to establish quorum again for about 15 minutes, but then the issue reproduced. When we logged in to the instances directly, we noticed that the indexes were correct on the primary but were completely empty on the secondary.
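
      A minimal sketch of how the index state can be compared across members (on 2.4 the shell needs setSlaveOk() before reading from a secondary; exact output will vary):

      db.getMongo().setSlaveOk()      // allow reads on a secondary
      use marketshare
      db.Application.getIndexes()     // index definitions should match the primary's
      db.Application.validate(true)   // full validation; reports entry counts per index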

      Attachments

        1. mongodb-primary.log (24.93 MB)
        2. mongodb-secondary-1.log (24.90 MB)
        3. mongodb-secondary-2.log (24.71 MB)


        People

          Assignee: Samantha Ritter (Inactive)
          Reporter: Ramiro Berrelleza (raberrel)
          Votes: 1
          Watchers: 8
