[SERVER-8636] Broken indexes Created: 20/Feb/13  Updated: 10/Dec/14  Resolved: 30/Jul/13

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: 2.2.3
Fix Version/s: None

Type: Bug Priority: Blocker - P1
Reporter: Aristarkh Zagorodnikov Assignee: Joanna Cheng
Resolution: Incomplete Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: File d1.s2.log.gz     File d2.s2.log.gz     File d3.s2.log.gz    
Operating System: ALL
Participants:

 Description   

We are having some problems with sharding; while diagnosing them, I stumbled upon this:

$ mongo cancer:27022/a
MongoDB shell version: 2.2.3
connecting to: cancer:27022/a
driveFS-2:SECONDARY> rs.slaveOk()
driveFS-2:SECONDARY> db.fs.chunks.find({files_id:NumberLong("5444002714197426176")},{data:0})
{ "_id" : ObjectId("51252590f2ac490dc4e32ec8"), "files_id" : NumberLong("5444002714197426176"), "n" : 0 }
{ "_id" : ObjectId("51252597de30b908b0d23b69"), "files_id" : NumberLong("5444002714197426176"), "n" : 0 }
driveFS-2:SECONDARY> db.system.indexes.find()
{ "v" : 1, "key" : { "_id" : 1 }, "ns" : "a.fs.chunks", "name" : "_id_" }
{ "v" : 1, "key" : { "files_id" : 1, "n" : 1 }, "unique" : true, "ns" : "a.fs.chunks", "name" : "files_id_1_n_1" }

How is this possible? There are two documents with the same { files_id, n } pair even though the files_id_1_n_1 index is declared unique.
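
(One way to compare the index definitions member by member is to connect to each mongod directly and diff the getIndexes() output; the hostnames below are placeholders, not exact transcripts:)

$ mongo d1:27022/a --eval 'rs.slaveOk(); printjson(db.fs.chunks.getIndexes())'
$ mongo d2:27022/a --eval 'rs.slaveOk(); printjson(db.fs.chunks.getIndexes())'
$ mongo d3:27022/a --eval 'rs.slaveOk(); printjson(db.fs.chunks.getIndexes())'

The member whose files_id_1_n_1 entry lacks "unique" : true is the inconsistent one.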



 Comments   
Comment by Joanna Cheng [ 30/Jul/13 ]

Hi Aristarkh,

I reviewed your logs and have some comments:

1. I can see the problems you first hit at ~23:35, which caused the secondaries to crash. As the replica set members come back up, they try to catch up but hit the same problematic transaction again.

2. I don't see the write actually going through in your logs. I wouldn't expect the write to go through on the secondaries while the unique index exists.

3. I can see the index being rebuilt around 01:40 on d3 (which is the primary at this stage?). I'm not sure what this was for. I see more errors (for the same duplicate key) as late as 08:44 on member d2.

4. Since there are only logs from the incident (but not earlier), I can't ascertain how the index inconsistency among replica set members happened in the first place.
4a. Note that there is no requirement for indexes to be the same across replica set members (e.g. you may want an extra index on one member to support a batch reporting job, but not on the other members); see the sketch below.
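
For reference (this is a general sketch, not something taken from your logs), a member-only index is typically built by taking the member out of the set, building the index while it runs as a standalone, and then rejoining it:

# restart the member without --replSet so it runs standalone
# (usually on a different port so clients and other members can't reach it; port/dbpath assumed)
$ mongod --port 37022 --dbpath /data/db
$ mongo localhost:37022/a
> db.fs.chunks.ensureIndex({ reportField: 1 })   // hypothetical reporting index
> exit
# restart with the original --replSet options; the index now exists on this member only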

As you have moved past this issue and probably no longer have any of the data from the incident, there is not much else we can advise without resorting to pure guesswork. I will resolve this ticket; however, if you have any further information related to this issue, please feel free to re-open it with additional details.

Thanks and kind regards,
Joanna

Comment by Aristarkh Zagorodnikov [ 06/May/13 ]

Hi!
We dropped the indexes and recreated them manually, dropping duplicates, and this fixed the problem. I was more interested in how this happened in the first place and thought you would also be interested. So yes, we solved the problem, but without knowing what caused the issue itself we're a bit nervous about such problems arising in the future, and as the dataset grows (7 TiB now), reindexing may become more and more problematic as a resolution.
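
(Roughly, the rebuild was along these lines; the commands below are illustrative rather than an exact transcript, using the dropDups option available in 2.2 to discard all but one document per duplicate key:)

driveFS-2:PRIMARY> db.fs.chunks.dropIndex("files_id_1_n_1")
driveFS-2:PRIMARY> db.fs.chunks.ensureIndex({ files_id: 1, n: 1 }, { unique: true, dropDups: true })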

Comment by Ian Daniel [ 06/May/13 ]

Hi Aristarkh,

It has been a couple of months since you raised this issue. What is the status of the problem now? Have you been able to resolve the problem, or are you still looking for assistance?

Kind regards,
Ian

Comment by Aristarkh Zagorodnikov [ 21/Feb/13 ]

The offending insert happened at about Feb 20 23:35 log time.

Comment by Aristarkh Zagorodnikov [ 21/Feb/13 ]

Attached logs. d3 is usually the primary (it has priority 10); d1 and d2 are secondaries. d3 had the non-unique index, while d1 and d2 had the unique one. The insert got into d3. At first d1 and d2 refused to replicate the change (and that's fine), but after a series of restarts (puppet periodically ensures the mongods are up and running), and without any user intervention (no one altered any data on d1 and d2, and since the replica set was degraded with only one member up, d3 was also a secondary and wasn't accepting any writes), the insert got through.
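
(For context, the replica set configuration is roughly the following; set name and hostnames are abbreviated/assumed here:)

driveFS-2:SECONDARY> rs.conf()
{
    "_id" : "driveFS-2",
    "members" : [
        { "_id" : 0, "host" : "d1:27022" },
        { "_id" : 1, "host" : "d2:27022" },
        { "_id" : 2, "host" : "d3:27022", "priority" : 10 }
    ]
}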

Comment by Aristarkh Zagorodnikov [ 21/Feb/13 ]

Will attach logs later today.

Comment by Aristarkh Zagorodnikov [ 21/Feb/13 ]

Yes, the index situation is as described. That node has been the primary for at least a month (it might have been demoted briefly for a reboot).

Comment by Eliot Horowitz (Inactive) [ 21/Feb/13 ]

So to be clear, the current primary has an index on ( files_id , n ) but it is non-unique?
How long has that node been the primary?
Can you send the logs from all nodes?

Comment by Aristarkh Zagorodnikov [ 20/Feb/13 ]

The only "legal" way this could happen (as I see it; I don't know enough about replication internals) is that batch application of the oplog ignores index uniqueness when inserting successive batched data that violates the uniqueness constraint. Otherwise, it looks like corruption to me (and since it happened on four different RAID arrays, it's a MongoDB error, not a hardware one).
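
(One way to see what the secondaries actually replayed is to look the offending insert up in their local oplog; a rough sketch, using the files_id from the duplicate key errors:)

driveFS-2:SECONDARY> rs.slaveOk()
driveFS-2:SECONDARY> var oplog = db.getSiblingDB("local").oplog.rs
driveFS-2:SECONDARY> oplog.find({ ns: "a.fs.chunks", op: "i", "o.files_id": NumberLong("5444002714197426176") }).forEach(printjson)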

Comment by Aristarkh Zagorodnikov [ 20/Feb/13 ]

P.S. We diagnosed this error as originating from the non-unique index on the primary, but the secondaries have the unique index; how are such inserts possible at all?

Comment by Aristarkh Zagorodnikov [ 20/Feb/13 ]

We were getting this after a series of errors like this one:

Wed Feb 20 23:35:49 [repl writer worker 1] ERROR: writer worker caught exception: E11000 duplicate key error index: a.fs.chunks.$files_id_1_n_1  dup key: { : 5444002714197426176, : 0 } on: { ts: Timestamp 1361388949000|2, h: -4785766145296858235, v: 2, op: "i", ns: "a.fs.chunks", o: { _id: ObjectId('51252597de30b908b0d23b69'), files_id: 5444002714197426176, n: 0, data: BinData } }

These errors eventually went away after a series of secondary restarts (I cannot even begin to understand how this error could go away at all).

Different machine, different replicaset:

driveFS-4:SECONDARY> db.fs.chunks.find({files_id:NumberLong("-8391055341084737536")},{data:0})
{ "_id" : ObjectId("5125258bf2ac490dc4e32eb9"), "files_id" : NumberLong("-8391055341084737536"), "n" : 0 }
{ "_id" : ObjectId("5125258ede30b908b0d23b66"), "files_id" : NumberLong("-8391055341084737536"), "n" : 0 }
driveFS-4:SECONDARY> db.system.indexes.find()
{ "v" : 1, "key" : { "_id" : 1 }, "ns" : "a.fs.chunks", "name" : "_id_" }
{ "v" : 1, "key" : { "files_id" : 1, "n" : 1 }, "unique" : true, "ns" : "a.fs.chunks", "name" : "files_id_1_n_1" }
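
(For the record, duplicates like these can be enumerated with an aggregation along these lines; shown as a sketch, and on a collection this large the $group stage may need to be run over ranges of files_id to stay within the aggregation memory limits:)

driveFS-4:SECONDARY> rs.slaveOk()
driveFS-4:SECONDARY> db.fs.chunks.aggregate([
...   { $group: { _id: { files_id: "$files_id", n: "$n" }, count: { $sum: 1 }, ids: { $push: "$_id" } } },
...   { $match: { count: { $gt: 1 } } }
... ])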
