[SERVER-13319] Secondary collection contains more objects than primary Created: 23/Mar/14  Updated: 05/May/14  Resolved: 03/Apr/14

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: 2.4.9
Fix Version/s: None

Type: Question Priority: Trivial - P5
Reporter: Lucien van Wouw Assignee: Unassigned
Resolution: Cannot Reproduce Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Participants:

 Description   

Using MongoDB 2.4.9 (database originally created under 1.8.3).

I noticed an anomaly in the object count of a collection between a replica set's primary and secondaries, even though the secondaries are not behind the primary.

Comparing with
$ db.master.chunks.stats()

parameter   Primary     Secondary 1   Secondary 2
count       6087983     6087976       6087983
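
For reference, the numbers were collected per member roughly like this (hostnames and database name are placeholders):

$ mongo secondary1.example.net:27017/mydb
> rs.slaveOk()                      // allow reads on a secondary
> db.master.chunks.count()          // document count as seen by this member
> db.master.chunks.stats().count    // same figure as reported by collection stats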

Secondary 1 has 7 documents more than the Primary and Secondary 2.
A reindex operation does not change that difference.

An export of all _id fields from secondary 1 produces a list that matches the number of expected objects: 6087983
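
One way to produce and compare such _id lists (hostnames, database name and file names are placeholders; _ids are assumed to be ObjectIds):

$ mongo primary.example.net/mydb --quiet --eval '
    db.master.chunks.find({}, {_id: 1}).sort({_id: 1})
      .forEach(function(d) { print(d._id.str); });' > ids_primary.txt
$ mongo secondary1.example.net/mydb --quiet --eval '
    rs.slaveOk();
    db.master.chunks.find({}, {_id: 1}).sort({_id: 1})
      .forEach(function(d) { print(d._id.str); });' > ids_secondary1.txt
$ wc -l ids_primary.txt ids_secondary1.txt    # should match the counts above
$ comm -3 ids_primary.txt ids_secondary1.txt  # _ids present on only one member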

What are these 7 extra objects about? System-related documents maybe? Damaged documents? How do I find out?



 Comments   
Comment by Thomas Rueckstiess [ 03/Apr/14 ]

Hi Lucien,

Thanks for letting us know. Agreed, without the logs it will be near impossible to diagnose. As requested, we're going to close the issue now. If the problem recurs, please don't hesitate to let us know.

Regards,
Thomas

Comment by Lucien van Wouw [ 26/Mar/14 ]

I traced the missing GridFS chunks. All occurred long before the events I described above. Using ObjectId(_id).getTimestamp():
ISODate("2012-12-14T11:48:37Z")
ISODate("2012-12-14T11:48:37Z")
ISODate("2012-12-15T16:07:13Z")
ISODate("2012-12-15T17:36:12Z")
ISODate("2012-12-17T00:21:52Z")
ISODate("2012-12-18T13:56:33Z")

A test:
GETting the files these chunks belong to from the Primary returns the master file intact, md5 and all. The Primary is good, and I'll resync.
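
For reference, that check amounts to something like this on the Primary, assuming the GridFS prefix is master and with a placeholder files_id:

> var f = db.master.files.findOne({_id: ObjectId("...")});      // files_id of an affected chunk (placeholder)
> var check = db.runCommand({filemd5: f._id, root: "master"});  // recompute the md5 from the stored chunks
> f.md5 === check.md5                                           // true means the chunks reassemble the file intact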

Comment by Lucien van Wouw [ 25/Mar/14 ]

I will resync
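
The plan is roughly the usual 2.4 procedure of clearing the member's dbpath and letting it perform an initial sync; paths and service name below are assumptions for my setup:

# on secondary 1, while the rest of the set is healthy
$ sudo service mongodb stop
$ sudo mv /data/db /data/db.old          # keep the old files until the sync completes
$ sudo mkdir /data/db && sudo chown mongodb:mongodb /data/db
$ sudo service mongodb start             # the member re-syncs all data from another member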

The other 14 shards show no similar anomalies. Without any concrete logs from that period, I do not think this issue can be investigated any further, unless you have a suggestion for diagnostics. Otherwise, I would like to ask you to close this issue.

Thanks

Comment by Asya Kamsky [ 24/Mar/14 ]

I'm not sure if the sequence of events explains the discrepancy, but I would agree that resyncing the secondary is the safest course of action.

Comment by Lucien van Wouw [ 24/Mar/14 ]

The table values are correct, but the description is wrong... secondary 1 has seven objects fewer: 6087976. The exported _ids from secondary 1 match that number, 6087976. That is ok.

On the 2nd of April 2013, secondary 1 had rollback data for this collection after severe network problems with a switch. There is no log data from that time; the rollback data was not used and was removed. Secondary 1 showed no errors after validating all of its collections. The stats were not compared with the Primary.
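
For reference, that validation pass looked roughly like this in the shell, run against secondary 1:

// run in the mongo shell on secondary 1
rs.slaveOk();
db.getCollectionNames().forEach(function(c) {
    var res = db.getCollection(c).validate(true);   // full validation scans data files and indexes
    print(c + ": " + (res.valid ? "ok" : "INVALID"));
});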

The writeConcern was set to majority after that, and the batch was re-submitted.

Is it prudent to re-sync secondary 1?

Comment by Asya Kamsky [ 24/Mar/14 ]

Also, the table you posted has secondary 1 with seven fewer objects, not more - can you double-check those numbers please?

Comment by Asya Kamsky [ 24/Mar/14 ]

Since the collection is called .chunks, is this a GridFS collection with chunks for a corresponding files collection?
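
If so, a rough way to check whether the extra documents are orphaned chunks (chunks whose files_id has no matching document in the files collection), assuming the prefix is master, would be something like:

// note: distinct() results are capped at 16 MB, so this sketch only suits moderate file counts
var orphaned = db.master.chunks.distinct("files_id").filter(function(id) {
    return db.master.files.count({_id: id}) === 0;
});
print(orphaned.length + " files_id value(s) without a matching files document");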
