[SERVER-31663] Inconsistent query results between primary and secondary Created: 20/Oct/17  Updated: 14/Nov/17  Resolved: 20/Oct/17

Status: Closed
Project: Core Server
Component/s: Querying, Sharding
Affects Version/s: 3.2.17
Fix Version/s: None

Type: Bug Priority: Critical - P2
Reporter: Hailin Hu Assignee: Mark Agarunov
Resolution: Duplicate Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Linux 4.9.32-15.41.amzn1.x86_64 #1 SMP Thu Jun 22 06:20:54 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
mongodb v3.2.16, v3.2.17


Issue Links:
Duplicate
duplicates SERVER-5931 Secondary reads in sharded clusters n... Closed
Operating System: ALL
Steps To Reproduce:

We cannot reproduce the issue because we do not know its root cause.
We sum the values of one attribute across documents. The result from the secondaries is 2-5% larger than the result from the primary. We suspect this means that 2-5% of chunks have a similar problem.
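
To illustrate how a duplicated document inflates such a sum, here is a minimal sketch in plain JavaScript. It assumes the summed attribute is metrics.view (the ticket does not say which attribute is summed), and the helper names sumViews and relativeExcess are hypothetical. The input values are taken from the query output in the Description below.

```javascript
// Sum one numeric attribute over a list of documents.
// Assumption: the attribute is metrics.view; missing values count as 0.
function sumViews(docs) {
  return docs.reduce(function (total, doc) {
    return total + ((doc.metrics && doc.metrics.view) || 0);
  }, 0);
}

// Fraction by which the secondary total exceeds the primary total.
function relativeExcess(primaryTotal, secondaryTotal) {
  return (secondaryTotal - primaryTotal) / primaryTotal;
}

// Documents returned for the same query under each read preference
// (only the fields needed here are kept).
var primaryDocs = [{ metrics: { view: 37 } }];
var secondaryDocs = [{ metrics: { view: 17 } }, { metrics: { view: 37 } }];

var primaryTotal = sumViews(primaryDocs);
var secondaryTotal = sumViews(secondaryDocs);
var excess = relativeExcess(primaryTotal, secondaryTotal);
```

For this single query the extra copy dominates the total; across a whole collection, where only some documents are duplicated, the same effect would produce a much smaller excess.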

Participants:

 Description   

We have a cluster built with 3 shards.

We get different query results depending on whether we read from the primary or from a secondary.

mongos> db.getMongo().setReadPref('primary')
mongos> db.stats_ads.find({   "start_time": ISODate("2017-10-17T15:00:00Z"),   "campaign_id": 1502,   "creative_id": "Gh5BZVRZ",   "placement_id": "6sbp2O-2" })
{ "_id" : ObjectId("59e61b4da332ce97955561a4"), "adgroup_id" : "6sGebf9Y", "creative_id" : "Gh5BZVRZ", "granularity" : "DAY", "placement_id" : "6sbp2O-2", "start_time" : ISODate("2017-10-17T15:00:00Z"), "advertiser_account_id" : 1694, "advertiser_id" : 7, "campaign_id" : 1502, "publisher_account_id" : 66790, "publisher_id" : 1969, "media_id" : 122787, "metrics" : { "view" : 37 } }

mongos> db.getMongo().setReadPref('secondaryPreferred')
mongos> db.stats_ads.find({   "start_time": ISODate("2017-10-17T15:00:00Z"),   "campaign_id": 1502,   "creative_id": "Gh5BZVRZ",   "placement_id": "6sbp2O-2" })
{ "_id" : ObjectId("59e61b4da332ce97955561a4"), "adgroup_id" : "6sGebf9Y", "creative_id" : "Gh5BZVRZ", "granularity" : "DAY", "placement_id" : "6sbp2O-2", "start_time" : ISODate("2017-10-17T15:00:00Z"), "advertiser_account_id" : 1694, "advertiser_id" : 7, "campaign_id" : 1502, "publisher_account_id" : 66790, "publisher_id" : 1969, "media_id" : 122787, "metrics" : { "view" : 17 } }
{ "_id" : ObjectId("59e61b4da332ce97955561a4"), "adgroup_id" : "6sGebf9Y", "creative_id" : "Gh5BZVRZ", "granularity" : "DAY", "placement_id" : "6sbp2O-2", "start_time" : ISODate("2017-10-17T15:00:00Z"), "advertiser_account_id" : 1694, "advertiser_id" : 7, "campaign_id" : 1502, "publisher_account_id" : 66790, "publisher_id" : 1969, "media_id" : 122787, "metrics" : { "view" : 37 } }

Strangely, the secondaryPreferred read returns two documents with the same ObjectId, so I queried each replica set directly.

rs0:PRIMARY> db.stats_ads.find({   "start_time": ISODate("2017-10-17T15:00:00Z"),   "campaign_id": 1502,   "creative_id": "Gh5BZVRZ",   "placement_id": "6sbp2O-2" })
(no results)

rs1:PRIMARY> db.stats_ads.find({   "start_time": ISODate("2017-10-17T15:00:00Z"),   "campaign_id": 1502,   "creative_id": "Gh5BZVRZ",   "placement_id": "6sbp2O-2" })
{ "_id" : ObjectId("59e61b4da332ce97955561a4"), "adgroup_id" : "6sGebf9Y", "creative_id" : "Gh5BZVRZ", "granularity" : "DAY", "placement_id" : "6sbp2O-2", "start_time" : ISODate("2017-10-17T15:00:00Z"), "advertiser_account_id" : 1694, "advertiser_id" : 7, "campaign_id" : 1502, "publisher_account_id" : 66790, "publisher_id" : 1969, "media_id" : 122787, "metrics" : { "view" : 17 } }

rs2:PRIMARY> db.stats_ads.find({   "start_time": ISODate("2017-10-17T15:00:00Z"),   "campaign_id": 1502,   "creative_id": "Gh5BZVRZ",   "placement_id": "6sbp2O-2" })
{ "_id" : ObjectId("59e61b4da332ce97955561a4"), "adgroup_id" : "6sGebf9Y", "creative_id" : "Gh5BZVRZ", "granularity" : "DAY", "placement_id" : "6sbp2O-2", "start_time" : ISODate("2017-10-17T15:00:00Z"), "advertiser_account_id" : 1694, "advertiser_id" : 7, "campaign_id" : 1502, "publisher_account_id" : 66790, "publisher_id" : 1969, "media_id" : 122787, "metrics" : { "view" : 37 } }

Both rs1 and rs2 hold a document with the same ObjectId, but the two copies have different values for their non-key attributes.
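
The per-shard comparison above can be automated. The following is a minimal sketch, not a MongoDB API: findDuplicatesAcrossShards is a hypothetical helper, and its input is assumed to be a plain object that maps each shard name to the documents returned when the query is run directly against that shard.

```javascript
// Group per-shard query results by _id and report every _id that is
// present on more than one shard.
// Assumption: resultsByShard has the shape { shardName: [documents...] }.
function findDuplicatesAcrossShards(resultsByShard) {
  var seen = {}; // _id as string -> [{ shard: ..., doc: ... }]
  Object.keys(resultsByShard).forEach(function (shard) {
    resultsByShard[shard].forEach(function (doc) {
      var key = String(doc._id);
      if (!seen[key]) {
        seen[key] = [];
      }
      seen[key].push({ shard: shard, doc: doc });
    });
  });
  return Object.keys(seen)
    .filter(function (key) { return seen[key].length > 1; })
    .map(function (key) { return { _id: key, copies: seen[key] }; });
}

// Values taken from the output above; only the relevant fields are kept.
var duplicates = findDuplicatesAcrossShards({
  rs0: [],
  rs1: [{ _id: "59e61b4da332ce97955561a4", metrics: { view: 17 } }],
  rs2: [{ _id: "59e61b4da332ce97955561a4", metrics: { view: 37 } }]
});
```

Here duplicates holds one entry for the shared _id, listing the copy on rs1 (view 17) and the copy on rs2 (view 37).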

I guess this is a result of chunk migration: the new copy has been moved or created, but the old one has not been deleted. Or the two copies each have their own version number, and that does not work under the secondaryPreferred read preference.
Inconsistent query results between primary and secondary are a critical issue for us. Any suggestions?
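
If the stale copy is an orphan left behind by a chunk migration, one possible remedy is the cleanupOrphaned admin command, which is run against the primary of a shard (not through mongos) and deletes documents in ranges that the shard does not own. The sketch below loops until the command no longer reports a stoppedAtKey. The wrapper cleanupAllOrphans and its runAdminCommand parameter are assumptions of this sketch, and the database name is not given in this ticket, so any namespace passed in (for example "mydb.stats_ads") is a placeholder. Because the command deletes data, it should only be tried after confirming which copy is the orphan.

```javascript
// Repeatedly issue cleanupOrphaned until no range is left to scan.
// runAdminCommand is a caller-supplied function (an assumption of this
// sketch); in the mongo shell, connected directly to a shard primary,
// it could be: function (cmd) { return db.adminCommand(cmd); }
function cleanupAllOrphans(runAdminCommand, namespace) {
  var nextKey = {}; // an empty document starts from the lowest key
  var passes = 0;
  while (nextKey) {
    var result = runAdminCommand({
      cleanupOrphaned: namespace,
      startingFromKey: nextKey
    });
    if (result.ok !== 1) {
      // Stop on failure or timeout and hand the raw result back.
      return { ok: false, passes: passes, lastResult: result };
    }
    passes += 1;
    nextKey = result.stoppedAtKey; // absent or null once the scan is done
  }
  return { ok: true, passes: passes };
}
```

Passing the command runner as a parameter keeps the loop independent of the shell, so it can be exercised with a stub before it is pointed at a real shard.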



 Comments   
Comment by Mark Agarunov [ 20/Oct/17 ]

Hello h@bulbit.jp,

Thank you for the report. Looking over the description and output you've provided, I believe this is due to the behavior detailed in SERVER-5931, where secondary reads in a sharded cluster can return orphaned or duplicated documents. Fortunately, this is marked as fixed in the upcoming MongoDB 3.6 release. As this is the same underlying issue, I've closed this ticket as a duplicate. Please see SERVER-5931 for additional information.

Thanks,
Mark

Generated at Thu Feb 08 04:27:47 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.