- Type: Bug
- Resolution: Works as Designed
- Priority: Major - P3
- None
- Affects Version/s: 3.2.11
- Component/s: Index Maintenance
- Labels: None
- ALL
I am designing an academic distributed application in which a program streams and collects tweets (via the Twitter Streaming API) and stores, in particular, profiles (author information) in a dedicated collection in my MongoDB database.
In this collection, I have a unique index applied to 2 fields.
My distributed application uses the Apache Camel framework and a RabbitMQ server. When I set the number of consumers behind my streamer to more than 1, I get duplicates in my collection. More precisely, each duplicated profile appears as two entries: an incomplete one (with numerous missing fields) and a complete one.
If I drop the unique index and try to re-create it, I get an error saying that duplicates are present in the collection.
I think it is a concurrent access problem, since the collected dates of the two entries in each duplicate pair are very close.
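One common way to make concurrent consumers converge on a single profile is to merge fields into the entry identified by (broadcaster, account_id) instead of inserting a new entry each time. The sketch below only models that merge in plain JavaScript with an in-memory map; `upsertProfile` and `profileStore` are hypothetical names, not part of any driver. In MongoDB the corresponding operation would presumably be an update with `upsert: true` and `$set`, which is only safe under concurrency if a unique index covers exactly those two fields, and a client may still need to retry on a duplicate key error.

```javascript
// In-memory stand-in for a collection keyed on (broadcaster, account_id).
// upsertProfile is a hypothetical helper used for illustration only.
function upsertProfile(store, doc) {
  const key = JSON.stringify([doc.broadcaster, doc.account_id]);
  const existing = store.get(key) || {};
  // Merge the new fields into the existing entry, similar in spirit to $set.
  const merged = Object.assign({}, existing, doc);
  store.set(key, merged);
  return merged;
}

const profileStore = new Map();

// First consumer writes the incomplete profile.
upsertProfile(profileStore, {
  broadcaster: "Twitter",
  account_id: "761902690985127936",
  account_type: "account",
});

// Second consumer writes additional fields for the same account.
const mergedProfile = upsertProfile(profileStore, {
  broadcaster: "Twitter",
  account_id: "761902690985127936",
  lang: "eng",
  user_account: "kkjn1966",
});

console.log(profileStore.size, Object.keys(mergedProfile));
```

With this approach the order in which the consumers write does not matter: both writes land on the same key, so one entry remains and it carries the union of the fields.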
Below is an example of a duplicate pair, followed by the indexes currently applied to the collection:
> db.profiles.find({"account_id" : "761902690985127936"})
{ "_id" : ObjectId("5b14e02f9ae95e0e31793a93"), "broadcaster" : "Twitter", "account_type" : "account", "account_id" : "761902690985127936", "collected_date" : ISODate("2018-06-04T06:46:07.361Z") }
{ "_id" : ObjectId("5b14e02f9ae95e0e31793a94"), "broadcaster" : "Twitter", "account_type" : "account", "account_id" : "761902690985127936", "collected_date" : ISODate("2018-06-04T06:46:07.361Z"), "user_id" : "761902690985127936", "lang" : "eng", "location" : "Kuala Lumpur City", "user_name" : "Kakajan Haytlyyev #FBRParty", "user_account" : "kkjn1966", "utc_offset" : "GMT+0:00", "profile_link" : "https://twitter.com/kkjn1966", "account_created_at" : NumberLong("1470486732000"), "description" : "SAY WHAT YOU MEAN AND MEAN WHAT YOU SAY. \nPolitical Correctness Is Not Allowed\n#Resistance #ResistanceUnited #StrongerTogether", "is_verified" : false, "geo_enabled" : false, "profile_image_url" : "http://pbs.twimg.com/profile_images/992869258337042432/xYbRXlgO.jpg", "profile_background_image_url" : "http://abs.twimg.com/images/themes/theme1/bg.png", "followers_count" : 2809, "friends_count" : 2752, "listed_count" : 3, "statuses_count" : 19472, "is_contributor_enabled" : false, "is_translator" : false, "is_protected" : false }
> db.profiles
db.profiles
> db.profiles.getIndexes()
[
	{
		"v" : 1,
		"key" : {
			"_id" : 1
		},
		"name" : "_id_",
		"ns" : "documents.profiles"
	},
	{
		"v" : 1,
		"unique" : true,
		"key" : {
			"broadcaster" : 1,
			"account_id" : 1,
			"_id" : -1
		},
		"name" : "app_key",
		"ns" : "documents.profiles"
	},
	{
		"v" : 1,
		"key" : {
			"broadcaster" : 1,
			"user_account" : 1,
			"_id" : -1
		},
		"name" : "broadcaster_1_user_account_1__id_-1",
		"ns" : "documents.profiles"
	}
]
>
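Note that the `app_key` index shown above has three key fields, not two: `broadcaster`, `account_id`, and `_id`. A compound unique index enforces uniqueness on the combination of all its key fields, and `_id` is already unique per document, so every combination that includes `_id` is distinct. The following is a minimal sketch in plain JavaScript that models this behavior in memory; `makeUniqueIndex` is a hypothetical helper for illustration, not a MongoDB API, and the documents are reduced to the indexed fields.

```javascript
// In-memory model of a unique index: a key is built from the indexed fields,
// and an insert is rejected when that exact key has already been seen.
function makeUniqueIndex(fields) {
  const seen = new Set();
  return {
    insert(doc) {
      const key = JSON.stringify(fields.map((f) => doc[f]));
      if (seen.has(key)) return false; // models a duplicate key error
      seen.add(key);
      return true;
    },
  };
}

// The two documents from the report, reduced to the indexed fields.
const partialDoc = {
  _id: "5b14e02f9ae95e0e31793a93",
  broadcaster: "Twitter",
  account_id: "761902690985127936",
};
const completeDoc = {
  _id: "5b14e02f9ae95e0e31793a94",
  broadcaster: "Twitter",
  account_id: "761902690985127936",
};

// Key as defined in app_key: the trailing _id makes every key distinct.
const indexWithId = makeUniqueIndex(["broadcaster", "account_id", "_id"]);
const resultsWithId = [indexWithId.insert(partialDoc), indexWithId.insert(completeDoc)];

// Key restricted to the two application fields.
const indexWithoutId = makeUniqueIndex(["broadcaster", "account_id"]);
const resultsWithoutId = [indexWithoutId.insert(partialDoc), indexWithoutId.insert(completeDoc)];

console.log(resultsWithId, resultsWithoutId);
```

In this model both documents are accepted when `_id` is part of the key, and the second one is rejected when the key holds only `broadcaster` and `account_id`. If the intent is one profile per (broadcaster, account_id), the unique index would presumably need to be defined on those two fields alone, after the existing duplicates are removed.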