Details
Bug
Resolution: Works as Designed
Major - P3
None
3.2.11
None
ALL
Description
I am designing an academic distributed application in which one program streams and collects tweets (via the Twitter Streaming API) and stores the authors' profiles (author information) in a dedicated collection in my MongoDB database.
This collection has a unique index applied on 2 fields.
My distributed application uses the Apache Camel framework and a RabbitMQ server. When I set the number of consumers behind my streamer to more than 1, I get duplicates in my collection. More precisely, each duplicated profile appears twice: once as an incomplete entry (with numerous missing fields) and once as a complete entry.
If I drop the unique index and try to re-create it, I get an error saying that duplicates are present in the collection.
I think this is a concurrent access problem, since the collected dates of the duplicate entries are very close.
Below is an example of duplicates, together with the indexes currently applied on the collection:
> db.profiles.find({"account_id" : "761902690985127936"})
{ "_id" : ObjectId("5b14e02f9ae95e0e31793a93"), "broadcaster" : "Twitter", "account_type" : "account", "account_id" : "761902690985127936", "collected_date" : ISODate("2018-06-04T06:46:07.361Z") }
{ "_id" : ObjectId("5b14e02f9ae95e0e31793a94"), "broadcaster" : "Twitter", "account_type" : "account", "account_id" : "761902690985127936", "collected_date" : ISODate("2018-06-04T06:46:07.361Z"), "user_id" : "761902690985127936", "lang" : "eng", "location" : "Kuala Lumpur City", "user_name" : "Kakajan Haytlyyev #FBRParty", "user_account" : "kkjn1966", "utc_offset" : "GMT+0:00", "profile_link" : "https://twitter.com/kkjn1966", "account_created_at" : NumberLong("1470486732000"), "description" : "SAY WHAT YOU MEAN AND MEAN WHAT YOU SAY. \nPolitical Correctness Is Not Allowed\n#Resistance #ResistanceUnited #StrongerTogether", "is_verified" : false, "geo_enabled" : false, "profile_image_url" : "http://pbs.twimg.com/profile_images/992869258337042432/xYbRXlgO.jpg", "profile_background_image_url" : "http://abs.twimg.com/images/themes/theme1/bg.png", "followers_count" : 2809, "friends_count" : 2752, "listed_count" : 3, "statuses_count" : 19472, "is_contributor_enabled" : false, "is_translator" : false, "is_protected" : false }
> db.profiles
db.profiles
> db.profiles.getIndexes()
[
    {
        "v" : 1,
        "key" : {
            "_id" : 1
        },
        "name" : "_id_",
        "ns" : "documents.profiles"
    },
    {
        "v" : 1,
        "unique" : true,
        "key" : {
            "broadcaster" : 1,
            "account_id" : 1,
            "_id" : -1
        },
        "name" : "app_key",
        "ns" : "documents.profiles"
    },
    {
        "v" : 1,
        "key" : {
            "broadcaster" : 1,
            "user_account" : 1,
            "_id" : -1
        },
        "name" : "broadcaster_1_user_account_1__id_-1",
        "ns" : "documents.profiles"
    }
]
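One possible reading of the index listing above: the unique index `app_key` covers three fields (`broadcaster`, `account_id` and `_id`), not two. A unique compound index enforces uniqueness on the combination of its key values, and `_id` already differs for every document, so such an index would not reject two profiles that share `broadcaster` and `account_id`. The sketch below illustrates this in plain JavaScript, without a MongoDB server. The helpers `indexKey` and `hasDuplicateKeys` are hypothetical names introduced only for this illustration, and the documents are reduced to the indexed fields, with the ObjectId values written as plain strings.

```javascript
// The two documents from the find() output, reduced to the indexed fields.
// ObjectId values are represented as plain strings for this illustration.
const docs = [
  { _id: "5b14e02f9ae95e0e31793a93", broadcaster: "Twitter", account_id: "761902690985127936" },
  { _id: "5b14e02f9ae95e0e31793a94", broadcaster: "Twitter", account_id: "761902690985127936" },
];

// Hypothetical helper: builds the tuple of values that a compound index
// on `fields` would compare for one document.
function indexKey(doc, fields) {
  return JSON.stringify(fields.map((field) => doc[field]));
}

// Hypothetical helper: returns true if at least two documents produce
// the same key, which is what a unique index would refuse.
function hasDuplicateKeys(documents, fields) {
  const seen = new Set();
  for (const doc of documents) {
    const key = indexKey(doc, fields);
    if (seen.has(key)) {
      return true;
    }
    seen.add(key);
  }
  return false;
}

// Key of the existing "app_key" index: includes _id, so keys never collide.
const withId = hasDuplicateKeys(docs, ["broadcaster", "account_id", "_id"]);

// Key restricted to the two application fields: the two documents collide.
const withoutId = hasDuplicateKeys(docs, ["broadcaster", "account_id"]);

console.log(withId, withoutId); // prints: false true
```

Under this reading, a unique index limited to the two application fields would be declared in the mongo shell as `db.profiles.createIndex({ "broadcaster" : 1, "account_id" : 1 }, { "unique" : true })`. Creating it is expected to fail while duplicates such as the pair above remain in the collection, which would match the error described earlier.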