[SERVER-30708] _id index returning more than one document with same _id in aggregations and counts. Created: 17/Aug/17 Updated: 23/Jul/18 Resolved: 20/Jun/18 |
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Aggregation Framework, Sharding |
| Affects Version/s: | 3.4.6 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Critical - P2 |
| Reporter: | Lucas | Assignee: | Kelsey Schubert |
| Resolution: | Done | Votes: | 3 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Environment: | Ubuntu 14.04 |
| Operating System: | ALL |
| Steps To Reproduce: | I don't know. This was found by one of my employees and I don't know how to explain it to my customers or even to my colleagues. |
| Description |
I've found a strange behavior in one of my sharded clusters. I will summarize this report because I really don't know how to explain it; I just need to know what kind of data is needed to debug this behavior and help fix it asap, because it is impacting some of my customers.

Basically, I have a sharded cluster in production with 7 shards with a replication factor of 3, 6 mongos and 3 config servers. All those nodes were created using binary version 3.4.4, but they have already passed through different versions (trying to escape another bug): 3.4.0, 3.4.3 and 3.4.6. We are currently on version 3.4.6.

My shard key is a hashed index on a custom string field named "pid". This field is basically an object identifier, but it can repeat a thousand times, for example. My database has 500 million documents. All insertions in this database are performed by applications using Java driver version 3.4.2. We had some old applications using Java driver 3.2.2 many months ago, but this cluster was created just a month ago and I really don't believe those applications added any documents to it. The example above is about one document added to the database with Java driver 3.4.2.

Explaining the problem with commands: a find in the mongo shell returns only one document, while a count with the same predicate returns two.

What is really impacting my customers is that if I execute the aggregation shown below in my application, some documents are returned twice. In one MongoDB GUI we can see the same result, but in others the document doesn't return twice (pictures aggregation_*).
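The exact commands are not shown here; a minimal sketch of the kind of commands being compared, with a hypothetical namespace and _id value, would be:

```
// Hypothetical namespace and _id, for illustration only.

// find() by _id returns a single document:
db.events.find({ _id: ObjectId("59922f1bf1dd1a0c28478099") })

// count() with the same predicate reports 2: the orphaned copy on the
// other shard is counted as well:
db.events.count({ _id: ObjectId("59922f1bf1dd1a0c28478099") })

// An aggregation whose $match does not include the shard key may also
// return the document twice:
db.events.aggregate([
  { $match: { _id: ObjectId("59922f1bf1dd1a0c28478099") } }
])
```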
| Comments |
| Comment by Kelsey Schubert [ 21/Mar/18 ] |
Hi lucasoares, To clarify, as there are a few different issues going on here:
Starting in MongoDB 4.0, I would expect that any count command performed with a predicate would return accurate results regardless of whether it is executed against a primary or secondary. Please let me know if you have any additional questions or if this ticket can be resolved. Thank you,
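A commonly suggested workaround before 4.0 is to count through the aggregation pipeline rather than the count command, since the pipeline is expected to apply shard filtering on the primary; a sketch with a placeholder namespace and predicate:

```
// Placeholder namespace and predicate. The $group stage counts matching
// documents after shard filtering, so an orphaned copy is not double-counted:
db.events.aggregate([
  { $match: { _id: ObjectId("59922f1bf1dd1a0c28478099") } },
  { $group: { _id: null, n: { $sum: 1 } } }
])
```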
| Comment by Lucas [ 20/Mar/18 ] |
I read on
If so,
| Comment by Lucas [ 20/Mar/18 ] |
Sorry for the delay. Yes, you are correct. We are getting wrong reads in counts/finds/aggregations on secondaries (you said it is fine now in 3.6) and wrong count results on primaries. Both are tracked in the issues you mentioned before. Thanks.
| Comment by Kelsey Schubert [ 16/Feb/18 ] |
Hi lucasoares, Thank you for the additional details. The behavior you describe in your latest comment is expected and tracked in
In your original ticket description, you mention that aggregation results on the primary may return duplicated documents. My understanding is that you haven't observed this behavior since the original report. Is this correct? Thanks again,
| Comment by Lucas [ 06/Feb/18 ] |
explain_primary_find.json
Hello kelsey.schubert. The aggregation appears to have the wrong results only in secondary reads. We are constantly facing this issue here. We now have a customized migration window (only from 00:00 to 06:00) to minimize this problem, but our customers are complaining about wrong results in finds and aggregations when reading from secondaries. I will provide several outputs again, both for explain and for sharding status. To find documents with duplicated _id values, I executed this query:
I found many documents, but I will use this object to explain and execute the queries:
I will use a 'mongos' to connect and execute all operations. Count on the primary results in two documents. Find on the primary results in one document. Aggregation on the primary results in one document. Files uploaded:
Additional info:
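The query itself is not reproduced above; one plausible shape for locating _id values that occur more than once when orphans are visible (namespace is a placeholder, and this is only a sketch of the idea) is:

```
// Group by _id and keep only values seen more than once. allowDiskUse is
// required for a large collection; only duplicates that the pipeline actually
// returns (e.g. under secondary reads) will show up here.
db.events.aggregate([
  { $group: { _id: "$_id", n: { $sum: 1 } } },
  { $match: { n: { $gt: 1 } } },
  { $limit: 10 }
], { allowDiskUse: true })
```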
| Comment by Kelsey Schubert [ 26/Dec/17 ] |
Hi lucasoares, Thank you for providing this information; we're tracking the issue you're observing with the counts in
Thanks,
| Comment by Lucas [ 21/Dec/17 ] |
Hello. I found duplicated documents here and I will execute find, count and aggregation (and their respective explains) with primary and secondary read preference. To remember: the results are in the uploaded file called "queries.7z". All queries were executed against the mongos with secondary and primary read preference. This time only counts returned wrong results with primary read preference. I really don't know why. Thanks
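The uploaded queries are not shown here; the general shape of running the same operations under both read preferences from a mongos shell session (namespace and _id are placeholders) is roughly:

```
// Primary read preference (the default through a mongos):
db.getMongo().setReadPref("primary")
db.events.find({ _id: ObjectId("59922f1bf1dd1a0c28478099") }).explain("executionStats")
db.events.count({ _id: ObjectId("59922f1bf1dd1a0c28478099") })
db.events.aggregate([{ $match: { _id: ObjectId("59922f1bf1dd1a0c28478099") } }])

// Secondary read preference:
db.getMongo().setReadPref("secondaryPreferred")
db.events.find({ _id: ObjectId("59922f1bf1dd1a0c28478099") }).explain("executionStats")
db.events.count({ _id: ObjectId("59922f1bf1dd1a0c28478099") })
db.events.aggregate([{ $match: { _id: ObjectId("59922f1bf1dd1a0c28478099") } }])
```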
| Comment by Kelsey Schubert [ 21/Dec/17 ] |
Hi lucasoares, I'd like to investigate this behavior in more detail:
I assumed these queries were being run through the mongos, and so I would like the explain from the mongos with primary read preference. Please let me know if my assumption is mistaken. Thank you,
| Comment by Lucas [ 19/Dec/17 ] |
Hello kelsey.schubert, I will stop my script and wait for duplicated documents so I can execute the explain command. But just to clarify: do you want me to execute the explain command against the primary of the "wrong" replica set, or against the 'mongos' but with primary read preference? Thanks.
| Comment by Kelsey Schubert [ 14/Dec/17 ] |
Hi lucasoares, Would you please provide the explain output of the aggregation and find commands executed against the primary? While we expect that stale data may be returned from a sharded secondary, I would not expect the aggregation commands to return orphan documents on the primary. Thus far we haven't been able to reproduce this specific behavior, but the explain output may help us understand why these commands are providing different results. Thank you,
| Comment by Lucas [ 06/Dec/17 ] |
Hello, do you guys have any update on this issue? I can't add more shards to my cluster because this will cause a huge negative impact on my system. I'm thinking about an oplog reader to monitor the cluster's migration operations and then automatically remove the duplicated documents from the original shard.
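As a sketch of that idea only (namespace is a placeholder, and this is not a recommendation from the ticket): writes applied by chunk migrations are marked in the oplog with a fromMigrate flag, so a script connected to a shard primary could tail for them:

```
// Run against a shard's PRIMARY. Tails the oplog for inserts/deletes on the
// namespace that were generated by chunk migrations (fromMigrate: true).
var oplog = db.getSiblingDB("local").oplog.rs;
var cursor = oplog.find(
  { ns: "mydb.events", fromMigrate: true, op: { $in: ["i", "d"] } }
).addOption(DBQuery.Option.tailable | DBQuery.Option.awaitData);

while (!cursor.isExhausted()) {
  if (cursor.hasNext()) {
    var entry = cursor.next();
    // entry.o holds the inserted document (op "i") or the deleted _id (op "d").
    printjson({ ts: entry.ts, op: entry.op, _id: entry.o._id });
  }
}
```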
| Comment by Lucas [ 15/Nov/17 ] |
Hello again! I found quite interesting news about this issue. Last week I added two new shards to my cluster. Many customers of my analytics software were receiving wrong and inconsistent data because of the duplicated data. Basically, when we added those new shards this problem got worse, and now we know more about it. Our system has an annotation feature where any customer can apply update operations to their documents. When you asked me about secondary reads, I realized that with secondary reads even find operations return duplicated documents, while with primary reads only counts and aggregations get wrong results (find only returns correct documents). In my environment we have several application servers constantly updating documents in our database, and if such an update happens simultaneously with the balancer, a duplicate is created and makes our data inconsistent. We tried to run the cleanupOrphaned command, but it cleans those duplicates slowly and hurt our server performance. We developed an application to clean up those duplicates without this command, simply with those operations:
With that we were able to remove all duplicated documents in our database, but they keep reappearing (because we have 2 new shards still in the balancing process). I stopped the balancer and created a balancer window outside normal working hours. Unfortunately this will not fix the entire problem, since we have applications running during those hours updating our documents. This behavior is really unacceptable and may cause problems every time we try to scale our database with new shards. This problem does not occur only when we add more shards to our cluster; doing that just makes it worse.
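The custom cleanup operations themselves are not listed above; for reference, the balancer-window change and the stock cleanupOrphaned loop look roughly like this (database, collection and window values are placeholders):

```
// On a mongos: restrict the balancer to an off-peak window (placeholder times).
db.getSiblingDB("config").settings.update(
  { _id: "balancer" },
  { $set: { activeWindow: { start: "00:00", stop: "06:00" } } },
  { upsert: true }
)

// On each shard PRIMARY: walk the collection with cleanupOrphaned,
// resuming from stoppedAtKey until the whole range has been cleaned.
var nextKey = {};
while (nextKey != null) {
  var result = db.adminCommand({ cleanupOrphaned: "mydb.events", startingFromKey: nextKey });
  if (result.ok != 1) {
    printjson(result);
    break;
  }
  nextKey = result.stoppedAtKey;
}
```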
| Comment by Lucas [ 24/Oct/17 ] |
1. Hello. I connected through the mongo shell to execute those commands, and I haven't executed any command with secondary reads. 2. I executed the command 'db.adminCommand( {flushRouterConfig: 1} )' and the count still returns 2 documents. 3. I will not use the cleanup command for now because I'm in the middle of an internal database operation here at my company and can't afford to spend time on this right now, since it can impact my shard performance.
| Comment by Kelsey Schubert [ 05/Sep/17 ] |
Hi lucasoares, Thank you for the additional information, we're looking into the data you've provided.
Kind regards,
| Comment by Lucas [ 05/Sep/17 ] |
Hello. Sorry for the delay. I haven't tested this issue in every mentioned version, just the final one.
And the mongodump you requested was uploaded to the secure portal. Is there any workaround to fix this without impacting the performance of my cluster?
| Comment by Kelsey Schubert [ 24/Aug/17 ] |
Hi lucasoares, Thanks for the additional information. The orphan document may be on Repl_0, as the result of a failed migration or an unclean shutdown. In this case, the aggregation query is targeting each shard since the match stage does not contain the shard key prefix. However, despite the query being sent to each shard, we do not expect the aggregation query to return the orphan document. We're working to reproduce this issue. To help us continue to investigate, would you please answer the following questions?
I've created a secure portal for you to use to upload the dump. Thank you,
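To illustrate the targeting point with a placeholder namespace and values: a $match on _id alone cannot be routed by the hashed shard key, so the pipeline is broadcast to every shard, whereas an equality match that includes the pid prefix lets the mongos target the owning shard:

```
// Broadcast: _id is not (a prefix of) the shard key, so all shards are consulted.
db.events.aggregate([
  { $match: { _id: ObjectId("59922f1bf1dd1a0c28478099") } }
])

// Targeted: the equality match includes the hashed shard key field "pid",
// so the mongos routes the pipeline only to the shard owning that hash.
db.events.aggregate([
  { $match: { pid: "customer-123", _id: ObjectId("59922f1bf1dd1a0c28478099") } }
])
```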
| Comment by Lucas [ 22/Aug/17 ] |
Hello anonymous.user. Here you are:
Connecting directly to the primary of HIDDENClusterRepl_3 and executing queries:
I found an inconsistent shard by executing those same commands on every single shard. Even though the hash points to Repl_3, another shard has the same element. Look: Repl_0 (the inconsistent one):
Repl_1 (OK)
Repl_2 (OK)
Repl_3 (OK, with the element)
Repl_4 (OK)
Repl_5 (OK)
Repl_6 (OK)
mongos (duplication in counts and, on some occasions, in the aggregations)
How did that document get there even with a hashed 'pid' index?
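The per-shard command output is not reproduced above; the pattern being followed (hosts, namespace and _id are placeholders) is to connect to each shard's primary directly, bypassing the mongos, and run the same lookup:

```
// Repeat for the primary of each shard replica set (Repl_0 ... Repl_6).
// Connecting directly to the shard means no mongos routing or filtering applies.
var conn = new Mongo("repl0-primary.example.net:27018");
var coll = conn.getDB("mydb").getCollection("events");
coll.find({ _id: ObjectId("59922f1bf1dd1a0c28478099") }).forEach(printjson);
// A shard that returns the document here even though its hash range does not
// own the document's "pid" value is holding an orphaned copy.
```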
| Comment by Kelsey Schubert [ 21/Aug/17 ] |
Hi lucasoares, I'd like to determine whether these documents reside on the same node. To do this, would you please execute the following steps:
This information will allow us to focus our investigation into whether this behavior is the result of orphan documents. Thanks for your help,