[SERVER-32761] Missing document Created: 18/Jan/18 Updated: 27/Oct/23 Resolved: 21/Mar/18 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | 3.4.10 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Tudor Aursulesei | Assignee: | Kaloian Manassiev |
| Resolution: | Works as Designed | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Operating System: | ALL |
| Participants: |
| Description |
|
I've moved some data manually, using moveChunk, and a document is missing. I can find it by querying the shard that holds it directly, but when I run the same query from the sharded shell (through mongos), it's not there. The balancer is now disabled.
So the cluster thinks that this _id is located on rs4, when it actually resides on rs1. I know from experience that restarting everything a couple of times fixes this issue, but I'd rather not do that. |
| Comments |
| Comment by Kaloian Manassiev [ 21/Mar/18 ] | ||
|
Hi thestick613, You are right, I was confused by the ticket description. This is most likely an orphaned document on rs1: at some point in the past it was moved to rs4, but its cleanup on rs1 didn't complete, so it is still visible there. Since this behaviour is expected, I am closing this ticket as 'Works as Designed'. Best regards, | ||
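As an illustration of the cleanup mentioned above: on a 3.4 cluster, orphaned documents left behind by an incomplete range cleanup can be removed with the `cleanupOrphaned` admin command, run directly against the primary of the affected shard (not through mongos). The loop below is a minimal sketch rather than an official procedure. The helper name `cleanup_orphans`, the `max_rounds` guard, and the namespace are hypothetical, and the command is passed in as a callable so the sketch does not depend on a specific driver. With pymongo, `client.admin.command` on a connection to the shard primary could presumably be passed as `run_admin_command`.

```python
from typing import Any, Callable, Dict, Optional


def cleanup_orphans(
    run_admin_command: Callable[[Dict[str, Any]], Dict[str, Any]],
    namespace: str,
    max_rounds: int = 10_000,
) -> int:
    """Call cleanupOrphaned repeatedly until no stoppedAtKey is returned.

    run_admin_command is a hypothetical callable that sends one command
    document to the admin database of the shard primary and returns the reply.
    Returns the number of command invocations that were made.
    """
    next_key: Optional[Dict[str, Any]] = {}  # {} means "start from the lowest key"
    rounds = 0
    while next_key is not None and rounds < max_rounds:
        # The command name is the first key of the command document.
        result = run_admin_command(
            {"cleanupOrphaned": namespace, "startingFromKey": next_key}
        )
        rounds += 1
        if result.get("ok") != 1:
            raise RuntimeError(f"cleanupOrphaned failed: {result!r}")
        # Each call cleans one range; a missing/None key means nothing is left.
        next_key = result.get("stoppedAtKey")
    return rounds
```

Each call cleans a single unowned range and reports where it stopped, which is why the command has to be repeated until `stoppedAtKey` comes back empty.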
| Comment by Tudor Aursulesei [ 15/Mar/18 ] | ||
|
I've been reading about this for a while. Isn't this just an orphaned document? https://docs.mongodb.com/manual/reference/glossary/#term-orphaned-document | ||
| Comment by Kaloian Manassiev [ 09/Mar/18 ] | ||
|
Hi thestick613,
If this problem is still reproducible, can you please provide us with the following information:
Thank you in advance. Best regards, | ||
| Comment by Tudor Aursulesei [ 08/Mar/18 ] | ||
|
The script isn't very reliable, because documents disappear between the moment I look them up on the shard and the moment I look them up on the 'cluster'/mongos. I've changed it to iterate over all documents on a shard and then look each one up through mongos with find().limit(1).explain(). If the router's query plan gives me a different shard than the one I'm currently processing, I run moveChunk. I've found a few more 'zombie' documents using this approach. I'm still not sure I've fixed all the discrepancies, because documents come and go. | ||
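The check described in this comment could be sketched roughly as follows. This is a minimal sketch, not the reporter's actual script: the names `routed_shards` and `find_misrouted_ids` are hypothetical, and it assumes that the mongos explain output lists the targeted shards under `queryPlanner.winningPlan.shards[].shardName`. The explain call is passed in as a callable to keep the logic independent of the driver; with pymongo it could be something like `lambda _id: coll.find({"_id": _id}).limit(1).explain()` on a mongos connection.

```python
from typing import Any, Callable, Dict, Iterable, List


def routed_shards(explain_output: Dict[str, Any]) -> List[str]:
    """Return the shard names that a mongos explain() reports as targeted.

    Assumes the layout queryPlanner.winningPlan.shards[].shardName.
    """
    plan = explain_output.get("queryPlanner", {}).get("winningPlan", {})
    return [s["shardName"] for s in plan.get("shards", []) if "shardName" in s]


def find_misrouted_ids(
    ids_on_shard: Iterable[Any],
    current_shard: str,
    explain_by_id: Callable[[Any], Dict[str, Any]],
) -> List[Any]:
    """Return the _ids seen on current_shard that mongos routes to other shards."""
    misrouted = []
    for _id in ids_on_shard:
        targets = routed_shards(explain_by_id(_id))
        # An empty target list means the explain output was not understood;
        # skip it instead of flagging the document.
        if targets and current_shard not in targets:
            misrouted.append(_id)
    return misrouted
```

Because documents can move between the read on the shard and the explain through mongos, every _id returned here is only a candidate and should be re-checked before acting on it.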
| Comment by Tudor Aursulesei [ 06/Mar/18 ] | ||
|
No, I wasn't running with slaveOk=true. There are still some documents that I'm able to find on a shard but unable to find on the full cluster. I know that querying specific shards directly in a sharded cluster is not a good idea. I've been running a script which looks for such occurrences, and when it finds one, it runs moveChunk so that the missing document ends up on the shard where it was found. | ||
| Comment by Kaloian Manassiev [ 23/Feb/18 ] | ||
|
Hi thestick613, My apologies for the delayed response. Running moveChunk with the balancer enabled, while not recommended, should not cause any correctness problems. From the explain output, I see that the version you were using is 3.4.10. Please correct me if I am wrong (I also updated the ticket's affected version field). I have a couple of follow-up questions:
Thank you in advance. Best regards, | ||
| Comment by Tudor Aursulesei [ 18/Jan/18 ] | ||
|
I might have triggered this myself, by running moveChunk with the balancer enabled. I'm currently trying to find documents that exist on a replica set but do not exist in the sharded cluster, and I'm running moveChunk with the found _id towards the replica set that should have them. | ||
| Comment by Tudor Aursulesei [ 18/Jan/18 ] | ||
|
I've managed to fix this by issuing two new moveChunk commands in Python
I'm not sure which one of them did the trick, but it's okay now. Restarting the mongo config servers didn't do anything. |