[SERVER-45265] Incosistant sharding overhead Created: 20/Dec/19 Updated: 27/Oct/23 Resolved: 22/Dec/19 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Sharding |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Question | Priority: | Major - P3 |
| Reporter: | Moditha Hewasinghage | Assignee: | Dmitry Agranat |
| Resolution: | Community Answered | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Attachments: |
|
| Backwards Compatibility: | Fully Compatible |
| Participants: |
| Description |
|
I ran a few experiments to compare sharding vs non-sharded mongo instances. I have limited the mongod memory to 256MB in both scenarios and disabled compression. I use a synthetic dataset of 80-byte documents and change the number of documents of the collections and collect the average runtime to retrieve a random document by id. In the sharded environment I have ranged sharding on the id and the data is equally distributed among the 3 servers (no replication). Here is the result that I got. (80-2m means 2,000,000 documents of 80 bytes) As expected there is an overhead associated with sharding. However, my question is that shouldn't this overhead be a constant? why does the difference between the runtime of the sharded and the non-sharded instance is getting more with more documents? As far as I can see this overhead should be independent of the document counts. I have attached all the logs of each experiment ( each collection used different db locations) of the config, mongods and the mongos. The configurations I used is also attached |
| Comments |
| Comment by Moditha Hewasinghage [ 23/Dec/19 ] |
|
@Dimitry Agranat I think you misunderstood the graph. This bottleneck is the memory when the indexes do not fit in memory. Regardless of the value in 64 million have a look at the graph until 32 million. the gap between sharded and non-sharded at 2 million is way less than when it is 32 million. If the sharding has a constant overhead this should not happen. |
| Comment by Dmitry Agranat [ 22/Dec/19 ] |
|
modithadha88@gmail.com, based on your graph, it looks like you are hitting some bottleneck around 32 million docs, both on sharding and non-sharded instances. So the reported "overhead" looks consistent between both configurations. The SERVER project is for bugs and feature suggestions for the MongoDB server. As this ticket does not appear to be a bug, I will now close it. If you need further assistance troubleshooting, I encourage you to ask our community by posting on the mongodb-user group or on Stack Overflow with the mongodb tag. |