[SERVER-9888] GridFS should support sharding of the chunks collection with hashed shard keys Created: 10/Jun/13 Updated: 31/Jan/23 Resolved: 31/Jan/23 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Sharding |
| Affects Version/s: | 2.4.0, 2.4.4 |
| Fix Version/s: | None |
| Type: | Improvement | Priority: | Minor - P4 |
| Reporter: | Robert Moore | Assignee: | Matt Panton |
| Resolution: | Won't Do | Votes: | 8 |
| Labels: | ShardingRoughEdges, community-team, gridfs, hashed, sharded-cluster, sharding-common-backlog | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Environment: |
Sharded gridfs. |
||
| Assigned Teams: |
Sharding EMEA
|
||||||||||||||||||||
| Description |
|
I have created a sharded cluster for use with GridFS. I would like to use a hashed shard key for the chunks to ensure the data is uniformly distributed across the cluster. When I try to use the filemd5 command to validate the files, I receive an error saying the collection has to be indexed on {files_id:1} or {files_id:1, n:1}.
I think the set of acceptable indexes should be updated to include the hashed type. The stats() for the chunks collection is attached. |
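The setup described above can be sketched as two command documents. This is a minimal sketch, not the reporter's actual configuration: the namespace `test.fs.chunks`, the bucket root `fs`, and both helper names are assumptions made for illustration. With PyMongo, each document could be passed to `Database.command`; the sketch only builds the documents, so it runs without a cluster.

```python
from typing import Any, Dict


def shard_chunks_hashed(namespace: str) -> Dict[str, Any]:
    """Build the admin command that shards a chunks collection on a hashed files_id.

    The command name must be the first key; Python dicts keep insertion order.
    """
    return {"shardCollection": namespace, "key": {"files_id": "hashed"}}


def filemd5_command(files_id: Any, root: str = "fs") -> Dict[str, Any]:
    """Build the filemd5 command document for one file in the bucket named by root."""
    return {"filemd5": files_id, "root": root}


if __name__ == "__main__":
    # Hypothetical namespace and file id, used only to show the document shapes.
    print(shard_chunks_hashed("test.fs.chunks"))
    print(filemd5_command(42))
```

Per the description, running the second command against a chunks collection sharded by the first is what produces the index error.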
| Comments |
| Comment by Matt Panton [ 31/Jan/23 ] | ||||||
|
At this time the team has decided not to pursue a fix for the filemd5 command in a sharded environment, as filemd5 is no longer supported for GridFS on the server. | ||||||
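With filemd5 unsupported on the server, one option is for a client to compute the digest itself. The following is a rough sketch of that idea, not something this ticket prescribes: it assumes the chunks for a single file are already available as mappings with an `n` sequence number and a `data` payload, and that the digest is taken over the payloads in ascending `n` order. The function name `gridfs_md5` is hypothetical.

```python
import hashlib
from typing import Any, Iterable, Mapping


def gridfs_md5(chunks: Iterable[Mapping[str, Any]]) -> str:
    """Return the hex MD5 of the chunk payloads concatenated in ascending n order."""
    digest = hashlib.md5()
    # Sort by sequence number so the result does not depend on retrieval order.
    for chunk in sorted(chunks, key=lambda c: c["n"]):
        digest.update(chunk["data"])
    return digest.hexdigest()
```

Because the chunks are sorted by `n` on the client, this approach does not depend on which index or shard key the chunks collection uses.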
| Comment by Matt Kangas [ 23/Jun/14 ] | ||||||
|
Updating title to state the goal more clearly. Closing pull request per offline discussion with the developer. | ||||||
| Comment by Randolph Tan [ 28/Aug/13 ] | ||||||
|
It looks like the md5 command is failing on the shard itself since it cannot figure out the right index to use. To be more specific, it is failing on this part of the code:
Edit: showed the code instead of a GitHub link, since the file has changed since last time and the highlighted line no longer matched. | ||||||
| Comment by Stennie Steneker (Inactive) [ 19/Jun/13 ] | ||||||
|
castiel: Can you please make a separate issue in the PERL project for your pull request? I've flagged this server issue as needing review for similar changes in other drivers. Thanks, | ||||||
| Comment by Mark Burazin [ 12/Jun/13 ] | ||||||
|
Stephen, I have also made a Perl driver patch related to this change, which allows inserting into such sharded GridFS databases. Should I make a ticket in the Perl driver JIRA first, or just make a pull request referencing this ticket? Here is an example error from the Perl driver:
Thanks, | ||||||
| Comment by Robert Moore [ 12/Jun/13 ] | ||||||
|
Stephen - I wrapped the line. I can't modify the previous commit messages but the latest one has the ticket number. The contributor's agreement was done a while ago. Github user name is 'allanbank'. Rob. | ||||||
| Comment by Stennie Steneker (Inactive) [ 12/Jun/13 ] | ||||||
|
Hi Robert, Great, thanks for the pull request! Have you read our guide to Contributing to the MongoDB project? There are a few extra steps that will help this request be ready for review by our kernel team:
Regards, | ||||||
| Comment by Robert Moore [ 12/Jun/13 ] | ||||||
|
Stephen, I have created a pull request with the required changes: Rob. | ||||||
| Comment by Stennie Steneker (Inactive) [ 11/Jun/13 ] | ||||||
|
Hi Robert, The GridFS implementation as of MongoDB 2.4 only supports calculating md5 sums for fs.chunks collections sharded on either {files_id:1} or {files_id:1, n:1}. The filemd5 command is used to validate the uploaded files. I've raised Support for {files_id: hashed} seems a reasonable improvement. Regards, |
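The restriction described in the comment above can be illustrated with a small key-pattern check. This is a sketch under stated assumptions, not the server's code: the names `SUPPORTED`, `PROPOSED`, and `is_accepted` are hypothetical, and the `allow_hashed` flag only models the improvement requested in this ticket, which was not implemented.

```python
from typing import List, Mapping, Tuple, Union

KeyPattern = Mapping[str, Union[int, str]]

# Key patterns that the comment above says filemd5 supports as of MongoDB 2.4.
SUPPORTED: List[List[Tuple[str, Union[int, str]]]] = [
    [("files_id", 1)],
    [("files_id", 1), ("n", 1)],
]

# The addition requested in this ticket (hypothetical; the ticket was closed as Won't Do).
PROPOSED: List[List[Tuple[str, Union[int, str]]]] = [
    [("files_id", "hashed")],
]


def is_accepted(key: KeyPattern, allow_hashed: bool = False) -> bool:
    """Return True if the key pattern matches an allowed pattern, field order included."""
    pattern = list(key.items())
    allowed = SUPPORTED + (PROPOSED if allow_hashed else [])
    return pattern in allowed


if __name__ == "__main__":
    print(is_accepted({"files_id": 1, "n": 1}))
    print(is_accepted({"files_id": "hashed"}))
    print(is_accepted({"files_id": "hashed"}, allow_hashed=True))
```

Field order is compared deliberately, since {files_id:1, n:1} and {n:1, files_id:1} are different compound keys.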