[CSHARP-2043] Add GridFSBucketOptions.SuppressEnsureIndexes Created: 25/Sep/17 Updated: 31/Mar/22 |
|
| Status: | Backlog |
| Project: | C# Driver |
| Component/s: | GridFS |
| Affects Version/s: | 2.4.4 |
| Fix Version/s: | None |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | John Gibbons | Assignee: | Unassigned |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | driver, gridfs, sharding | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Environment: |
All |
||
| Issue Links: |
|
||||
| Backwards Compatibility: | Fully Compatible | ||||
| Description |
|
The GridFS Driver currently calls EnsureIndexes during UploadFromStream. This is normally highly convenient because it ensures that files/chunks tables are correctly set up, with zero admin setup required. However, there is one highly significant downside (plus one less important downside). Downside 1: if you have SHARDED files/chunks collections, then the EnsureIndexes routine will fail when ANY shard is unavailable (even if the write operation pertains to a shard that is perfectly fine). This is because the EnsureIndexes routine issues a "find" operation and that find operation gets broadcast to all shards - so if one shard is down, the find will fail and so the whole write will fail. In my case, users in London are unable to save to London data centre, because New York data centre is currently offline - bad! Downside 2: Running EnsureIndexes requires user to have find permission, list (and potentially create) indexes permission. In my use case, I do not want user to have find permission (they can write but cannot read). |
| Comments |
| Comment by John Gibbons [ 08/Mar/18 ] | |||||||||||||||||||||||||||||||||||
|
HI Robert, glad to see this has been scheduled. Looking forward to seeing it out in the wild and let me know if anything I can do to help. NOte: there is already a pull request with the fix included so hopefully that helps. Rgds, JG | |||||||||||||||||||||||||||||||||||
| Comment by John Gibbons [ 11/Oct/17 ] | |||||||||||||||||||||||||||||||||||
| Comment by John Gibbons [ 11/Oct/17 ] | |||||||||||||||||||||||||||||||||||
|
https://jira.mongodb.org/browse/SERVER-31511 raised to report issue. Working on AssumeIndexesExist fix now .... | |||||||||||||||||||||||||||||||||||
| Comment by Robert Stam [ 10/Oct/17 ] | |||||||||||||||||||||||||||||||||||
|
Also, if you think the server is not treating AllowPartialResults properly would you be willing to create a SERVER ticket reporting that? | |||||||||||||||||||||||||||||||||||
| Comment by Robert Stam [ 10/Oct/17 ] | |||||||||||||||||||||||||||||||||||
|
Thanks so much for all the work you have done investigating this. Based on your findings I think we should conclude that the AllowPartialResults = true approach is not going to work. So I think we should return to your original proposal. I do have one suggestion though. We try to avoid words like "Suppress" in options. Can we find another name for this new option? Perhaps options.AssumeIndexesExist? | |||||||||||||||||||||||||||||||||||
| Comment by John Gibbons [ 10/Oct/17 ] | |||||||||||||||||||||||||||||||||||
|
THIS ONE PRETTY MUCH SAME EXCEPT SETS UP 2 SHARDS BACKED BY [single mongod] REPLICA SETS. partial DOES NOT HAVE DESIRED EFFECT IF ONE MONGOD KILLED.
[mongod@BLAH PartialTest2]$ mongo localhost:40002/partial --quiet --eval 'db.test.find()' { "_id" : ObjectId("59dcc375f955bba2a5b15850"), "key" : "B" } { "_id" : ObjectId("59dcc375f90654100e5edb0e"), "key" : "Y" }[mongod@BLAH PartialTest2]$ mongo localhost:40002/partial --quiet --eval 'db.test.find().addOption(DBQuery.Option.partial)' { "_id" : ObjectId("59dcc375f955bba2a5b15850"), "key" : "B" } { "_id" : ObjectId("59dcc375f90654100e5edb0e"), "key" : "Y" }[mongod@BLAH PartialTest2]$ kill -2 $(pgrep -a mongo | grep "40011" | awk ' {print $1}') [mongod@BLAHPartialTest2]$ mongo localhost:40002/partial --quiet --eval 'db.test.find().addOption(DBQuery.Option.partial)' Error: error: { "ok" : 0, "errmsg" : "could not find host matching read preference { mode: \"primary\", tags: [ {} ] } for set shard1", "code" : 133, "codeName" : "FailedToSatisfyReadPreference" } | |||||||||||||||||||||||||||||||||||
| Comment by John Gibbons [ 10/Oct/17 ] | |||||||||||||||||||||||||||||||||||
|
THIS ONE SETS UP 2 SHARDS BACKED DIRECTLY BY SINGLE MONGODS
Results: mongos> exit bye [mongod@BLAH PartialTest]$ kill -2 $(pgrep -a mongo | grep "30011" | awk ' {print $1}') [mongod@BLAH PartialTest]$ mongo localhost:30002/partial --quiet --eval 'db.test.find().addOption(DBQuery.Option.partial)' #should return 1 { "_id" : ObjectId("59dcb0f37950b2dc5ff66fd5"), "key" : "Y" }[mongod@BLAH PartialTest]$ mongo localhost:30002/partial --quiet --eval 'db.test.find()' #should error because no partial Error: error: { "ok" : 0, "errmsg" : "Connection refused", "code" : 6, "codeName" : "HostUnreachable" } | |||||||||||||||||||||||||||||||||||
| Comment by John Gibbons [ 10/Oct/17 ] | |||||||||||||||||||||||||||||||||||
|
OK I see what the issue is. the DBQuery.Option.partial works if the sharded cluster is a cluster of SINGLE MONGODs. Whereas it does not work if the sharded cluster is a cluster of REPLICA SETS. Attaching evidence shortly. | |||||||||||||||||||||||||||||||||||
| Comment by John Gibbons [ 10/Oct/17 ] | |||||||||||||||||||||||||||||||||||
|
Hi Robert, It's not really working for me as yet. Firstly, I added the AllowPartialResults = true in C# code, but it made no difference - that is to say it continued to report an error saying that one of the shards was down. Then I tried same from the mongo client, and got what I believe to be same result [1]. In both cases I'm executing against a mongos pointing to a config replica set which governs a 3 shard cluster. One shard I've killed the 2 mongods. I'm continuing to research this (e.g. by building up a second, completely new cluster from scratch to retest all this) - but in the meantime, any thoughts? A quick question, how does one officially set the partial results cursor option from the command line in Mongo 3.4 - In the documentation [2] it says that the cursor.addOption feature has been deprecated in shell since 3.2, and use cursor methods instead, but there is no cursor method for partial results afaik. Makes me wonder whether there are any issues around the partial cursor concept ... [1] [2] Note | |||||||||||||||||||||||||||||||||||
| Comment by Robert Stam [ 06/Oct/17 ] | |||||||||||||||||||||||||||||||||||
|
Adding AllowPartialResults = true will change the resulting error in certain scenarios. I assume that if all shards are down a FindOperation with AllowPartialResults = true will return 0 documents instead of an error. This will cause the driver to conclude the fs.files collection is empty and proceed to call EnsureIndexes, which will fail because all shards are down. Either way an error would have occurred, but the error is now different. A similar situation could arise when all the shards that are up are empty (and the shards that are down are not empty). This will also result in the driver erroneously concluding that the fs.files collection is empty. Let me know what your test reveals. And what you think about the possible change in errors reported. | |||||||||||||||||||||||||||||||||||
| Comment by John Gibbons [ 06/Oct/17 ] | |||||||||||||||||||||||||||||||||||
|
Thanks for suggestion Robert! I'd feel most comfortable to do a quick test on it just to triple check it does as you suggest and if so I'll raise a new PR (I'm assuming that is what's most convenient for you ...) Rgds, JG | |||||||||||||||||||||||||||||||||||
| Comment by Robert Stam [ 06/Oct/17 ] | |||||||||||||||||||||||||||||||||||
|
Rather than adding a new knob which must be configured I think it would be better to change the Find operation that is used to check if the files collection is empty to set AllowPartialResults to true. This will allow the Find to return without error even if one shard is down. Would you like to submit a new pull request that adds:
when creating the FindOperation here: Or just let me know if you would prefer for me to make the change for you. Thanks for submitting this. | |||||||||||||||||||||||||||||||||||
| Comment by John Gibbons [ 06/Oct/17 ] | |||||||||||||||||||||||||||||||||||
|
Pull Request Submitted | |||||||||||||||||||||||||||||||||||
| Comment by John Gibbons [ 25/Sep/17 ] | |||||||||||||||||||||||||||||||||||
|
forgot to mention, proposed solution would be as follows: 1. Create a new field in GridFSBucketOptions (and its immutable counterpart) called SuppressEnsureIndexes. This defaults to false, and thus the default behaviour remains unchanged for 100% backward compatibility. /// <summary> } private void EnsureIndexes(IReadWriteBindingHandle binding, CancellationToken cancellationToken) try |