- Type: Improvement
- Resolution: Fixed
- Priority: Unknown
- Affects Version/s: None
- Component/s: Configuration
- (copied to CRM)
- Java Drivers
- Not Needed
Product Use Case
As a user of the MongoDB Spark connector,
I want the connector's streaming offsets to work with Azure storage in the way described in its documentation.
User Impact
Currently, the implementation uses Spark's own helper to create the Hadoop configuration via `sparkContext.hadoopConfiguration()`. However, the resulting configuration does not include the `fs.azure.*` settings described in the Azure storage documentation, e.g.:
sparkConf.set("fs.azure.account.auth.type.<storage-account>.dfs.core.windows.net", "OAuth")
sparkConf.set("fs.azure.account.oauth.provider.type.<storage-account>.dfs.core.windows.net", "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider")
sparkConf.set("fs.azure.account.oauth2.client.id.<storage-account>.dfs.core.windows.net", "<application-id>")
sparkConf.set("fs.azure.account.oauth2.client.secret.<storage-account>.dfs.core.windows.net", service_credential)
sparkConf.set("fs.azure.account.oauth2.client.endpoint.<storage-account>.dfs.core.windows.net", "https://login.microsoftonline.com/<directory-id>/oauth2/token")
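To illustrate why these settings are lost: Spark only forwards SparkConf entries prefixed with `spark.hadoop.` into the Hadoop configuration (stripping the prefix), so bare `fs.azure.*` keys never arrive. A minimal sketch of that propagation rule, using plain Maps in place of SparkConf and Hadoop Configuration, with a hypothetical account name `myaccount`:

```java
import java.util.HashMap;
import java.util.Map;

public class HadoopConfPropagation {
    // Simulates how Spark derives the Hadoop configuration from SparkConf:
    // only keys starting with "spark.hadoop." are copied, with the prefix removed.
    static Map<String, String> newHadoopConfiguration(Map<String, String> sparkConf) {
        Map<String, String> hadoopConf = new HashMap<>();
        for (Map.Entry<String, String> e : sparkConf.entrySet()) {
            if (e.getKey().startsWith("spark.hadoop.")) {
                hadoopConf.put(e.getKey().substring("spark.hadoop.".length()), e.getValue());
            }
        }
        return hadoopConf;
    }

    public static void main(String[] args) {
        Map<String, String> sparkConf = new HashMap<>();
        // Set exactly as the Azure docs suggest -- no "spark.hadoop." prefix.
        sparkConf.put("fs.azure.account.auth.type.myaccount.dfs.core.windows.net", "OAuth");
        // Set with the prefix Spark actually recognizes.
        sparkConf.put("spark.hadoop.fs.defaultFS", "abfss://container@myaccount.dfs.core.windows.net");

        Map<String, String> hadoopConf = newHadoopConfiguration(sparkConf);
        // The bare fs.azure.* key is silently dropped; only the prefixed key survives.
        System.out.println(hadoopConf.containsKey("fs.azure.account.auth.type.myaccount.dfs.core.windows.net")); // false
        System.out.println(hadoopConf.containsKey("fs.defaultFS")); // true
    }
}
```

This is a simplified model of the behavior, not the connector's actual code path.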
Acceptance Criteria
Implementation Requirements
- Allow users to set any filesystem configurations prefixed with `fs.` and add them to the Hadoop configuration as if they were configured with the prefix `spark.hadoop.fs.`.
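The requirement above could be sketched as a small merge step: any connector option starting with `fs.` is copied into the Hadoop configuration verbatim. The helper name and the use of plain Maps are illustrative assumptions, not the connector's real API:

```java
import java.util.HashMap;
import java.util.Map;

public class FsOptionsSketch {
    // Hypothetical helper: copies every option whose key starts with "fs."
    // into the Hadoop configuration map, mirroring what Spark does for keys
    // written with the "spark.hadoop." prefix.
    static Map<String, String> mergeFsOptions(Map<String, String> connectorOptions,
                                              Map<String, String> hadoopConf) {
        Map<String, String> merged = new HashMap<>(hadoopConf);
        for (Map.Entry<String, String> e : connectorOptions.entrySet()) {
            if (e.getKey().startsWith("fs.")) {
                merged.put(e.getKey(), e.getValue());
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        Map<String, String> options = new HashMap<>();
        options.put("fs.azure.account.auth.type.myaccount.dfs.core.windows.net", "OAuth");
        options.put("spark.mongodb.read.connection.uri", "mongodb://localhost"); // ignored: no fs. prefix

        Map<String, String> hadoopConf = mergeFsOptions(options, new HashMap<>());
        System.out.println(hadoopConf.size()); // 1
        System.out.println(hadoopConf.containsKey("fs.azure.account.auth.type.myaccount.dfs.core.windows.net")); // true
    }
}
```

With such a rule in place, the `fs.azure.*` settings from the Azure documentation would reach the Hadoop configuration whether or not the user remembers the `spark.hadoop.` prefix.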