Type: Task
Resolution: Unresolved
Priority: Major - P3
Affects Version/s: None
Component/s: None
Python Drivers
Context
Currently, the only way to initialize MongoDB Atlas Vector Search is to pass an already-initialized collection:
vector_store = MongoDBAtlasVectorSearch(
    collection=MONGODB_COLLECTION,
    embedding=embeddings,
    index_name=ATLAS_VECTOR_SEARCH_INDEX_NAME,
    relevance_score_fn="cosine",
)
This requires the user to do setup work beforehand to initialize the collection, with steps like:
from pymongo import MongoClient
# initialize MongoDB python client
client = MongoClient(MONGODB_ATLAS_CLUSTER_URI)
DB_NAME = "langchain_test_db"
COLLECTION_NAME = "langchain_test_vectorstores"
ATLAS_VECTOR_SEARCH_INDEX_NAME = "langchain-test-index-vectorstores"
MONGODB_COLLECTION = client[DB_NAME][COLLECTION_NAME]
If we modify MongoDBAtlasVectorSearch to also accept the required parameters (collection_name, db_name, and the cluster URI, called connection_string in the example below), we can initialize the client ourselves:
vector_store = MongoDBAtlasVectorSearch(
    collection=MONGODB_COLLECTION,  # OPTIONAL
    collection_name=XXX,
    db_name=XXX,
    connection_string=XXX,
    embedding=embeddings,
    index_name=ATLAS_VECTOR_SEARCH_INDEX_NAME,
    relevance_score_fn="cosine",
)
We want to maintain backward compatibility and avoid introducing a breaking change.
Options to consider:
- 'collection' is currently a required field. Can it be made optional? The new parameters would be optional as well.
- Add another init method that takes the new parameters (and does not have the "collection" param)?
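The two options above can be sketched roughly as follows. This is a minimal illustration, not the real implementation: VectorSearchSketch, from_params, and client_factory are hypothetical names introduced here, and the signature is a simplified stand-in for MongoDBAtlasVectorSearch. The sketch assumes that mixing 'collection' with the new parameters should be rejected, which is a design choice the ticket leaves open.

```python
from typing import Any, Callable, Optional


def _default_client_factory(connection_string: str) -> Any:
    """Create a real client; pymongo is imported lazily so the sketch loads without it."""
    from pymongo import MongoClient

    return MongoClient(connection_string)


class VectorSearchSketch:
    """Hypothetical stand-in for MongoDBAtlasVectorSearch (not the real class)."""

    def __init__(
        self,
        collection: Optional[Any] = None,  # previously required, now optional
        embedding: Any = None,
        *,
        index_name: str = "default",
        relevance_score_fn: str = "cosine",
        collection_name: Optional[str] = None,  # new, optional
        db_name: Optional[str] = None,  # new, optional
        connection_string: Optional[str] = None,  # new, optional
        client_factory: Callable[[str], Any] = _default_client_factory,  # hypothetical hook
    ) -> None:
        new_params = (collection_name, db_name, connection_string)
        # Compare against None explicitly: pymongo collections do not support bool().
        if collection is not None:
            # Existing call style keeps working; ambiguous mixes are rejected.
            if any(p is not None for p in new_params):
                raise ValueError(
                    "Pass either `collection` or the connection parameters, not both."
                )
        elif all(p is not None for p in new_params):
            # New call style: create the client and look up the collection here.
            collection = client_factory(connection_string)[db_name][collection_name]
        else:
            raise ValueError(
                "Pass `collection`, or all of `collection_name`, `db_name`, "
                "and `connection_string`."
            )
        self.collection = collection
        self.embedding = embedding
        self.index_name = index_name
        self.relevance_score_fn = relevance_score_fn

    @classmethod
    def from_params(
        cls,
        connection_string: str,
        db_name: str,
        collection_name: str,
        embedding: Any = None,
        **kwargs: Any,
    ) -> "VectorSearchSketch":
        """Second option: a separate constructor that never takes `collection`."""
        return cls(
            embedding=embedding,
            collection_name=collection_name,
            db_name=db_name,
            connection_string=connection_string,
            **kwargs,
        )
```

With the first option, existing callers that pass collection are unaffected, while new callers can pass collection_name, db_name, and connection_string instead. The second option leaves the existing constructor alone and puts the new behavior behind a separate entry point.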
Definition of done
What must be done to consider the task complete?
Pitfalls
What should the implementer watch out for? What are the risks?