[Langchain] Infer dimensions from embedding if not provided


    • Python Drivers
    • Not Needed

      1. What would you like to communicate to the user about this feature?
      2. Would you like the user to see examples of the syntax and/or executable code and its output?
      3. Which versions of the driver/connector does this apply to?

      Context

      We currently require a user to know and provide the dimension of an embedding model even when we create the vector index for them.

      This can easily be inferred from the Embeddings model: n = len(embedding.embed_query("foo"))

      See https://github.com/langchain-ai/langchain-mongodb/blob/main/libs/langchain-mongodb/langchain_mongodb/vectorstores.py#L210
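
      As a minimal sketch of the inference described above: the length of any one embedded query gives the dimension. The names SupportsEmbedQuery, FixedSizeEmbeddings, and infer_dimensions are hypothetical and used only for illustration; the only interface assumed is the embed_query method already quoted in this ticket.

```python
from typing import List, Protocol


class SupportsEmbedQuery(Protocol):
    """Hypothetical stand-in for an embeddings object: only embed_query is assumed."""

    def embed_query(self, text: str) -> List[float]: ...


class FixedSizeEmbeddings:
    """Hypothetical test double that returns a zero vector of a fixed length."""

    def __init__(self, size: int) -> None:
        self.size = size

    def embed_query(self, text: str) -> List[float]:
        return [0.0] * self.size


def infer_dimensions(embedding: SupportsEmbedQuery, probe: str = "foo") -> int:
    """Embed a throwaway string and use the vector length as the dimension."""
    return len(embedding.embed_query(probe))


print(infer_dimensions(FixedSizeEmbeddings(1536)))  # 1536
```

      Note that with a real embedding model this costs one extra call to the model at construction time, which may be worth mentioning when documenting the behavior.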

      Expected Logic:

      (Relevant constructor arguments: auto_create_index, dimensions, embedding.)

      if auto_create_index == True and dimensions == -1: infer the length from the embedding model, set dimensions to it, and create the index with that value

      if auto_create_index == False and dimensions == -1:

          case 1. if NO index is found, raise an exception

          case 2. if an index IS found, check that the index's embedding size matches the embedding model's output size. If not, raise an error.
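
      The expected logic could be sketched roughly as follows. This is not the library's implementation: resolve_dimensions, existing_index_dimensions, and FixedSizeEmbeddings are hypothetical names, the index lookup is reduced to an optional integer, and returning an explicitly provided dimensions value unchanged is an assumption, since the ticket only covers the dimensions == -1 cases.

```python
from typing import List, Optional


class FixedSizeEmbeddings:
    """Hypothetical test double that returns a zero vector of a fixed length."""

    def __init__(self, size: int) -> None:
        self.size = size

    def embed_query(self, text: str) -> List[float]:
        return [0.0] * self.size


def resolve_dimensions(
    embedding,
    auto_create_index: bool,
    dimensions: int = -1,
    existing_index_dimensions: Optional[int] = None,
) -> int:
    """Return the dimension to use, following the cases listed in the ticket.

    existing_index_dimensions is a hypothetical parameter: None means no
    index was found, otherwise it is the size declared by the found index.
    """
    if dimensions != -1:
        # Assumption: an explicitly provided value is used as-is.
        return dimensions

    inferred = len(embedding.embed_query("foo"))

    if auto_create_index:
        # The caller would create the index with this inferred value.
        return inferred

    if existing_index_dimensions is None:
        # Case 1: no index and we are not allowed to create one.
        raise ValueError("No vector index found and auto_create_index is False.")

    if existing_index_dimensions != inferred:
        # Case 2: an index exists but its size disagrees with the model.
        raise ValueError(
            f"Index has {existing_index_dimensions} dimensions, "
            f"but the embedding model produces {inferred}."
        )

    return inferred
```

      For example, resolve_dimensions(FixedSizeEmbeddings(4), auto_create_index=False, existing_index_dimensions=8) would raise a ValueError for the size mismatch, while the same call with existing_index_dimensions=4 would return 4.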

      Definition of done

      Make the change shown above; pass tests; document behavior; bump version and release.

      Pitfalls

      We need to consider versioning and compatibility. This would change the default value of dimensions, so it is a breaking change.

              Assignee:
              Steve Silvester
              Reporter:
              Casey Clements

                Created:
                Updated:
                Resolved: