- Type: Task
- Resolution: Fixed
- Priority: Major - P3
- Affects Version/s: None
- Component/s: None
- Python Drivers
- Not Needed
Context
A customer using langchain-mongodb fails while generating embeddings with:
openai.RateLimitError: Error code: 429 - {'error': {'code': 'NoCapacity', 'message': 'The service is temporarily unable to process your request. Please try again later.'}}
The same code succeeds with the langchain_community library, but that library is deprecated.
The root cause is DEFAULT_INSERT_BATCH_SIZE, which is set to 100,000 in langchain-mongodb versus 100 in langchain_community. The default was raised in the past to optimize the end-to-end latency of the write path, but it needs to be tuned so that inserts work out of the box for everyone.
Help ticket: https://jira.mongodb.org/browse/HELP-72713
Internal discussion: https://mongodb.slack.com/archives/C05M73LQ5TN/p1742574079625819
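To illustrate the failure mode: each batch becomes one embedding request to the provider, so a 100,000-document default turns a large insert into one enormous request that trips the provider's capacity limits, whereas a default of 100 spreads the work across many small requests. A minimal sketch of that batching (the helper name and constant below are illustrative, not the actual langchain-mongodb internals):

```python
from typing import Iterator, List

# Illustrative value; langchain_community uses 100, langchain-mongodb used 100_000.
DEFAULT_INSERT_BATCH_SIZE = 100

def batched(texts: List[str], batch_size: int = DEFAULT_INSERT_BATCH_SIZE) -> Iterator[List[str]]:
    """Yield successive batches; each batch becomes one embedding request.

    Larger batches reduce end-to-end write latency (fewer round trips),
    but can exceed the embedding provider's per-request capacity and
    trigger HTTP 429 / NoCapacity errors.
    """
    for i in range(0, len(texts), batch_size):
        yield texts[i : i + batch_size]

docs = [f"doc-{i}" for i in range(250)]
print([len(b) for b in batched(docs)])  # → [100, 100, 50]
```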
Definition of done
Revert DEFAULT_INSERT_BATCH_SIZE to 100, surface it more prominently as an argument in every relevant method, and add inline docs/comments so users know to tune this parameter to optimize end-to-end latency.
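The definition of done could look roughly like the sketch below: a batch_size keyword argument with the reverted default and a docstring telling users when to raise it. The class and embedder here are illustrative stand-ins, not the real MongoDBAtlasVectorSearch API:

```python
from typing import Callable, Iterable, List

DEFAULT_INSERT_BATCH_SIZE = 100  # reverted from 100_000

class VectorStoreSketch:
    """Illustrative stand-in for a vector store, not the real class."""

    def __init__(self, embed: Callable[[List[str]], List[List[float]]]):
        self._embed = embed
        self.calls: List[int] = []  # batch sizes actually sent, for inspection

    def add_texts(
        self,
        texts: Iterable[str],
        *,
        batch_size: int = DEFAULT_INSERT_BATCH_SIZE,
    ) -> int:
        """Embed and insert texts in batches.

        batch_size: number of texts embedded per request. The default of
            100 favors compatibility with embedding-provider rate limits;
            increase it to reduce end-to-end write latency if your
            provider accepts larger requests.
        """
        texts = list(texts)
        inserted = 0
        for i in range(0, len(texts), batch_size):
            chunk = texts[i : i + batch_size]
            self._embed(chunk)  # one provider request per batch
            self.calls.append(len(chunk))
            inserted += len(chunk)
        return inserted

store = VectorStoreSketch(embed=lambda chunk: [[0.0] for _ in chunk])
print(store.add_texts(f"doc-{i}" for i in range(205)))  # → 205
print(store.calls)  # → [100, 100, 5]
```

Surfacing batch_size as a keyword-only argument (rather than only a module constant) lets callers tune it per insert without monkey-patching the library.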