[langchain] Update QueryMongoDBCheckerTool agent_toolkit

XMLWordPrintableJSON

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Unknown
    • None
    • Affects Version/s: None
    • Component/s: LangChain
    • None
    • None
    • Python Drivers
    • Hide

      1. What would you like to communicate to the user about this feature?
      2. Would you like the user to see examples of the syntax and/or executable code and its output?
      3. Which versions of the driver/connector does this apply to?

      Show
      1. What would you like to communicate to the user about this feature? 2. Would you like the user to see examples of the syntax and/or executable code and its output? 3. Which versions of the driver/connector does this apply to?
    • None
    • None
    • None
    • None
    • None
    • None

      Context

      apoorva.joshi@mongodb.com writes the following:

      Hi folks! I have been building a text-to-query agents course with Datacamp (launching soon!) so got a chance to extensively test our database toolkit in LangChain. I have seen this behavior a few times that I think we can improve:During query generation, field names are sometimes not wrapped in quotes. The query_checker tool does not fix this either and as a result, the query execution ultimately fails. See an example here:

      ================================== Ai Message ==================================
      Tool Calls: mongodb_query_checker (call_eSVEmcuccLdQaQvuy1srGRLU) Call ID: call_eSVEmcuccLdQaQvuy1srGRLU Args: query: db.theaters.aggregate([ { "$group": { _id: "$location.address.state", count:
      { "$sum": 1 }
      } }, { "$sort":
      { count: -1 }
      }, { "$limit": 5 } ])
      ================================= Tool Message =================================
      content='```javascript\ndb.theaters.aggregate([\n { $group: { _id: "$location.address.state", count:
      { $sum: 1 }
      } },\n { $sort:
      { count: -1 }
      },\n { $limit: 5 }\n])\n```' additional_kwargs={'refusal': None} response_metadata={'token_usage': {'completion_tokens': 56, 'prompt_tokens': 144, 'total_tokens': 200, 'completion_tokens_details':
      {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}
      , 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-4o-2024-11-20', 'system_fingerprint': 'fp_ee1d74bde0', 'id': 'chatcmpl-CDYzuhpHrOspfJIv4Ejk0Lix0Rqmc', 'service_tier': None, 'prompt_filter_results': [{'prompt_index': 0, 'content_filter_results': {'hate':
      {'filtered': False, 'severity': 'safe'}
      , 'jailbreak': {'filtered': False, 'detected': False}, 'self_harm': {'filtered': False, 'severity': 'safe'}, 'sexual': {'filtered': False, 'severity': 'safe'}, 'violence': {'filtered': False, 'severity': 'safe'}}}], 'finish_reason': 'stop', 'logprobs': None, 'content_filter_results': {'hate':
      {'filtered': False, 'severity': 'safe'}
      , 'self_harm': {'filtered': False, 'severity': 'safe'}, 'sexual': {'filtered': False, 'severity': 'safe'}, 'violence': {'filtered': False, 'severity': 'safe'}}} id='run--e6f77828-e495-47fa-8884-f51ed9709a22-0' usage_metadata={'input_tokens': 144, 'output_tokens': 56, 'total_tokens': 200, 'input_token_details':
      {'audio': 0, 'cache_read': 0}
      , 'output_token_details': {'audio': 0, 'reasoning': 0}} ================================== Ai Message ================================== Tool Calls: mongodb_query (call_rPA0Sch1Kl51q5epAfjzcsQL) Call ID: call_rPA0Sch1Kl51q5epAfjzcsQL Args: query: db.theaters.aggregate([ { "$group": { _id: "$location.address.state", count:
      { "$sum": 1 }
      } }, { "$sort":
      { count: -1 }
      }, { "$limit": 5 } ]) --------------------------------------------------------------------------- NameError Traceback (most recent call last) /usr/local/lib/python3.12/dist-packages/langchain_mongodb/agent_toolkit/database.py in _parse_command(self, command) 233 } --> 234 agg_pipeline = eval(agg_str, eval_globals) 235 if not isinstance(agg_pipeline, list): <string> in <module> NameError: name '_id' is not defined The above exception was the direct cause of the following exception: ValueError Traceback (most recent call last) 14 frames /usr/local/lib/python3.12/dist-packages/langchain_mongodb/agent_toolkit/database.py in _parse_command(self, command) 237 return agg_pipeline 238 except Exception as e: --> 239 raise ValueError(f"Failed to parse aggregation pipeline: {e}") from e 240 241 def run(self, command: str) -> Union[str, Cursor]: ValueError: Failed to parse aggregation pipeline: name '_id' is not defined
        

      I think there's two places to improve this: * Include a line about adding quotes to field and operator names in the query generation system prompt

      • Add code to check and fix this in the mongodb_checker tool

      Definition of done

      Implement her two suggested improvements, one in the LLM Prompt, and the other in 
      QueryMongoDBCheckerTool. 

      Pitfalls

      What should the implementer watch out for? What are the risks?

            Assignee:
            Unassigned
            Reporter:
            Casey Clements
            Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

              Created:
              Updated: