-
Type:
Task
-
Resolution: Unresolved
-
Priority:
Unknown
-
None
-
Affects Version/s: None
-
Component/s: AI/ML
-
None
-
Python Drivers
-
None
-
None
-
None
-
None
-
None
-
None
Context
The following is part of a conversation with Claude Code about the Code Agent Developer Experience.
A. LLMs consistently choose MotorClient over pymongo.AsyncMongoClient. How do we change this?
This is a pure training-data problem with a known shape. Motor has been the async answer in Python MongoDB code on the public internet since ~2015. AsyncMongoClient shipped in PyMongo 4.9 (late 2024). Every model whose training cutoff predates that — which is most of them, including me on cold context — saw 10+ years of "use Motor for async" and a few months of "use AsyncMongoClient." The prior is overwhelming and won't decay until the training distribution catches up.
Three levers, ordered by effect:
- Deprecate Motor publicly and loudly. Not in a footnote on the PyMongo docs — on Motor's own README, in big letters: "Motor is in maintenance mode. New code should use pymongo.AsyncMongoClient. Here's a 5-line migration." Then add a DeprecationWarning to motor.motor_asyncio.AsyncIOMotorClient._init_. Models read deprecation warnings; users complain about them; the complaints become Stack Overflow posts; the Stack Overflow posts become training data. This is the only mechanism that actually rewrites the prior.
- Migration codemod. Ship python -m pymongo.migrate_from_motor. A working tool that rewrites an existing codebase removes the "but I have a lot of Motor code" objection, which is what users tell agents, which is what agents repeat back.
- Update every official MongoDB-authored Python example to AsyncMongoClient. Every single one. Search the org's GitHub for AsyncIOMotorClient and open PRs. The langchain-mongodb repo, the docs, the university content, the blog posts, every conference talk's code repo. Today, if I retrieve current MongoDB docs and they show Motor, I conclude Motor is the right answer. Don't give me that signal.
The first one is doing the most work. Everything else amplifies it.
Definition of done
What must be done to consider the task complete?
Pitfalls
What should the implementer watch out for? What are the risks?