- Type: New Feature
- Resolution: Unresolved
- Priority: Minor - P4
- Affects Version/s: None
- Component/s: django
Context
What is a DictField and how does it fit the Python ODM zeitgeist?
A DictField is a field type that stores an arbitrary Python dict as a BSON document, without going through JSON serialization. Conceptually:
- The application works with a plain Python dict (nested structures allowed).
- The driver/ODM writes that structure directly as a MongoDB document (or sub-document) using native BSON types (e.g., ObjectId, datetime, Decimal128).
- Reads return the same shape back as a Python dict.
This pattern is already common across Python MongoDB ODMs (e.g., “dict”/“map” style fields for semi-structured or dynamic key/value data), and is used when you need flexible schema but don’t want to pay the cost or restrictions of JSON encoding/decoding or fully modeled embedded documents.
What is JSONField today and why doesn’t it suffice?
Django’s JSONField is an abstraction to represent JSON strings stored in a database and is designed around JSON semantics:
- Values are serialized via json.dumps and deserialized via json.loads.
- This path does not support MongoDB-native BSON types (e.g., ObjectId) without a custom encoder/decoder, and overriding JSONField's behavior could interfere with other database backends used in the same codebase.
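A minimal sketch of the limitation, using only the standard library. Here datetime stands in for any BSON-native type; the json module has no built-in encoder for bson.ObjectId either, so it is rejected in the same way. The helper name try_dumps is hypothetical and exists only for this illustration.

```python
import json
from datetime import datetime, timezone

# A dict that MongoDB could store natively, but plain JSON cannot represent.
doc = {"name": "widget", "created": datetime(2025, 3, 15, tzinfo=timezone.utc)}


def try_dumps(value):
    """Return the JSON string, or a description of the error if encoding fails."""
    try:
        return json.dumps(value)
    except TypeError as exc:
        return f"TypeError: {exc}"


# json.dumps has no default encoder for datetime, so this reports a TypeError.
result = try_dumps(doc)
print(result)
```

This is the failure that a custom encoder/decoder pair would have to work around for every BSON type a user might store.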
What problem does DictField solve?
DictField directly addresses:
- BSON-native storage: Store and retrieve dictionaries that contain ObjectId, datetime, and other BSON types without crashes, custom encoders, or JSON shims.
- Intuitive querying: Allow queries that align with MongoDB’s document model (dot-notation into nested dicts) instead of requiring users to reason about JSON-encoded BSON payloads.
- Better mental model: For Django + MongoDB users, DictField matches the expectation that “a field can just be a MongoDB sub-document,” in contrast to JSONField’s JSON-centric behavior.
- Reduced complexity: Avoid embedding BSON-aware behavior into JSONField (which is meant to stay JSON-only) and keep BSON-specific concerns in a MongoDB-specific field type.
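As a rough illustration of the "intuitive querying" point, the sketch below turns a Django-style double-underscore lookup into a MongoDB dot-notation filter. The function names lookup_to_mongo_path and build_filter are hypothetical, not part of django_mongodb_backend, and the sketch assumes plain key traversal only: it does not handle lookup suffixes such as exact or contains.

```python
def lookup_to_mongo_path(lookup: str) -> str:
    """Translate 'mydict__foo__bar' into the dot-notation path 'mydict.foo.bar'."""
    parts = lookup.split("__")
    if not all(parts):
        # Leading, trailing, or repeated separators leave empty segments.
        raise ValueError(f"malformed lookup: {lookup!r}")
    return ".".join(parts)


def build_filter(**lookups):
    """Build an equality-match filter document from keyword lookups."""
    return {lookup_to_mongo_path(key): value for key, value in lookups.items()}


query = build_filter(mydict__foo__bar=42)
print(query)  # {'mydict.foo.bar': 42}
```

Because the stored value is a real sub-document, the resulting filter addresses nested keys directly rather than matching against an encoded payload.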
Acceptance Criteria
Create a Technical Design document for DictField and implement DictField per the design specification.
- API & field definition
  - Introduce DictField in django_mongodb_backend.fields with a clear contract: accepts any Python dict whose values are serializable by the underlying MongoDB/BSON layer.
  - Ensure it is clearly documented as distinct from JSONField and positioned as “BSON-native dict storage.”
- Serialization / deserialization
  - Implement from_db_value, to_python, and get_prep_value so that:
    - Values are stored as BSON documents via the MongoDB backend (no JSON encoding/decoding layer).
    - Reads come back as Python dict with BSON types preserved.
  - Wire this through DatabaseOperations so DictField bypasses the JSON conversion path that JSONField uses today.
- Query behavior
  - Define/verify how ORM lookups map onto MongoDB queries:
    - Nested field lookups (e.g., mydict__foo__bar) translate to Mongo-style dot-notation.
    - Equality and containment behavior is clearly defined and tested.
  - Optionally add any DictField-specific lookups if we need behavior that diverges from existing JSONField or dict-like lookups.
- Validation, docs, and examples
  - Add validation rules that align with what the MongoDB backend can actually store (e.g., warn if values are not BSON-serializable).
  - Update documentation to:
    - Contrast DictField vs JSONField.
    - Provide examples using BSON types (ObjectId, datetime) and nested structures.
  - Add tests for:
    - Save/load round-trips of nested dicts containing BSON types.
    - Querying into nested keys.
    - Behavior alongside embedded model fields, so users understand when to pick DictField vs an embedded model.
- Migration / compatibility notes (if needed)
  - Call out that DictField is not a drop-in JSONField replacement and that existing JSONField data remains JSON-based.
  - Optionally document a manual migration pattern for users who experimented with BSON values in JSONField and want to move to DictField.
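The validation criterion above could be approached roughly as follows. This is a sketch under stated assumptions, not the design: the helper find_unserializable and the BSON_SCALARS table are hypothetical, and the table lists only standard-library types (a real implementation would also accept ObjectId, Decimal128, and the other BSON types). An alternative is to attempt bson.encode(value) and catch bson.errors.InvalidDocument, which defers to the driver's own rules.

```python
from datetime import datetime
from decimal import Decimal

# Assumed table of accepted leaf types; stdlib-only so the sketch runs anywhere.
BSON_SCALARS = (str, int, float, bool, type(None), bytes, datetime)


def _child(path, key):
    """Join a parent path and a key using dot-notation."""
    return f"{path}.{key}" if path else str(key)


def find_unserializable(value, path=""):
    """Return (path, reason) pairs for everything the assumed table rejects."""
    problems = []
    if isinstance(value, dict):
        for key, item in value.items():
            if not isinstance(key, str):
                # BSON document keys must be strings.
                problems.append((path or "<root>", f"non-string key {key!r}"))
                continue
            problems.extend(find_unserializable(item, _child(path, key)))
    elif isinstance(value, (list, tuple)):
        for index, item in enumerate(value):
            problems.extend(find_unserializable(item, _child(path, index)))
    elif not isinstance(value, BSON_SCALARS):
        problems.append((path or "<root>", f"unsupported type {type(value).__name__}"))
    return problems


bad = {"price": Decimal("1.5"), "tags": [{"x": {1, 2}}], 3: "x"}
for where, reason in find_unserializable(bad):
    print(f"{where}: {reason}")
```

Reporting the dotted path of each offending value lets the field raise or warn with a message that points at the exact nested key.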
Pitfall
- This would be a "grab-bag" field that lets folks insert arbitrary Python dictionaries. This isn't inherently bad, but we should push folks to use Embedded Models where the structure is known.
- Querying should be scoped and will most likely be limited, since there is no type safety. It may also be fine to omit querying or query validation entirely.
Additional Information
Prior Context (User Issue):
As requested on the community forum, it would be useful to have JSONField support BSON data types. Unfortunately, this may require a custom field or a change to Django's built-in JSONField.
Alternatively, I worked up a quick example using a custom decoder/encoder that appears to work to some extent:
diff --git a/django_mongodb_backend/operations.py b/django_mongodb_backend/operations.py
index cb1e93d..dcdf7fb 100644
--- a/django_mongodb_backend/operations.py
+++ b/django_mongodb_backend/operations.py
@@ -147,7 +147,9 @@ class DatabaseOperations(BaseDatabaseOperations):
         Convert dict data to a string so that JSONField.from_db_value() can
         decode it using json.loads().
         """
-        return json.dumps(value)
+        target = getattr(expression, 'target', None)
+        encoder = target.encoder if target else None
+        return json.dumps(value, cls=encoder)
 
     def convert_timefield_value(self, value, expression, connection):
         if value is not None:
diff --git a/tests/model_fields_/models.py b/tests/model_fields_/models.py
index b25b94a..b8fa3c0 100644
--- a/tests/model_fields_/models.py
+++ b/tests/model_fields_/models.py
@@ -6,6 +6,25 @@ from django_mongodb_backend.fields import ArrayField, EmbeddedModelField, Object
 from django_mongodb_backend.models import EmbeddedModel
 
 
+import json
+from bson import json_util
+
+
+class BSONEncoder(json.JSONEncoder):
+    def encode(self, obj):
+        return json_util.dumps(obj)
+
+
+class BSONDecoder(json.JSONDecoder):
+    def decode(self, obj):
+        return json_util.loads(obj)
+
+
+# JSONField
+class JSONModel(models.Model):
+    value = models.JSONField(encoder=BSONEncoder, decoder=BSONDecoder)
+
+
 # ObjectIdField
 class ObjectIdModel(models.Model):
     field = ObjectIdField()
diff --git a/tests/model_fields_/test_jsonfield.py b/tests/model_fields_/test_jsonfield.py
new file mode 100644
index 0000000..a860b7b
--- /dev/null
+++ b/tests/model_fields_/test_jsonfield.py
@@ -0,0 +1,15 @@
+from django.db.models import JSONField
+from django.test import TestCase
+from bson import ObjectId
+
+from .models import JSONModel
+
+
+class TestSaveLoad(TestCase):
+    def test_null(self):
+        object_id = ObjectId()
+        value = {"key": object_id}
+        obj = JSONModel(value=value)
+        obj.save()
+        obj.refresh_from_db()
+        self.assertEqual(obj.value, value)
For querying to work as expected, the field would require custom Django lookups that convert BSON types into the dictionary format that bson.json_util.dumps() outputs, e.g. {"$oid": "67d5f6f5188e40f99400bc15"} for ObjectId.
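The conversion such a lookup would need might look like the sketch below. The function to_extended_json is hypothetical and handles only ObjectId; other BSON types have their own extended JSON forms, which depend on the json_util options in use. The fallback ObjectId class is a stand-in so the sketch runs without PyMongo installed, and is not the real type.

```python
try:
    from bson import ObjectId  # the real type, when PyMongo is installed
except ImportError:
    class ObjectId:
        """Stand-in for bson.ObjectId, for illustration only."""

        def __init__(self, oid):
            self._oid = str(oid)

        def __str__(self):
            return self._oid


def to_extended_json(value):
    """Recursively replace ObjectId values with their {"$oid": ...} form."""
    if isinstance(value, ObjectId):
        return {"$oid": str(value)}
    if isinstance(value, dict):
        return {key: to_extended_json(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [to_extended_json(item) for item in value]
    return value


lookup_value = to_extended_json({"key": ObjectId("67d5f6f5188e40f99400bc15")})
print(lookup_value)  # {'key': {'$oid': '67d5f6f5188e40f99400bc15'}}
```

Having to rewrite every query value this way is part of the case for a BSON-native field instead of an encoder/decoder layered on JSONField.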