Add DictField to the Django-MongoDB-Backend


    • Type: New Feature
    • Resolution: Unresolved
    • Priority: Minor - P4
    • Affects Version/s: None
    • Component/s: django
    • Python Drivers

      1. What would you like to communicate to the user about this feature?
      2. Would you like the user to see examples of the syntax and/or executable code and its output?
      3. Which versions of the driver/connector does this apply to?


      Context

      What is a DictField and how does it fit the Python ODM zeitgeist?

      MongoEngine Example

      A DictField is a field type that stores an arbitrary Python dict as a BSON document, without going through JSON serialization. Conceptually:

      • The application works with a plain Python dict (nested structures allowed).
      • The driver/ODM writes that structure directly as a MongoDB document (or sub-document) using native BSON types (e.g., ObjectId, datetime, Decimal128, etc.).
      • Reads return the same shape back as a Python dict.

      This pattern is already common across Python MongoDB ODMs (e.g., “dict”/“map” style fields for semi-structured or dynamic key/value data). It is used when you need a flexible schema but don’t want the cost and restrictions of JSON encoding/decoding, or the overhead of fully modeled embedded documents.

      What is JSONField today and why doesn’t it suffice?

      Django’s JSONField is an abstraction to represent JSON strings stored in a database and is designed around JSON semantics:

      • Values are serialized via json.dumps and deserialized via json.loads.
        • JSON has no representation for MongoDB-native BSON types (e.g., ObjectId), so they cannot be stored without a custom encoder/decoder. Overriding JSONField's behavior in the backend could also interfere with other database backends used in the same codebase.
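
      The limitation above can be illustrated with the standard library alone. This is a minimal sketch, not code from the backend: datetime and Decimal stand in for BSON types (ObjectId is likewise rejected by json.dumps with a TypeError), and StringifyingEncoder is a hypothetical encoder written only for this example.

```python
import json
from datetime import datetime, timezone
from decimal import Decimal

doc = {
    "created": datetime(2025, 1, 1, tzinfo=timezone.utc),
    "price": Decimal("9.99"),
}

# Plain json.dumps rejects values that have no JSON representation.
try:
    json.dumps(doc)
    error = None
except TypeError as exc:
    error = str(exc)


class StringifyingEncoder(json.JSONEncoder):
    """Hypothetical encoder: makes the dict serializable by stringifying."""

    def default(self, o):
        if isinstance(o, datetime):
            return o.isoformat()
        if isinstance(o, Decimal):
            return str(o)
        return super().default(o)


# With a custom encoder the write succeeds, but the read side only sees
# strings: the original types are lost unless a matching decoder exists.
round_tripped = json.loads(json.dumps(doc, cls=StringifyingEncoder))
```

      Even when a custom encoder makes the write succeed, the value that comes back is a string rather than the original type, which is the round-trip gap a BSON-native field is meant to close.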

      What problem does DictField solve?

      DictField directly addresses:

      • BSON-native storage: Store and retrieve dictionaries that contain ObjectId, datetime, and other BSON types without crashes, custom encoders, or JSON shims.
      • Intuitive querying: Allow queries that align with MongoDB’s document model (dot-notation into nested dicts) instead of requiring users to reason about JSON-encoded BSON payloads.
      • Better mental model: For Django + MongoDB users, DictField matches the expectation that “a field can just be a MongoDB sub-document,” in contrast to JSONField’s JSON-centric behavior.
      • Reduced complexity: Avoid embedding BSON-aware behavior into JSONField (which is meant to stay JSON-only) and keep BSON-specific concerns in a MongoDB-specific field type.

      Acceptance Criteria

      Create a Technical Design document for DictField and implement DictField per the design specification.

      • API & field definition
        • Introduce DictField in django_mongodb_backend.fields with a clear contract: accepts any Python dict whose values are serializable by the underlying MongoDB/BSON layer.
        • Ensure it is clearly documented as distinct from JSONField and positioned as “BSON-native dict storage.”
      • Serialization / deserialization
        • Implement from_db_value, to_python, and get_prep_value so that:
          • Values are stored as BSON documents via the MongoDB backend (no JSON encoding/decoding layer).
          • Reads come back as Python dict with BSON types preserved.
        • Wire this through DatabaseOperations so DictField bypasses the JSON conversion path that JSONField uses today.
      • Query behavior
        • Define/verify how ORM lookups map onto MongoDB queries:
          • Nested field lookups (e.g., mydict__foo__bar) translate to Mongo-style dot-notation (e.g., mydict.foo.bar).
          • Equality and containment behavior is clearly defined and tested.
        • Optionally add any DictField-specific lookups if we need behavior that diverges from existing JSONField or dict-like lookups.
      • Validation, docs, and examples
        • Add validation rules that align with what the MongoDB backend can actually store (e.g., warn if values are not BSON-serializable).
        • Update documentation to:
          • Contrast DictField vs JSONField.
          • Provide examples using BSON types (ObjectId, datetime) and nested structures.
        • Add tests for:
          • Save/load round-trips of nested dicts containing BSON types.
          • Querying into nested keys.
          • Behavior alongside embedded model fields, so users understand when to pick DictField vs an embedded model.
      • Migration / compatibility notes (if needed)
        • Call out that DictField is not a drop-in JSONField replacement and that existing JSONField data remains JSON-based.
        • Optionally document a manual migration pattern for users who experimented with BSON values in JSONField and want to move to DictField.
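
      The validation criterion above (warn if values are not BSON-serializable) could be approached with a recursive walk over the dict. The following is only a sketch under stated assumptions: find_unstorable and STORABLE_SCALARS are hypothetical names, and the allow-list is a rough standard-library approximation. A real implementation would more likely defer to the BSON encoder itself and would also accept BSON types such as ObjectId and Decimal128.

```python
from datetime import datetime
from decimal import Decimal

# Hypothetical, approximate allow-list of scalar types (assumption for
# illustration only; it omits BSON-specific types such as ObjectId).
STORABLE_SCALARS = (str, int, float, bool, bytes, datetime, type(None))


def find_unstorable(value, path=""):
    """Return a list of human-readable problems found in a nested dict."""
    problems = []
    if isinstance(value, dict):
        for key, item in value.items():
            # BSON document keys must be strings.
            if not isinstance(key, str):
                problems.append(f"{path or '<root>'}: non-string key {key!r}")
                continue
            child = f"{path}.{key}" if path else key
            problems.extend(find_unstorable(item, child))
    elif isinstance(value, (list, tuple)):
        for index, item in enumerate(value):
            problems.extend(find_unstorable(item, f"{path}.{index}"))
    elif not isinstance(value, STORABLE_SCALARS):
        problems.append(f"{path}: unsupported type {type(value).__name__}")
    return problems


issues = find_unstorable({"a": {"b": Decimal("1")}, "c": [1, {2: "x"}]})
```

      Returning a list of problems instead of raising on the first one lets the field decide whether to warn or to fail validation, which the criteria above leave open.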

      Pitfall

      • This would be a "grab-bag" field that lets users store arbitrary Python dictionaries. That isn't inherently bad, but we should still push users toward Embedded Models.
      • Querying should be scoped and will most likely be limited since there is no type-safety. It may also be fine to remove any querying or query validation.
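
      One way to keep querying narrowly scoped, as suggested above, is to support only equality on a nested key path. This is a hedged sketch, not the backend's implementation: lookup_to_filter is a hypothetical helper that turns a Django-style double-underscore path into a MongoDB dot-notation filter and rejects keys that could not be addressed safely.

```python
def lookup_to_filter(lookup, value):
    """Translate e.g. 'mydict__foo__bar' into {'mydict.foo.bar': value}.

    Hypothetical helper: equality only, no operators, no type checks.
    """
    field, *keys = lookup.split("__")
    if not field or any(not key for key in keys):
        raise ValueError(f"Malformed lookup: {lookup!r}")
    for key in keys:
        # Keys starting with '$' or containing '.' would be ambiguous in
        # a dot-notation path, so this sketch refuses them.
        if key.startswith("$") or "." in key:
            raise ValueError(f"Unsupported key in lookup: {key!r}")
    return {".".join([field, *keys]): value}


query = lookup_to_filter("mydict__foo__bar", 42)
```

      Because the dict has no declared schema, this sketch cannot tell a nested key named exact or contains from a Django lookup of the same name; it treats every path segment as a key. A real design would need to settle that ambiguity.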

      Additional Information

      Prior Context (User Issue):

      As requested on the community forum, it would be useful to have JSONField support BSON data types. Unfortunately, this may require a custom field or a change to Django's built-in JSONField.

      Alternatively, I worked up a quick example using a custom decoder/encoder that appears to work to some extent:

      diff --git a/django_mongodb_backend/operations.py b/django_mongodb_backend/operations.py
      index cb1e93d..dcdf7fb 100644
      --- a/django_mongodb_backend/operations.py
      +++ b/django_mongodb_backend/operations.py
      @@ -147,7 +147,9 @@ class DatabaseOperations(BaseDatabaseOperations):
               Convert dict data to a string so that JSONField.from_db_value() can
               decode it using json.loads().
               """
      -        return json.dumps(value)
      +        target = getattr(expression, 'target', None)
      +        encoder = target.encoder if target else None
      +        return json.dumps(value, cls=encoder)
       
           def convert_timefield_value(self, value, expression, connection):
               if value is not None:
      diff --git a/tests/model_fields_/models.py b/tests/model_fields_/models.py
      index b25b94a..b8fa3c0 100644
      --- a/tests/model_fields_/models.py
      +++ b/tests/model_fields_/models.py
      @@ -6,6 +6,25 @@ from django_mongodb_backend.fields import ArrayField, EmbeddedModelField, Object
       from django_mongodb_backend.models import EmbeddedModel
       
       
      +import json
      +from bson import json_util
      +
      +
      +class BSONEncoder(json.JSONEncoder):
      +    def encode(self, obj):
      +        return json_util.dumps(obj)
      +
      +
      +class BSONDecoder(json.JSONDecoder):
      +    def decode(self, obj):
      +        return json_util.loads(obj)
      +
      +
      +# JSONField
      +class JSONModel(models.Model):
      +    value = models.JSONField(encoder=BSONEncoder, decoder=BSONDecoder)
      +
      +
       # ObjectIdField
       class ObjectIdModel(models.Model):
           field = ObjectIdField()
      diff --git a/tests/model_fields_/test_jsonfield.py b/tests/model_fields_/test_jsonfield.py
      new file mode 100644
      index 0000000..a860b7b
      --- /dev/null
      +++ b/tests/model_fields_/test_jsonfield.py
      @@ -0,0 +1,15 @@
      +from django.db.models import JSONField
      +from django.test import TestCase
      +from bson import ObjectId
      +
      +from .models import JSONModel
      +
      +
      +class TestSaveLoad(TestCase):
      +    def test_null(self):
      +        object_id = ObjectId()
      +        value = {"key": object_id}
      +        obj = JSONModel(value=value)
      +        obj.save()
      +        obj.refresh_from_db()
      +        self.assertEqual(obj.value, value)

      For querying to work as expected, the field would require custom Django lookups that convert BSON types into the dictionary format that bson.json_util.dumps() outputs, e.g. {"$oid": "67d5f6f5188e40f99400bc15"} for ObjectId.

       

       

            Assignee:
            Unassigned
            Reporter:
            Tim Graham
            Votes:
            1
            Watchers:
            4