Context
Starting with Go Driver v2, the BSON library always unmarshals to bson.D if there is no Go type information provided, including top-level documents and nested documents (GODRIVER-2819). While that solves the problems introduced by the inherently buggy "ancestor" logic (see GODRIVER-2407), it can also silently change the decoded type of nested documents when migrating from v1 to v2 (e.g. GODRIVER-3689, GODRIVER-3576). While users can configure a mongo.Client or bson.Decoder to unmarshal to a bson.M by default (or a map[string]any with GODRIVER-3697), that requires users to make global decisions about what type to unmarshal into. It would be better to provide a safe alternative to the "ancestor" logic that handles most of the common unmarshal cases.
To help those customers who were depending on the v1 behavior and either can't or don't want to reason about the unmarshal type, we should add an option to infer the unmarshal type for nested documents based on the top-level type. To avoid the bugs of the "ancestor" logic, we should limit the inferred types to a specific set:
- bson.D
- bson.M
- map[string]any
Definition of Done
- Add a method to bson.Decoder called something like InferDocumentType.
- Document that any inferred type takes precedence over the default document type set by DefaultDocumentM or DefaultDocumentMap.
- Add a corresponding bool field to options.BSONOptions with the same name and documentation.
- Add logic to getEmptyInterfaceDecodeType to handle the new inferred type. The inferred type, if available, should take precedence over the default type.
- Consider passing the top-level type via the DecodeContext, setting the type in Decoder.Decode if it's one of the allowed types. E.g:
    func (d *Decoder) Decode(val any) error {
        // ...
        rval := reflect.ValueOf(val)
        switch rval.Type() {
        case /* supported types */:
            d.dc.topLevelDocumentType = rval.Type()
            defer func() { d.dc.topLevelDocumentType = nil }()
        }
        // ...
    }

    func (...) getEmptyInterfaceDecodeType(dc DecodeContext, valueType Type) (...) {
        isDocument := valueType == Type(0) || valueType == TypeEmbeddedDocument
        if isDocument {
            // An inferred top-level type, when enabled, takes precedence over the default type.
            if d := dc.topLevelDocumentType; d != nil && dc.inferDocumentType {
                return d, nil
            }
            if dc.defaultDocumentType != nil {
                // If the bsontype is an embedded document and the DocumentType is set on the DecodeContext, then return
                // that type.
                return dc.defaultDocumentType, nil
            }
        }
        // ...
    }
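To make the intended precedence concrete, here is a small stdlib-only model of the lookup order: inferred top-level type first, then the configured default, then bson.D. It is a sketch under assumptions, not the driver's implementation. The field names mirror the proposal above, while `decodeContext`, `nestedDocumentType`, and the local `E`, `D`, and `M` types are simplified stand-ins.

```go
package main

import (
	"fmt"
	"reflect"
)

// Local stand-ins for bson.E, bson.D, and bson.M (illustration only).
type E struct {
	Key   string
	Value any
}
type D []E
type M map[string]any

var (
	tD   = reflect.TypeOf(D{})
	tM   = reflect.TypeOf(M{})
	tMap = reflect.TypeOf(map[string]any{})
)

// decodeContext is a simplified stand-in for the relevant DecodeContext state.
type decodeContext struct {
	inferDocumentType    bool         // the proposed opt-in flag
	topLevelDocumentType reflect.Type // set by Decode for allowed types only
	defaultDocumentType  reflect.Type // set by DefaultDocumentM / DefaultDocumentMap
}

// nestedDocumentType (hypothetical) picks the Go type for a nested document
// that is being decoded into an empty interface.
func nestedDocumentType(dc decodeContext) reflect.Type {
	if dc.inferDocumentType && dc.topLevelDocumentType != nil {
		return dc.topLevelDocumentType // inferred type wins
	}
	if dc.defaultDocumentType != nil {
		return dc.defaultDocumentType // then the configured default
	}
	return tD // v2 fallback when no type information exists
}

func main() {
	// Inference on, top-level M, default map[string]any.
	fmt.Println(nestedDocumentType(decodeContext{
		inferDocumentType:    true,
		topLevelDocumentType: tM,
		defaultDocumentType:  tMap,
	}))
	// Inference off: the configured default applies.
	fmt.Println(nestedDocumentType(decodeContext{
		topLevelDocumentType: tM,
		defaultDocumentType:  tMap,
	}))
	// Nothing configured: falls back to D.
	fmt.Println(nestedDocumentType(decodeContext{}))
}
```

In this model, a top-level struct would leave `topLevelDocumentType` unset, so nested documents fall through to the default or to D, which is the limitation raised in the questions below.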
Questions:
- While the solution described above handles the case where users want nested document types to always match the top-level type (e.g. GODRIVER-3689, GODRIVER-3576), it doesn't handle cases where the top-level type isn't one of the allowed types but does contain nested type information (e.g. GODRIVER-2407). Should we try to detect nested types, like fields in a struct? Or should we keep it simple and only use the top-level type?
Pitfalls
- We may add this feature and find that it's not what most users need. So far, all post-2.0 tickets have asked for nested documents to match the top-level document, but that pattern may reflect survivorship bias or some other sampling effect and may not represent broader user requirements.
- blocks
  - GODRIVER-3747 Add a BSONOptions value that provides close to v1 compatible behavior (Blocked)
- is related to
  - GODRIVER-2407 BSON Unmarshal to struct uses incorrect type alias for interface{} (Closed)
  - GODRIVER-3697 Add DefaultDocumentMap() decoder method (Closed)
  - GODRIVER-2819 Make BSON decode to bson.D if there is no type information (Closed)