Add option to infer BSON unmarshal type based on top-level input type

    • Type: New Feature
    • Resolution: Unresolved
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: BSON
    • Go Drivers

      Context

      Starting with Go Driver v2, the BSON library always unmarshals documents to bson.D when no Go type information is provided. This applies to both top-level and nested documents (GODRIVER-2819). That change solves the problems introduced by the inherently buggy "ancestor" logic (see GODRIVER-2407), but it can also silently change the decoded type of nested documents when migrating from v1 to v2 (e.g. GODRIVER-3689, GODRIVER-3576). Users can configure a mongo.Client or bson.Decoder to unmarshal to a bson.M by default (or a map[string]any with GODRIVER-3697), but that forces them to make a global decision about which type to unmarshal into. It would be better to provide a safe alternative to the "ancestor" logic that handles most of the common unmarshal cases.

      To help those customers who were depending on the v1 behavior and either can't or don't want to reason about the unmarshal type, we should add an option to infer the unmarshal type for nested documents based on the top-level type. To avoid the bugs of the "ancestor" logic, we should limit the inferred types to a specific set:

      • bson.D
      • bson.M
      • map[string]any
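
      The allow-list check above could be sketched as follows. This is a minimal, standard-library-only illustration, not driver code: D, E, and M are local stand-ins for bson.D, bson.E, and bson.M, and topLevelDocumentType is a hypothetical helper name. It also assumes the decode target is usually passed as a pointer, so the pointer is stripped before comparing.

```go
package main

import (
	"fmt"
	"reflect"
)

// Local stand-ins for bson.E, bson.D, and bson.M (assumptions for this sketch).
type E struct {
	Key   string
	Value any
}
type D []E
type M map[string]any

// The only types that may be inferred for nested documents.
var (
	tD   = reflect.TypeOf(D{})
	tM   = reflect.TypeOf(M{})
	tMap = reflect.TypeOf(map[string]any{})
)

// topLevelDocumentType (hypothetical name) returns the type to use for nested
// documents, or nil if the top-level value is not one of the allowed types.
func topLevelDocumentType(val any) reflect.Type {
	if val == nil {
		return nil
	}
	t := reflect.TypeOf(val)
	if t.Kind() == reflect.Pointer {
		t = t.Elem() // decode targets are normally pointers
	}
	switch t {
	case tD, tM, tMap:
		return t
	}
	return nil // structs and all other types are never inferred
}

func main() {
	type user struct{ Name string }
	fmt.Println(topLevelDocumentType(&D{}))
	fmt.Println(topLevelDocumentType(&M{}))
	fmt.Println(topLevelDocumentType(&map[string]any{}))
	fmt.Println(topLevelDocumentType(&user{}) == nil)
}
```

      Note that M and map[string]any are distinct Go types, so the sketch keeps them as separate entries rather than collapsing them.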

      Definition of Done

      • Add a method to bson.Decoder called something like InferDocumentType.
        • Document that any inferred type takes precedence over the default document type set by DefaultDocumentM or DefaultDocumentMap.
        • Add a corresponding bool field to options.BSONOptions with the same name and documentation.
      • Add logic to getEmptyInterfaceDecodeType to handle the new inferred type. The inferred type, if available, should take precedence over the default type.
        • Consider passing the top-level type via the DecodeContext, setting the type in Decoder.Decode if it's one of the allowed types. E.g.:
          func (d *Decoder) Decode(val any) error {
          	// ...
          	rval := reflect.ValueOf(val)
          	switch rval.Type() {
          	case /* supported types */:
          		d.dc.topLevelDocumentType = rval.Type()
          		defer func() {
          			d.dc.topLevelDocumentType = nil
          		}()
          	}
          	// ...
          }
          
          func (...) getEmptyInterfaceDecodeType(dc DecodeContext, valueType Type) (...) {
          	isDocument := valueType == Type(0) || valueType == TypeEmbeddedDocument
          	if isDocument {
          		if d := dc.topLevelDocumentType; d != nil && dc.inferDocumentType {
          			return d, nil
          		}
          		if dc.defaultDocumentType != nil {
          			// If the bsontype is an embedded document and the DocumentType is set on the DecodeContext, then return
          			// that type.
          			return dc.defaultDocumentType, nil
          		}
          	}
          	// ...
          }
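
      The precedence rule proposed above (inferred type first, then the configured default, then bson.D) can be checked in isolation with a small sketch. Everything here is an assumption for illustration: decodeContext is a trimmed-down stand-in for the driver's DecodeContext, documentDecodeType is a hypothetical name, and D and M are local stand-ins for bson.D and bson.M.

```go
package main

import (
	"fmt"
	"reflect"
)

// Local stand-ins for bson.D and bson.M (assumptions for this sketch).
type D []struct {
	Key   string
	Value any
}
type M map[string]any

var (
	tD   = reflect.TypeOf(D{})
	tM   = reflect.TypeOf(M{})
	tMap = reflect.TypeOf(map[string]any{})
)

// decodeContext is a hypothetical, reduced version of DecodeContext that
// carries only the fields relevant to choosing a document type.
type decodeContext struct {
	inferDocumentType    bool         // the proposed opt-in flag
	topLevelDocumentType reflect.Type // set by Decode for allowed types only
	defaultDocumentType  reflect.Type // set by DefaultDocumentM / DefaultDocumentMap
}

// documentDecodeType picks the Go type for a document decoded into an
// empty interface, applying the precedence described in the ticket.
func documentDecodeType(dc decodeContext) reflect.Type {
	if dc.inferDocumentType && dc.topLevelDocumentType != nil {
		return dc.topLevelDocumentType // inferred type wins
	}
	if dc.defaultDocumentType != nil {
		return dc.defaultDocumentType // configured default is next
	}
	return tD // v2 behavior when nothing else applies
}

func main() {
	dc := decodeContext{
		inferDocumentType:    true,
		topLevelDocumentType: tM,
		defaultDocumentType:  tMap,
	}
	fmt.Println(documentDecodeType(dc)) // inferred type
	dc.inferDocumentType = false
	fmt.Println(documentDecodeType(dc)) // falls back to the default
	fmt.Println(documentDecodeType(decodeContext{}))
}
```

      Keeping the flag check separate from the nil check means the top-level type can be recorded unconditionally in Decode and simply ignored when the option is off.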
          

      Questions:

      1. While the solution described above handles the case where users want nested document types to always match the top-level type (e.g. GODRIVER-3689, GODRIVER-3576), it doesn't handle cases where the top-level type isn't one of the allowed types but does contain nested type information (e.g. GODRIVER-2407). Should we try to detect nested types, like fields in a struct? Or should we keep it simple and only use the top-level type?

      Pitfalls

      • We may add this feature and find that it's not what most users need. So far all post-2.0 tickets have wanted the nested documents to match the top-level document, but that pattern may suffer from survivorship bias or something else that makes it not a good representation of broader user requirements.

            Assignee:
            Unassigned
            Reporter:
            Matt Dale
            Votes:
            0
            Watchers:
            3
