[PyMongoArrow] Projection fails when reading list-of-struct data structures



      As reported in https://github.com/mongodb-labs/mongo-arrow/issues/284#issuecomment-2724806018, with an accompanying PR in https://github.com/mongodb-labs/mongo-arrow/pull/285


      The current implementation of pymongoarrow/schemas.py assigns a projection of 'True' to all fields in a struct.
      This does not work for structs that contain list-of-struct datatypes; it can be fixed by reworking the projection creation to use MongoDB's dot notation.

      Example:
      To read a data field like

      "a": {
        "b": {
          "c": [
            {
              "event": { "$date": "2022-01-01T00:00:00.000Z" },
              "value": 2
            }
          ]
        }
      }

      with a schema, the projection needs to be

      {"a.b.c.event": True, "a.b.c.value": True}

      The current implementation returns a projection of

      {"a": {"b": {"c": {"event": True}}}}

      which fails because the Mongo client reads "b" as an operator.
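
A minimal sketch of the dot-notation approach described above, assuming the schema is modelled as plain nested Python dicts with a list-of-struct written as a one-element list holding a dict. The helper name `dotted_projection` and this schema representation are hypothetical and used only for illustration; they are not the code in pymongoarrow/schemas.py or in the linked PR.

```python
def dotted_projection(fields, prefix=""):
    """Flatten a nested field spec into a dot-notation projection dict.

    Hypothetical helper: `fields` maps field names to either a leaf marker
    (any non-dict value), a nested dict (struct), or a one-element list
    holding a dict (list-of-struct).
    """
    projection = {}
    for name, spec in fields.items():
        path = f"{prefix}{name}"
        # Treat a list-of-struct like the struct it contains, since
        # dot notation addresses fields of array elements the same way.
        if isinstance(spec, list) and spec and isinstance(spec[0], dict):
            spec = spec[0]
        if isinstance(spec, dict) and spec:
            # Recurse into the struct, extending the dotted path.
            projection.update(dotted_projection(spec, prefix=path + "."))
        else:
            # Leaf field: project it by its full dotted path.
            projection[path] = True
    return projection


# Shape of the example document from this ticket (leaf values are placeholders).
schema_shape = {"a": {"b": {"c": [{"event": "datetime", "value": "int"}]}}}
projection = dotted_projection(schema_shape)
print(projection)  # {'a.b.c.event': True, 'a.b.c.value': True}
```

Because every key in the result is a full dotted path with a plain `True` value, no nested document is left for the server to interpret as an operator expression.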

              Assignee: Steve Silvester
              Reporter: Steve Silvester