  1. Documentation
  2. DOCS-9251

Documentation for BI-368: mongodrdl should be able to generate tables for collections with arrays that are not "pre-joined"

Details

    • Task
    • Resolution: Done
    • Critical - P2
    • BI Connector

    Description

      Documentation Changes

      Include sample SQL query.

      Description

      Engineering Ticket Description:

      Currently, mongodrdl generates "pre-joined" tables for each array field in a collection. For users accustomed to relational models, however, it often makes more sense not to pre-join. For example, for a collection containing documents like:

      {_id : 1, name : "jeff", tags : ["dog", "cat"] }
      

      It could generate DRDL like this instead:

      schema:
      - db: test
        tables:
        - table: users
          collection: users
          pipeline: []
          columns:
          - Name: _id
            MongoType: float64
            SqlName: _id
            SqlType: numeric
          - Name: name
            MongoType: string
            SqlName: name
            SqlType: varchar
        - table: users_tags
          collection: users
          pipeline:
          - $unwind:
              includeArrayIndex: tags_idx
              path: $tags
          columns:
          - Name: _id
            MongoType: float64
            SqlName: _id
            SqlType: numeric
          - Name: tags
            MongoType: string
            SqlName: tags
            SqlType: varchar
          - Name: tags_idx
            MongoType: int
            SqlName: tags_idx
            SqlType: numeric
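
To make the row shape of the users_tags table concrete, here is a minimal Python sketch of what the $unwind stage with includeArrayIndex does to the sample document. It does not call MongoDB or mongodrdl; the `unwind` helper and the `users_tags_rows` name are hypothetical stand-ins for illustration, and the sketch only handles fields that hold a list.

```python
# Hypothetical pure-Python stand-in for the $unwind stage in the users_tags
# pipeline. It only mirrors the row shape described by the DRDL; it is not
# how the BI Connector is implemented.

def unwind(docs, path, index_field):
    """Yield one copy of each document per array element, recording the
    element's position in index_field (like includeArrayIndex)."""
    for doc in docs:
        values = doc.get(path)
        # Assumption: only non-empty lists are expanded here. The real
        # $unwind stage has additional rules for non-array values.
        if not isinstance(values, list) or not values:
            continue
        for idx, value in enumerate(values):
            row = dict(doc)
            row[path] = value
            row[index_field] = idx
            yield row


users = [{"_id": 1, "name": "jeff", "tags": ["dog", "cat"]}]

# The users_tags table exposes only these columns, so "name" is dropped.
columns = ["_id", "tags", "tags_idx"]
users_tags_rows = [
    {col: row[col] for col in columns}
    for row in unwind(users, "tags", "tags_idx")
]

for row in users_tags_rows:
    print(row)
# {'_id': 1, 'tags': 'dog', 'tags_idx': 0}
# {'_id': 1, 'tags': 'cat', 'tags_idx': 1}
```

Each array element becomes its own row, keyed by _id and the element's index.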
      

      Note that the users_tags table contains the _id field for joining on, but not the name field. With this structure, users would just join these two tables:

      select u.*, t.tags from users_tags t join users u on t._id = u._id where t.tags = 'dog'
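
The join above can be tried out locally with a small sketch. This one uses Python's built-in sqlite3 module as a stand-in SQL engine, which is an assumption made purely for illustration: the BI Connector itself is not SQLite, and the tables here are populated by hand from the sample document rather than generated by mongodrdl.

```python
import sqlite3

# In-memory stand-in for the two tables described by the DRDL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (_id NUMERIC, name VARCHAR);
    CREATE TABLE users_tags (_id NUMERIC, tags VARCHAR, tags_idx NUMERIC);
    INSERT INTO users VALUES (1, 'jeff');
    INSERT INTO users_tags VALUES (1, 'dog', 0), (1, 'cat', 1);
""")

# Same join as in the ticket, with the filter column qualified by its table.
query = """
    SELECT u.*, t.tags
    FROM users_tags t
    JOIN users u ON t._id = u._id
    WHERE t.tags = 'dog'
"""
dog_rows = conn.execute(query).fetchall()
print(dog_rows)  # [(1, 'jeff', 'dog')]
```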
      

      This would also make it easier to write queries where more than two tables from the same collection need to be joined together.
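
As a rough illustration of a join across more than two tables, the sketch below assumes the collection had a second array field. The `langs` field, the `users_langs` table, and its data are hypothetical and do not appear in the ticket; sqlite3 is again only a stand-in SQL engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (_id NUMERIC, name VARCHAR);
    CREATE TABLE users_tags (_id NUMERIC, tags VARCHAR, tags_idx NUMERIC);
    -- Hypothetical table for an assumed second array field, "langs".
    CREATE TABLE users_langs (_id NUMERIC, langs VARCHAR, langs_idx NUMERIC);
    INSERT INTO users VALUES (1, 'jeff');
    INSERT INTO users_tags VALUES (1, 'dog', 0), (1, 'cat', 1);
    INSERT INTO users_langs VALUES (1, 'en', 0), (1, 'fr', 1);
""")

# Each array table joins back to users on _id, so adding another array
# to the query is just one more JOIN clause.
three_way = conn.execute("""
    SELECT u.name, t.tags, l.langs
    FROM users u
    JOIN users_tags t ON t._id = u._id
    JOIN users_langs l ON l._id = u._id
    WHERE t.tags = 'dog'
    ORDER BY l.langs_idx
""").fetchall()
print(three_way)  # [('jeff', 'dog', 'en'), ('jeff', 'dog', 'fr')]
```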

      For users who rely on the current behavior in 1.x and are upgrading to 2.x, mongodrdl will provide a --preJoin option that preserves the current behavior.

          People

            andrew.aldridge@mongodb.com Andrew Aldridge
            emily.hall Emily Hall
            Dates

              Created:
              Updated:
              Resolved:
              7 years, 13 weeks ago