Core Server / SERVER-63853

$merge breaks field order, which is critical for bioinformatics

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Done
    • None
    • None
    • Component: Aggregation Framework
    • None
    • Environment: MongoDB 5.0.6, PyMongo 4.0.1
    • Operating System: ALL

    Description

      A variety of formats, such as those used in bioinformatics, require strict adherence to a fixed order of fields.

      Files in such formats are often very large and contain nested structures, so it is convenient to store them as collections. To keep the data conformant with those specifications, the original field order must be preserved. Unfortunately, an aggregation that saves its results to another database loses the original order.

      Source document example:

      {
          "_id": {
              "$oid": "620fe1e87fd143aebe55bad4"
          },
          "#CHROM": 1,
          "POS": 88619,
          "ID": "rs573217706",
          "REF": "G",
          "ALT": ["A", "T"],
          "QUAL": ".",
          "FILTER": ".",
          "INFO": [{
                  "RS": 573217706,
                  "RSPOS": 88619,
                  "dbSNPBuildID": 142,
                  "SSR": 0,
                  "SAO": 0,
                  "VP": "0x050100000005040026000100",
                  "WGT": 1,
                  "VC": "SNV",
                  "CAF": [{
                      "$numberDecimal": "0.9988"
                  }, ".", {
                      "$numberDecimal": "0.001198"
                  }],
                  "COMMON": 1,
                  "TOPMED": [{
                      "$numberDecimal": "0.99959384556574923"
                  }, {
                      "$numberDecimal": "0.00000796381243628"
                  }, {
                      "$numberDecimal": "0.00039819062181447"
                  }]
              },
              ["SLO", "ASP", "VLD", "KGPhase3"]
          ]
      }
      

      Part of the aggregation pipeline:

      {'$merge': {'into': {'db': 'test_out', 'coll': 'common_all.vcf'}}}
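
      For context, a minimal PyMongo sketch of how the stage above might be run and the field order checked. The connection URI, the source database name "test_in", and the helper names are assumptions made for illustration; only the $merge stage itself comes from the report.

```python
def build_merge_pipeline(out_db="test_out", out_coll="common_all.vcf"):
    """Return a pipeline whose only stage is the $merge from the report."""
    return [{"$merge": {"into": {"db": out_db, "coll": out_coll}}}]


def compare_field_order(uri="mongodb://localhost:27017",
                        src_db="test_in", src_coll="common_all.vcf"):
    """Run the pipeline, then return the top-level key order of one
    document before and after the merge.

    The URI and the source names are hypothetical placeholders.
    """
    from pymongo import MongoClient  # lazy import; requires PyMongo

    client = MongoClient(uri)
    source = client[src_db][src_coll]

    # aggregate() executes on call; with $merge the returned cursor is empty.
    source.aggregate(build_merge_pipeline())

    original = source.find_one()
    if original is None:
        raise ValueError("source collection is empty")

    merged = client["test_out"]["common_all.vcf"].find_one(
        {"_id": original["_id"]}
    )
    if merged is None:
        raise ValueError("merged document not found")

    # PyMongo decodes documents into dicts that keep the stored key order.
    return list(original), list(merged)
```

      Calling compare_field_order() against a running server returns two key lists that can be compared directly.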
      

      Result:

       

      {
          "_id": {
              "$oid": "620fe1e87fd143aebe55bad4"
          },
          "#CHROM": 1,
          "ALT": ["A", "T"],
          "FILTER": ".",
          "ID": "rs573217706",
          "INFO": [{
                  "RS": 573217706,
                  "RSPOS": 88619,
                  "dbSNPBuildID": 142,
                  "SSR": 0,
                  "SAO": 0,
                  "VP": "0x050100000005040026000100",
                  "WGT": 1,
                  "VC": "SNV",
                  "CAF": [{
                      "$numberDecimal": "0.9988"
                  }, ".", {
                      "$numberDecimal": "0.001198"
                  }],
                  "COMMON": 1,
                  "TOPMED": [{
                      "$numberDecimal": "0.99959384556574923"
                  }, {
                      "$numberDecimal": "0.00000796381243628"
                  }, {
                      "$numberDecimal": "0.00039819062181447"
                  }]
              },
              ["SLO", "ASP", "VLD", "KGPhase3"]
          ],
          "POS": 88619,
          "QUAL": ".",
          "REF": "G"
      }
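
      In the two documents above, the top-level fields after _id come back in ascending lexicographic order, while the keys nested inside INFO keep their original order. The sketch below checks that observation on the key lists copied from the example and shows one possible client-side way to put the fields back in a known order. The function names are hypothetical, and the reordering is only an illustration, not a change to the server behaviour.

```python
# Top-level key order copied from the source and result documents above.
SOURCE_KEYS = ["_id", "#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO"]
RESULT_KEYS = ["_id", "#CHROM", "ALT", "FILTER", "ID", "INFO", "POS", "QUAL", "REF"]


def is_sorted_after_id(keys):
    """True if every key other than _id appears in ascending order."""
    rest = [k for k in keys if k != "_id"]
    return rest == sorted(rest)


def restore_order(doc, key_order):
    """Return a copy of doc with top-level keys arranged as in key_order.

    Keys missing from key_order are appended in their existing order.
    """
    ordered = {k: doc[k] for k in key_order if k in doc}
    for k, v in doc.items():
        if k not in ordered:
            ordered[k] = v
    return ordered


if __name__ == "__main__":
    print(is_sorted_after_id(SOURCE_KEYS))
    print(is_sorted_after_id(RESULT_KEYS))
```

      restore_order relies on Python dicts preserving insertion order (Python 3.7+), so the reordered document can be written back out in the column order the format expects.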

          People

            eric.sedor@mongodb.com Eric Sedor
            platon.work@gmail.com Platon workaccount