MongoDB Command Line Tools / TOOLS-750

MongoDB + WiredTiger + mongodump: difference between the total number of documents exported and restored!

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Duplicate
    • Affects Version/s: 3.0.3
    • Fix Version/s: None
    • Component/s: mongodump
    • Labels:
      None
    • Backport Requested:
      v3.0
    • Sprint:
      Kernel Tools Iteration 4, Kernel Tools Iteration 5

      Description

      Intro

      I have a job that uses mongodump to export a set of documents from collections on a sharded cluster to BSON files, which are later imported into other types of databases.

      Since maintaining strong consistency between our MongoDB cluster and the other databases is important to us, we implemented a three-way check on the number of exported documents:

      • count the targeted documents using a normal query
      • parse the number of documents that mongodump reports exporting
      • count the number of documents imported from the BSON files into the other databases

      Everything is okay when all three counts match.
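The three-way check described above could be sketched roughly as follows. This is only an illustration: the function name `find_mismatches` and the stage labels are hypothetical and do not come from the reporter's actual job. The first stage is treated as the reference count.

```python
def find_mismatches(counts):
    """Compare per-stage document counts against the first stage.

    `counts` maps a stage label (hypothetical names such as "query",
    "mongodump", "imported") to the number of documents seen at that stage.
    Returns {stage: difference from the first stage} for every stage that
    disagrees; an empty dict means all counts match.
    """
    stages = list(counts.items())
    if not stages:
        return {}
    _, reference = stages[0]
    return {
        name: value - reference
        for name, value in stages[1:]
        if value != reference
    }
```

An empty result means the export can be considered consistent; any other result names the stage where documents went missing or were added, and by how many.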

      We are migrating this cluster to 3.0.3. Everything worked while both the primaries and the secondaries were still running MMAPv1: the number of documents that mongodump reported extracting was the same as the number found in the BSON file.

      But since the SECONDARIES were migrated to MongoDB 3.0.3 with WiredTiger (the primaries still run MMAPv1), the total number of documents that mongodump reports having extracted differs from the actual number of documents found in the BSON file!

      Mongodump

      mongodump

      # mongodump --host 'modb-1:27017' --db 'LA_VERITAY' --verbose --collection 'LA_VERITAY' --query '{"seen": {"$gte": {"$date": 1432058400000}, "$lt": {"$date": 1432135349000}}}' --out '/tmp/exports/20150519/LA_VERITAY/LA_VERITAY'
      2015-05-20T17:27:54.035+0200    dumping with 32 job threads
      2015-05-20T17:27:54.035+0200    writing LA_VERITAY.LA_VERITAY to /tmp/exports/20150519/LA_VERITAY/LA_VERITAY/LA_VERITAY/LA_VERITAY.bson
      2015-05-20T17:27:54.798+0200            4314985 documents
      2015-05-20T17:27:57.035+0200    [........................]  LA_VERITAY.LA_VERITAY  0/4314985  (0.0%)
      2015-05-20T17:28:00.035+0200    [........................]  LA_VERITAY.LA_VERITAY  100/4314985  (0.0%)
      2015-05-20T17:28:06.035+0200    [........................]  LA_VERITAY.LA_VERITAY  161436/4314985  (3.7%)
      [...]
      2015-05-20T17:35:00.035+0200    [#######################.]  LA_VERITAY.LA_VERITAY  4210397/4314985  (97.6%)
      2015-05-20T17:35:01.411+0200    writing LA_VERITAY.LA_VERITAY metadata to /tmp/exports/20150519/LA_VERITAY/LA_VERITAY/LA_VERITAY/LA_VERITAY.metadata.json
      2015-05-20T17:35:01.444+0200    done dumping LA_VERITAY.LA_VERITAY
      2015-05-20T17:35:01.444+0200    done
      

      • mongodump reports exporting 4314985 documents.
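One possible way to obtain the "mongodump reported" count programmatically is to parse the captured log text. The sketch below assumes the log format shown in the output above (a timestamp followed by `<N> documents` on its own line), which may differ in other tool versions; `reported_document_count` is a hypothetical helper name.

```python
import re

# Matches a line such as:
#   2015-05-20T17:27:54.798+0200            4314985 documents
# Assumption: a timestamp token, whitespace, a number, then " documents".
_COUNT_LINE = re.compile(
    r"^[ \t]*\S+[ \t]+(\d+) documents[ \t]*$",
    re.MULTILINE,
)


def reported_document_count(log_text):
    """Return the document count announced in the log, or None if absent."""
    match = _COUNT_LINE.search(log_text)
    return int(match.group(1)) if match else None
```

Progress lines such as `0/4314985  (0.0%)` are deliberately not matched, so only the count announced up front is returned.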

      bsondump

      bsondump --objcheck /tmp/exports/20150519/LA_VERITAY/LA_VERITAY/LA_VERITAY/LA_VERITAY.bson 1> /dev/null

      2015-05-20T19:49:42.547+0200    4314084 objects found
      

      • bsondump finds only 4314084 documents in the exported file.
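For an independent cross-check of the bsondump figure, the documents in a .bson file can be counted without decoding them. A dump file is a plain concatenation of BSON documents, and each document starts with a little-endian int32 giving its total size (including those four bytes) and ends with a zero byte. The following is a minimal sketch under that assumption; `count_bson_documents` is a hypothetical name and it performs no validation beyond the framing.

```python
import struct


def count_bson_documents(stream):
    """Count concatenated BSON documents in a binary stream.

    Only the framing is checked: the int32 length prefix and the
    trailing zero byte of each document. Field contents are not decoded.
    """
    count = 0
    while True:
        header = stream.read(4)
        if not header:
            return count  # clean end of file
        if len(header) < 4:
            raise ValueError("truncated length prefix")
        (size,) = struct.unpack("<i", header)
        if size < 5:
            # The smallest valid document is 4 length bytes + 1 terminator.
            raise ValueError(f"invalid document size {size}")
        body = stream.read(size - 4)
        if len(body) != size - 4 or body[-1] != 0:
            raise ValueError("truncated or malformed document")
        count += 1
```

It would be called on a file opened in binary mode, for example `count_bson_documents(open(path, "rb"))`, where `path` points at the exported .bson file.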

      Restore (with --objcheck and --drop)

      mongorestore -v --drop --objcheck --host=localhost --port=27017 --db=LA_VERITAY --collection=LA_VERITAY /tmp/exports/20150519/LA_VERITAY/LA_VERITAY/LA_VERITAY/LA_VERITAY.bson

      2015-05-20T18:34:04.552+0200    using write concern: w='1', j=false, fsync=false, wtimeout=0
      2015-05-20T18:34:04.553+0200    checking for collection data in /tmp/exports/20150519/LA_VERITAY/LA_VERITAY/LA_VERITAY/LA_VERITAY.bson
      2015-05-20T18:34:04.553+0200    found metadata for collection at /tmp/exports/20150519/LA_VERITAY/LA_VERITAY/LA_VERITAY/LA_VERITAY.metadata.json
      2015-05-20T18:34:04.554+0200    reading metadata file from /tmp/exports/20150519/LA_VERITAY/LA_VERITAY/LA_VERITAY/LA_VERITAY.metadata.json
      2015-05-20T18:34:04.554+0200    no collection options to restore
      2015-05-20T18:34:04.554+0200    restoring LA_VERITAY.LA_VERITAY from file /tmp/exports/20150519/LA_VERITAY/LA_VERITAY/LA_VERITAY/LA_VERITAY.bson
      2015-05-20T18:34:04.554+0200            file /tmp/exports/20150519/LA_VERITAY/LA_VERITAY/LA_VERITAY/LA_VERITAY.bson is 506070804 bytes
      2015-05-20T18:34:07.553+0200    [##......................]  LA_VERITAY.LA_VERITAY  52.5 MB/482.6 MB  (10.9%)
      2015-05-20T18:34:10.553+0200    [####....................]  LA_VERITAY.LA_VERITAY  96.1 MB/482.6 MB  (19.9%)
      2015-05-20T18:34:13.553+0200    [######..................]  LA_VERITAY.LA_VERITAY  137.0 MB/482.6 MB  (28.4%)
      2015-05-20T18:34:16.553+0200    [########................]  LA_VERITAY.LA_VERITAY  177.6 MB/482.6 MB  (36.8%)
      2015-05-20T18:34:19.553+0200    [##########..............]  LA_VERITAY.LA_VERITAY  216.5 MB/482.6 MB  (44.9%)
      2015-05-20T18:34:22.553+0200    [############............]  LA_VERITAY.LA_VERITAY  254.3 MB/482.6 MB  (52.7%)
      2015-05-20T18:34:25.553+0200    [##############..........]  LA_VERITAY.LA_VERITAY  292.6 MB/482.6 MB  (60.6%)
      2015-05-20T18:34:28.553+0200    [###############.........]  LA_VERITAY.LA_VERITAY  313.9 MB/482.6 MB  (65.0%)
      2015-05-20T18:34:31.553+0200    [################........]  LA_VERITAY.LA_VERITAY  339.9 MB/482.6 MB  (70.4%)
      2015-05-20T18:34:34.553+0200    [##################......]  LA_VERITAY.LA_VERITAY  380.5 MB/482.6 MB  (78.8%)
      2015-05-20T18:34:37.553+0200    [####################....]  LA_VERITAY.LA_VERITAY  409.1 MB/482.6 MB  (84.8%)
      2015-05-20T18:34:40.553+0200    [######################..]  LA_VERITAY.LA_VERITAY  446.3 MB/482.6 MB  (92.5%)
      2015-05-20T18:34:43.553+0200    [#######################.]  LA_VERITAY.LA_VERITAY  468.9 MB/482.6 MB  (97.2%)
      2015-05-20T18:34:44.930+0200    restoring indexes for collection LA_VERITAY.LA_VERITAY from metadata
      2015-05-20T18:45:08.891+0200    finished restoring LA_VERITAY.LA_VERITAY
      2015-05-20T18:45:08.891+0200    done
      

      Collection stats

      db.LA_VERITAY.stats()

      {
          "ns" : "LA_VERITAY.LA_VERITAY",
          "count" : 4314084,
          "size" : 965267776,
          "avgObjSize" : 223,
          "storageSize" : 1164976128,
          "numExtents" : 18,
          "nindexes" : 5,
          "lastExtentSize" : 307535872,
          "paddingFactor" : 1,
          "systemFlags" : 1,
          "userFlags" : 1,
          "totalIndexSize" : 1352318576,
          "indexSizes" : {
              "_id_" : 185914064,
              "xxx_1_yyy_1" : 429640624,
              "yyy_1_xxx_1" : 431284000,
              "hhh_1" : 146121472,
              "jjj_-1" : 159358416
          },
          "ok" : 1
      }
      

      • The restored document count is 4314084: the same as bsondump found, and 901 fewer than mongodump reported!

              People

              • Assignee: Kyle Erf (kyle.erf)
              • Reporter: Babacar Tall (BTall)
              • Votes: 4
              • Watchers: 8
