Core Server / SERVER-5514

Data not balanced across all the shards

    • Type: Bug
    • Resolution: Done
    • Priority: Major - P3
    • Affects Version/s: 2.0.1
    • Component/s: Stability
    • Labels: None
    • Environment: Linux

      I have 7 shards, 1 router, and 1 config server, all running on VMs. One of the shards has been allocated an excessive amount of data. This is the same machine that was acting up and exhausted all its disk space in the previous issue I raised. We deleted all the data and config and started from a clean slate.
      Could this machine have some affinity for taking more data? If so, what could be the reason?

      I have raised a related issue, which details the history of this problem:
      https://jira.mongodb.org/browse/SERVER-5433
      Thanks
      **************************************
      OUTPUTS for STATS
      ***********************************
      mongostat for the machine that's acting up (lrchb00363)

      insert  query update delete getmore command flushes mapped  vsize    res faults locked % idx miss %     qr|qw   ar|aw  netIn netOut  conn       time
           4      0      7      0       0      15       0  5.95g  12.5g  4.21g      0     0.2          0       0|0     0|0     4k     4k   127   09:31:56
          13      0      9      0       0      23       0  5.95g  12.5g   4.2g      0     0.4          0       0|0     0|0    10k     3k   127   09:31:57
      

      Observation: look at the mapped and vsize values, 5.95g and 12.5g
      ##################################
      mongostat for the machine that's normal (lrchb00363)

      insert  query update delete getmore command flushes mapped  vsize    res faults locked % idx miss %     qr|qw   ar|aw  netIn netOut  conn       time
           9      0      6      0       0      16       0  1.95g  4.42g  1.48g      1     1.3          0       0|0     0|0     6k     3k   129   09:33:24
          16      0     14      0       0      31       0  1.95g  4.42g  1.48g      0     0.6          0       0|0     0|0    13k     4k   129   09:33:25
      

      Observation: look at the mapped and vsize values, 1.95g and 4.42g
      All the other shards have similar mapped and vsize values, except lrchb00319.
      I think this odd shard will keep taking data until it runs out of disk space and stops functioning. Please comment.

      *****************************
      Output of db.stats()
      *****************************

      {
              "raw" : {
                      "LRCHB00319:40001" : {
                              "db" : "audit",
                              "collections" : 9,
                              "objects" : 1094648,
                              "avgObjSize" : 1018.3885486476017,
                              "dataSize" : 1114776988,
                              "storageSize" : 1346035712,
                              "numExtents" : 80,
                              "indexes" : 20,
                              "indexSize" : 233449328,
                              "fileSize" : 4226809856,
                              "nsSizeMB" : 16,
                              "ok" : 1
                      },
                      "LRCHB00362:40004" : {
                              "db" : "audit",
                              "collections" : 5,
                              "objects" : 904531,
                              "avgObjSize" : 1135.2065810900897,
                              "dataSize" : 1026829544,
                              "storageSize" : 1104818176,
                              "numExtents" : 33,
                              "indexes" : 12,
                              "indexSize" : 213589824,
                              "fileSize" : 4226809856,
                              "nsSizeMB" : 16,
                              "ok" : 1
                      },
                      "LRCHB00363:40005" : {
                              "db" : "audit",
                              "collections" : 5,
                              "objects" : 1239329,
                              "avgObjSize" : 1268.8326473438449,
                              "dataSize" : 1572501096,
                              "storageSize" : 4183908352,
                              "numExtents" : 56,
                              "indexes" : 12,
                              "indexSize" : 290534160,
                              "fileSize" : 8519680000,
                              "nsSizeMB" : 16,
                              "ok" : 1
                      },
                      "LRCHB00364:40006" : {
                              "db" : "audit",
                              "collections" : 5,
                              "objects" : 893827,
                              "avgObjSize" : 1169.2595547013013,
                              "dataSize" : 1045115760,
                              "storageSize" : 1076518912,
                              "numExtents" : 45,
                              "indexes" : 12,
                              "indexSize" : 210908096,
                              "fileSize" : 4226809856,
                              "nsSizeMB" : 16,
                              "ok" : 1
                      },
                      "LRCHB00365:40002" : {
                              "db" : "audit",
                              "collections" : 5,
                              "objects" : 848153,
                              "avgObjSize" : 1184.0515048582035,
                              "dataSize" : 1004256836,
                              "storageSize" : 1167663104,
                              "numExtents" : 50,
                              "indexes" : 12,
                              "indexSize" : 201129600,
                              "fileSize" : 4226809856,
                              "nsSizeMB" : 16,
                              "ok" : 1
                      },
                      "LRCHB00366:40003" : {
                              "db" : "audit",
                              "collections" : 5,
                              "objects" : 891586,
                              "avgObjSize" : 1092.6191169444114,
                              "dataSize" : 974163908,
                              "storageSize" : 1135775744,
                              "numExtents" : 37,
                              "indexes" : 12,
                              "indexSize" : 211300544,
                              "fileSize" : 4226809856,
                              "nsSizeMB" : 16,
                              "ok" : 1
                      },
                      "LRCHB00374:40007" : {
                              "db" : "audit",
                              "collections" : 5,
                              "objects" : 1013103,
                              "avgObjSize" : 1202.7945075673451,
                              "dataSize" : 1218554724,
                              "storageSize" : 1347837952,
                              "numExtents" : 38,
                              "indexes" : 12,
                              "indexSize" : 240954896,
                              "fileSize" : 4226809856,
                              "nsSizeMB" : 16,
                              "ok" : 1
                      }
              },
              "objects" : 6885177,
              "avgObjSize" : 1155.5547309822246,
              "dataSize" : 7956198856,
              "storageSize" : 11362557952,
              "numExtents" : 339,
              "indexes" : 92,
              "indexSize" : 1601866448,
              "fileSize" : 33880539136,
              "ok" : 1
      }
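One way to read the db.stats() figures above is to compare each shard's share of dataSize with how much storage it has allocated per byte of data. Below is a minimal sketch in Python, with the numbers copied by hand from the output above. The `summarize` helper and the dictionary layout are illustrative assumptions, not part of any MongoDB tool.

```python
# Per-shard figures (bytes) copied from the "raw" section of the db.stats() output.
shards = {
    "LRCHB00319:40001": {"dataSize": 1114776988, "storageSize": 1346035712, "fileSize": 4226809856},
    "LRCHB00362:40004": {"dataSize": 1026829544, "storageSize": 1104818176, "fileSize": 4226809856},
    "LRCHB00363:40005": {"dataSize": 1572501096, "storageSize": 4183908352, "fileSize": 8519680000},
    "LRCHB00364:40006": {"dataSize": 1045115760, "storageSize": 1076518912, "fileSize": 4226809856},
    "LRCHB00365:40002": {"dataSize": 1004256836, "storageSize": 1167663104, "fileSize": 4226809856},
    "LRCHB00366:40003": {"dataSize": 974163908, "storageSize": 1135775744, "fileSize": 4226809856},
    "LRCHB00374:40007": {"dataSize": 1218554724, "storageSize": 1347837952, "fileSize": 4226809856},
}


def summarize(shards):
    """Hypothetical helper: return total dataSize and per-shard ratios."""
    total_data = sum(s["dataSize"] for s in shards.values())
    rows = []
    for name, s in shards.items():
        rows.append({
            "shard": name,
            # Fraction of all document bytes held by this shard.
            "data_share": s["dataSize"] / total_data,
            # Allocated storage per byte of actual data.
            "storage_per_data": s["storageSize"] / s["dataSize"],
            # On-disk file size per byte of actual data.
            "file_per_data": s["fileSize"] / s["dataSize"],
        })
    return total_data, rows


total_data, rows = summarize(shards)

# Print shards ordered by allocated storage per byte of data, largest first.
for r in sorted(rows, key=lambda r: r["storage_per_data"], reverse=True):
    print(
        f'{r["shard"]:<18} '
        f'data share {r["data_share"]:6.1%}  '
        f'storage/data {r["storage_per_data"]:.2f}  '
        f'file/data {r["file_per_data"]:.2f}'
    )
```

On these figures, dataSize shares stay within roughly 12% to 20% across shards, while LRCHB00363 has allocated about 2.7 bytes of storage per byte of data against roughly 1.0 to 1.2 elsewhere. That suggests the difference lies mostly in allocated space rather than in the amount of document data.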

        1. config.zip
          19 kB
        2. mongos.log_20120405_953am.txt
          12 kB
        3. printShardingStatus.txt
          6 kB

            Assignee: Unassigned
            Reporter: Preetham Derangula (preethamraj)
            Votes: 0
            Watchers: 2
