Core Server / SERVER-23425

Inserts and updates during chunk migration get deleted in 3.0.9, 3.0.10

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical - P2
    • Resolution: Fixed
    • Affects Version/s: 3.0.9, 3.0.10
    • Fix Version/s: 3.0.11
    • Component/s: Sharding
    • Labels:
    • Backwards Compatibility:
      Fully Compatible
    • Operating System:
      ALL

      Description

      Issue Status as of Mar 31, 2016

      ISSUE SUMMARY
      During chunk migrations, insert and update operations affecting data within a migrating chunk are not reflected in the recipient shard, resulting in data loss.

      USER IMPACT
      Only the following deployments are affected by this issue:

      • Sharded clusters where shards run MongoDB versions 3.0.9 or 3.0.10, and
      • The balancer is enabled or manual chunk migrations are performed

      Standalone nodes, replica set deployments, and sharded clusters with no chunk migrations are not impacted by this issue. No other version of MongoDB is affected.
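
      As a rough aid for checking a deployment against the criteria above, a version string can be classified with a small helper. This is only a sketch and not part of the original report: the name `isAffectedVersion` is hypothetical, it assumes a plain version string such as "3.0.9" (the form `db.version()` returns in the mongo shell), and it does not try to interpret pre-release suffixes.

```javascript
// Hypothetical helper, not from the original report.
// The two releases this issue lists as affected.
var AFFECTED_VERSIONS = ["3.0.9", "3.0.10"];

function isAffectedVersion(version) {
    // Exact string match, so neighbours such as "3.0.1" or "3.0.11"
    // are never misclassified by a prefix comparison.
    return AFFECTED_VERSIONS.indexOf(String(version).trim()) !== -1;
}
```

      In a mongo shell connected directly to a shard, `isAffectedVersion(db.version())` would indicate whether that shard runs an affected release. Whether chunk migrations actually occur still has to be checked separately.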

      During a chunk migration, insert and update operations affecting documents in the migrating chunk are not reflected in the recipient shard, leading to data loss.

      Users who haven’t disabled the moveParanoia option should be able to recover this data manually.

      WORKAROUNDS
      Neither MongoDB 3.2 nor MongoDB 3.0.8 and earlier is affected by this issue. Users on affected versions should upgrade to 3.0.11 or newer, or to 3.2.4 or newer, as soon as possible.

      Alternatively, users should disable the balancer and ensure no manual chunk migrations occur in order to avoid this issue. The balancer can be disabled cluster-wide or on a per-collection basis. See the Documentation section below for more information.
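
      The two ways of disabling the balancer mentioned above could be wrapped roughly as follows. This is a sketch under assumptions: the function name `pauseChunkMigrations` is hypothetical, and the `sh` argument is expected to be the mongo shell's sharding helper object, whose `sh.stopBalancer()` and `sh.disableBalancing(namespace)` methods are used here. It is passed in as a parameter only so the logic can be exercised outside the shell.

```javascript
// Hypothetical wrapper, not from the original report.
// Intended for a mongo shell connected to a mongos.
function pauseChunkMigrations(sh, namespaces) {
    if (!namespaces || namespaces.length === 0) {
        // No collections named: disable the balancer cluster-wide.
        sh.stopBalancer();
        return "cluster";
    }
    // Otherwise disable balancing only for the named collections,
    // each given as "database.collection".
    namespaces.forEach(function (ns) {
        sh.disableBalancing(ns);
    });
    return "collections";
}
```

      For example, `pauseChunkMigrations(sh)` would disable the balancer for the whole cluster, while `pauseChunkMigrations(sh, ["test.foo"])` would affect a single collection. Neither call prevents a manually issued moveChunk, so manual migrations still have to be avoided by other means.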

      AFFECTED VERSIONS
      MongoDB versions 3.0.9 and 3.0.10 only.

      FIX VERSION
      The fix is included in the 3.0.11 production release.

      DOCUMENTATION

      Original description

      Similar to SERVER-22535, if I insert documents while a migration is happening, those documents seem to get lost.

      The script below inserts 20000 documents into a collection, then manually moves a chunk while inserting another 20000 documents. At the end it asserts that the collection contains 40000 documents, but in my testing 20-30 documents are missing.

      var LOG_FUNCTION = "function log(msg) {var date = new Date(); jsTest.log('MONGOISSUE - ' + date.getHours() + ':' + date.getMinutes() + ':' + date.getSeconds() + ':' + date.getMilliseconds() + ' - ' + msg);}";
      eval(LOG_FUNCTION);
       
      var numDocs = 20000;
       
      // Set up cluster.
      log('SETTING UP CLUSTER...');
      var st = new ShardingTest({shards: 2, other: {shardOptions: {storageEngine: 'mmapv1', verbose: 0}}});
      var s = st.s0;
      var d1 = st.shard1;
      var coll = s.getDB("test").foo;
      assert.commandWorked(s.adminCommand({enableSharding: coll.getDB().getName()}));
      assert.commandWorked(s.adminCommand({shardCollection: coll.getFullName(), key: {_id: "hashed"}}));
      log('INSERT START');
      for (var i = 0; i < numDocs; i++) {
          coll.insert({_id: i});
      }
      log('INSERT END');
      assert.commandWorked(coll.ensureIndex({a: 1}));
       
      // Check document count.
      var count = coll.find().itcount();
      log("DOC COUNT: " + count);
      assert.eq(numDocs, count);
       
      // Configure server to increase reproducibility.
      assert.commandWorked(d1.adminCommand({setParameter: 1, internalQueryExecYieldIterations: 2}));
       
      function logChunk(chunk) { log('chunk ' + chunk['_id'] + ' shard ' + chunk['shard']); }
      st.config.chunks.find().forEach(logChunk);
       
      // Initiate migration and add data in parallel.
      var shell = startParallelShell(LOG_FUNCTION + " log('INSERT START'); var coll = db.getSiblingDB('test').foo; for (var i=" + numDocs + "; i<" + (numDocs * 2) + "; i++) { coll.insert({_id: i}); }; log('INSERT END');", s.port);
      sleep(500);
      log('MOVECHUNK START');
      var res = s.adminCommand({moveChunk: coll.getFullName(), find: {_id: 0}, to: "shard0000", _waitForDelete: true});
      log('MOVECHUNK END');
      assert.commandWorked(res);
      st.config.chunks.find().forEach(logChunk);
      // Wait for the parallel shell to finish its inserts.
      shell();
       
      // Re-check document count.
      count = coll.find().itcount();
      log("DOC COUNT: " + count);
      assert.eq(numDocs * 2, count);
      

      I've reproduced this in both MongoDB 3.0.9 and 3.0.10.
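
      To see which documents went missing rather than only how many, the `_id` values present after the run can be compared against the expected range. The helper below is a sketch and not part of the original script: the name `findMissingIds` is hypothetical, and it assumes, as in the script above, that the `_id` values are the integers from 0 up to (but not including) the expected document count.

```javascript
// Hypothetical helper, not from the original script.
// Returns the integers in [0, expectedCount) that do not appear in presentIds.
function findMissingIds(presentIds, expectedCount) {
    var seen = {};
    presentIds.forEach(function (id) {
        seen[id] = true;
    });
    var missing = [];
    for (var id = 0; id < expectedCount; id++) {
        if (!seen[id]) {
            missing.push(id);
        }
    }
    return missing;
}
```

      At the end of the script this could be called as `findMissingIds(coll.find({}, {_id: 1}).map(function (doc) { return doc._id; }), numDocs * 2)`, which collects the ids through the cursor's `map` helper in the mongo shell.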

      Update

      MongoDB 3.2 is not affected by this bug.

          Activity

          ramon.fernandez Ramon Fernandez added a comment -

          Thanks for the detailed reproducer, David Andrade. We're investigating.

          schwerin Andy Schwerin added a comment - - edited

          How important is it to adjust the yield iterations server parameter in order to make this reproduce?

          dandrade@agoragames.com David Andrade added a comment - - edited

          I was able to reproduce it even with that line commented out.

          ramon.fernandez Ramon Fernandez added a comment -

          David Andrade, this is to let you know we've identified the source of the issue and are working on a fix. Please note that MongoDB 3.2 is not affected by this bug, so if this issue is critical for you, you may want to consider upgrading to 3.2 (3.2.4 is the latest stable release at the time of this writing).

          Thanks,
          Ramón.

          dandrade@agoragames.com David Andrade added a comment -

          Can you confirm what version of 3.0 this bug was introduced in?

          schwerin Andy Schwerin added a comment -

          This bug affects 3.0.9 and 3.0.10 only.

          xgen-internal-githook Githook User added a comment -

          Author: Andy Schwerin (andy10gen) <schwerin@mongodb.com>

          Message: SERVER-23425 Correctly track inserts and deletes to migrating chunks.
          Branch: v3.0.11
          https://github.com/mongodb/mongo/commit/48f8b49dc30cc2485c6c1f3db31b723258fcbf39

          xgen-internal-githook Githook User added a comment -

          Author: Andy Schwerin (andy10gen) <schwerin@mongodb.com>

          Message: SERVER-23425 Correctly track inserts and deletes to migrating chunks.
          Branch: v3.0
          https://github.com/mongodb/mongo/commit/3ce338f6fc95322141bbf35f982513a831bb74ca

          xgen-internal-githook Githook User added a comment -

          Author: Kaloian Manassiev (kaloianm) <kaloian.manassiev@mongodb.com>

          Message: SERVER-23425 Port 3.2 sharding move chunk unit tests
          Branch: v3.0
          https://github.com/mongodb/mongo/commit/3edc84475b10154a76f268edb5e80ac6ca609411


            People

            • Votes: 0
            • Watchers: 28