Core Server / SERVER-8405

sharded count may incorrectly count migrating or orphaned documents (does not filter using chunk manager)

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: Performance, Sharding
    • Labels: None
    • ALL

      Run this to trigger repeated migrations:

      var st = new ShardingTest( { shards: 2,
                                   mongos:1,
                                   other : { chunksize : 1 }});
      
      st.stopBalancer();
      
      db = st.getDB( "test" );
      var testcoll = db.foo;
      
      st.adminCommand( { enablesharding : "test" } );
      st.adminCommand( { shardcollection : "test.foo" , key : { _id : 1 } } );
      
      var str = "";
      while ( str.length < 10000 ) {
          str += "asdasdsdasdasdasdas";
      }
      
      var data = 0, num = 0;
      
      // Insert until there is roughly 10MB of data
      while ( data < ( 1024 * 1024 * 10 ) ) {
          testcoll.insert( { _id : num++ , s : str } );
          data += str.length;
      }
      
      // Flush and wait for the writes to be acknowledged
      db.getLastError();
      
      // Find the shard that owns no chunks (the first migration target) and the shard holding the data
      var stats = st.chunkCounts( "foo" );
      var to1 = "";
      var to2 = "";
      for ( var shard in stats ) {
          if ( stats[shard] == 0 ) {
              to1 = shard;
          }
          else {
              to2 = shard;
          }
      }
      
      // Move the chunk containing { _id : 1 } back and forth between the two shards forever
      while ( 1 ) {
      
          var result = st.adminCommand( { movechunk : "test.foo" ,
                                          find : { _id : 1 } ,
                                          to : to1,
                                          _waitForDelete : true} );
      
          assert(result, "movechunk failed: " + tojson( result ) );
      
          sleep( 100 );
      
          var result = st.adminCommand( { movechunk : "test.foo" ,
                                          find : { _id : 1 } ,
                                          to : to2,
                                          _waitForDelete : true} );
      
          assert(result, "movechunk failed: " + tojson( result ) );
      
          sleep( 100 );
      }
      

      Run this in a separate shell attached to the mongos started by the script above:

      while( 1 ) { count = db.foo.count(); print( count );  assert.eq( 1048, count ); }
      

      Note that, as the following shows, queries (unlike count) return correct results even during migrations:

      while( 1 ) { count = db.foo.find().itcount(); print( count );  assert.eq( 1048, count ); }
      

      Generally a shard only has documents belonging to chunks owned by that shard. However, if a migration is in progress the shard may additionally have documents belonging to a chunk that is not owned by the shard but is in the process of being migrated. After an aborted migration, a shard may have "orphaned" documents belonging to a chunk that never completed migration.
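
      To see this directly, a rough check (a sketch against the test above, assuming st.shard0 and st.shard1 are the direct shard connections the ShardingTest helper exposes) is to count on each shard while the migration loop runs; the raw per-shard totals can add up to more than the 1048 documents the assertions above expect:

      // Count test.foo directly on each shard, bypassing mongos and therefore any
      // chunk-ownership filtering (assumes st.shard0 / st.shard1 from ShardingTest above).
      var onShard0 = st.shard0.getDB( "test" ).foo.count();
      var onShard1 = st.shard1.getDB( "test" ).foo.count();
      print( "shard0: " + onShard0 + ", shard1: " + onShard1 +
             ", raw total: " + ( onShard0 + onShard1 ) );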

      When a query (not count) runs on a sharded cluster, each shard checks the documents that match the query to see if they also belong to chunks owned by that shard. If they do not belong to owned chunks, they are not returned.
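
      On recent server versions this per-shard ownership check is visible in the query plan: each shard's winning plan wraps its index scan in a SHARDING_FILTER stage. Stage names and explain formats are version dependent, so the following is only an illustrative way to confirm the filtering:

      // Inspect the plan of a routed query through mongos; look for a SHARDING_FILTER
      // stage in each shard's winning plan (explain output format varies by version).
      printjson( db.foo.find( { _id : { $gte : 0 } } ).explain() );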

      However, when a count runs, the check that matching documents belong to owned chunks is not performed. This implementation has the advantage that sharded counts can use covered indexes. (Checking whether a document belongs to a valid chunk cannot currently be done from an index alone - it requires loading the full document - see SERVER-5022.) However, it means sharded counts can return incorrect results: the counts on the individual shards don't filter out unowned documents that may exist because of in-flight or aborted migrations.
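
      As a concrete illustration of that tradeoff (a sketch against the test above, not a statement about any particular server version): with the shard key index { _id : 1 }, a count whose predicate is on the shard key can be answered from the index alone, which is exactly what a per-document chunk membership check would currently give up:

      // A count over the indexed shard key can currently be satisfied from the
      // { _id : 1 } index without fetching any documents:
      db.foo.count( { _id : { $gte : 0, $lt : 500 } } );
      // The corresponding query projected to the shard key is a covered query; its
      // explain shows an index-only plan (field names differ across server versions):
      db.foo.find( { _id : { $gte : 0, $lt : 500 } }, { _id : 1 } ).explain();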

      Here are some potential ways of handling this, to prompt discussion:

      • Filter count results using a chunk manager, the same as we currently do for queries (see the sketch after this list for what that ownership check amounts to at the metadata level).
      • Give a mongod chunk manager knowledge of whether or not there may be an in-flight migration or orphaned documents, and filter using a chunk manager only when necessary.
      • For indexes that include the shard key, add covered index support for checking chunk membership; otherwise fall back to checking the full document for chunk membership.
      • Track disk locations of documents participating in migrations (including aborted migrations) and exclude these documents from count, query, etc. results. Filtering based on disk location does not require reading the document. If the list of disk locs is kept in memory only, orphans might need to be deleted on startup.
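
      For reference, the following sketch spells out what the first option's ownership check amounts to at the metadata level, written against the test above (numeric shard key { _id : 1 }, standalone shard mongods). It is illustrative only - the server-side chunk manager performs the equivalent check internally - and config metadata formats differ across server versions:

      // For each shard, count the documents that fall inside chunks config.chunks assigns
      // to that shard; the remainder are migrating or orphaned copies.
      var configDB = st.s.getDB( "config" );
      configDB.shards.find().forEach( function( shardDoc ) {
          // Direct connection to the shard (plain mongods in this test, so shardDoc.host
          // is a simple host:port string).
          var direct = new Mongo( shardDoc.host ).getDB( "test" );
          var ownedChunks = configDB.chunks.find( { ns : "test.foo",
                                                    shard : shardDoc._id } ).toArray();
          var raw = 0, owned = 0;
          direct.foo.find( {}, { _id : 1 } ).forEach( function( doc ) {
              raw++;
              // Chunk ranges are half-open: min <= _id < max. The shard key values in this
              // test are numbers; MinKey/MaxKey bounds on the extreme chunks are treated as
              // unbounded.
              var isOwned = ownedChunks.some( function( chunk ) {
                  var aboveMin = typeof chunk.min._id != "number" || chunk.min._id <= doc._id;
                  var belowMax = typeof chunk.max._id != "number" || doc._id < chunk.max._id;
                  return aboveMin && belowMax;
              } );
              if ( isOwned )
                  owned++;
          } );
          print( shardDoc._id + ": " + raw + " documents on shard, " + owned +
                 " in owned chunks, " + ( raw - owned ) + " migrating/orphaned" );
      } );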

            Assignee: Unassigned
            Reporter: Aaron Staple
            Votes: 0
            Watchers: 4
