Core Server / SERVER-11421

Config server does not invalidate the config.chunks cache

    • Type: Bug
    • Resolution: Done
    • Priority: Major - P3
    • Fix Version/s: 2.4.8, 2.5.3
    • Affects Version/s: 2.4.7, 2.5.3
    • Component/s: Sharding
    • None
    • Environment:
      2-shard cluster, 3 config servers, 1 mongos
    • Operating System: ALL
    • Steps To Reproduce:
      1. Set up a clean cluster with 2 shards, 3 config servers, and 1 mongos
      2. On mongos:
        db.test.insert({x:1})
        sh.enableSharding('test')
        sh.shardCollection('test.test',{_id : 1})
        
      3. On the first config server, in the config database, check the chunk count and the dbhash:
        db.chunks.count()
        db.runCommand({dbhash:1}).collections.chunks
        
      4. Split a chunk on mongos:
        sh.splitAt("test.test",{_id : new ObjectId()})
        
      5. On the config server, verify that the chunk count changed but the hash did not:
        db.chunks.count()
        db.runCommand({dbhash:1}).collections.chunks
        
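The check in step 5 can be expressed as a small comparison of two snapshots. This is a minimal sketch in plain JavaScript: the helper name `looksLikeStaleDbHash` and the snapshot shape `{ count, hash }` are assumptions made for illustration, and the sample values are placeholders, not real command output.

```javascript
// Hypothetical helper (not part of the mongo shell): returns true when two
// snapshots of config.chunks show the symptom from step 5, i.e. the chunk
// count changed but the reported dbhash did not.
function looksLikeStaleDbHash(before, after) {
  const countChanged = before.count !== after.count;
  const hashUnchanged = before.hash === after.hash;
  return countChanged && hashUnchanged;
}

// In the mongo shell, connected to a config server, a snapshot could be
// gathered along these lines (shown as a comment because it needs a live server):
//   var cfg = db.getSiblingDB('config');
//   var snap = { count: cfg.chunks.count(),
//                hash: cfg.runCommand({ dbhash: 1 }).collections.chunks };

// Placeholder snapshots for illustration only.
const before = { count: 1, hash: "aaaa" };
const afterOnAffectedServer = { count: 2, hash: "aaaa" };
const afterOnFixedServer = { count: 2, hash: "bbbb" };

console.log(looksLikeStaleDbHash(before, afterOnAffectedServer));
console.log(looksLikeStaleDbHash(before, afterOnFixedServer));
```

A `true` result corresponds to the behavior reported here; a `false` result means the hash moved together with the data.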

      Issue Status as of October 30th, 2013

      ISSUE SUMMARY
      With SERVER-11021, the dbhash command results for config servers are cached until new data is written to those collections. The moveChunk and splitChunk commands use the applyOps command to propagate their changes to the config.chunks collection on the config servers. Writes made through applyOps do not invalidate the cached dbhash for config.chunks, so subsequent dbhash calls return the stale hash from before the write.

      USER IMPACT
      This issue is present only in version 2.4.7 (stable release) of MongoDB. It does not affect correctness: the config.chunks collection is written to properly. However, if only one config server is restarted, it can end up, upon startup, with a different dbhash for config.chunks than the other config servers. This can prevent new mongos processes from starting until the dbhash values of all config servers agree. If the balancer is on, mongos periodically logs a warning that it has detected that the "config servers differ" and prevents further migrations from occurring.
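To see which config server is out of step, the config.chunks hashes reported by each server can be compared. The sketch below is plain JavaScript and only illustrates that comparison: the function name, the server names, and the hash values are all hypothetical, and comparing against the first entry is a simplification, not a description of how mongos performs its own check.

```javascript
// Hypothetical helper: takes an object mapping config server name to the
// dbhash it reports for config.chunks, and returns the names of servers
// whose hash differs from the first server's hash.
function findDifferingConfigServers(hashesByServer) {
  const entries = Object.entries(hashesByServer);
  if (entries.length === 0) {
    return [];
  }
  const referenceHash = entries[0][1];
  return entries
    .filter(([, hash]) => hash !== referenceHash)
    .map(([name]) => name);
}

// Placeholder data: cfgC stands for a config server that was restarted and
// therefore no longer serves the stale cached hash.
const reported = { cfgA: "h-old", cfgB: "h-old", cfgC: "h-new" };
const differing = findDifferingConfigServers(reported);
console.log(differing);
```

An empty result means all servers agree; any names returned are the servers to inspect.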

      SOLUTION
      Operations applied to the config server collections with the applyOps command need to call logOpForDbHash to invalidate the dbhash cache.
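The effect of the missing invalidation can be illustrated with a toy model. This is a sketch in plain JavaScript, not the server's implementation: only the names `applyOps`, `dbhash`, and `logOpForDbHash` come from this ticket, while the class, its fields, the `applyOpsInvalidates` flag, and the use of `JSON.stringify` as a stand-in for the real hash are assumptions.

```javascript
// Toy model of a config server's cached dbhash for one collection.
class ToyConfigServer {
  constructor({ applyOpsInvalidates }) {
    this.chunks = [];
    this.cachedHash = null; // null means "no cached value"
    this.applyOpsInvalidates = applyOpsInvalidates;
  }

  // Stand-in for logOpForDbHash: drops the cached hash.
  logOpForDbHash() {
    this.cachedHash = null;
  }

  // Regular write path: always invalidates the cache.
  insert(doc) {
    this.chunks.push(doc);
    this.logOpForDbHash();
  }

  // applyOps path: invalidates only when the fix is modeled.
  applyOps(docs) {
    this.chunks.push(...docs);
    if (this.applyOpsInvalidates) {
      this.logOpForDbHash();
    }
  }

  // Returns the cached value if present, otherwise recomputes it.
  // JSON.stringify is a placeholder fingerprint, not the real hash.
  dbhash() {
    if (this.cachedHash === null) {
      this.cachedHash = JSON.stringify(this.chunks);
    }
    return this.cachedHash;
  }
}

// Returns true when dbhash changes after an applyOps write.
function hashChangesAfterApplyOps(server) {
  server.insert({ min: 0 });
  const hashBefore = server.dbhash();
  server.applyOps([{ min: 5 }]);
  return server.dbhash() !== hashBefore;
}

const affected = hashChangesAfterApplyOps(new ToyConfigServer({ applyOpsInvalidates: false }));
const fixed = hashChangesAfterApplyOps(new ToyConfigServer({ applyOpsInvalidates: true }));
console.log({ affected, fixed });
```

In the model, the server without invalidation keeps returning the old fingerprint after the applyOps write, while the one that calls the invalidation hook returns a new one.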

      WORKAROUNDS
      It is safe to downgrade only the config servers to 2.4.6 to avoid this cache invalidation problem.

      PATCHES
      Production release v2.4.8 contains the fix for this issue.

            Assignee:
            dan@mongodb.com Daniel Pasette (Inactive)
            Reporter:
            alex.komyagin@mongodb.com Alexander Komyagin (Inactive)
            Votes:
            1
            Watchers:
            12