Core Server / SERVER-22310

movePrimary command should refresh existing mongos metadata caches


Details

    • Type: Improvement
    • Resolution: Done
    • Priority: Major - P3
    • Component/s: Sharding

    Description


      The movePrimary command does not refresh the metadata caches of existing mongos processes, which results in stale routing and inconsistent query results.

      Steps to Reproduce
      1. Deployed a 2-shard cluster with each shard running a 3-node replica set, and started 2 mongos (mongos1 and mongos2).
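      A rough sketch of how such a cluster could be started (the shard and mongos2 ports follow the prompts in the output below; the config server address, data paths, and exact flags are assumptions, since the ticket does not record the deployment commands):

        # shard01: 3-node replica set on ports 27018-27020 (one mongod per port)
        mongod --shardsvr --replSet shard01 --port 27018 --dbpath /data/shard01-0
        # shard02: 3-node replica set on ports 27021-27023
        mongod --shardsvr --replSet shard02 --port 27021 --dbpath /data/shard02-0
        # two mongos routers pointing at the same config server (address assumed)
        mongos --configdb ankit:27024                  # mongos1
        mongos --configdb ankit:27024 --port 27025     # mongos2
        # from mongos1, after initiating both replica sets:
        sh.addShard("shard01/ankit:27018,ankit:27019,ankit:27020")
        sh.addShard("shard02/ankit:27021,ankit:27022,ankit:27023")
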
      2. Created a database mydb via mongos1; sh.status() then returned:

        --- Sharding Status --- 
          sharding version: {
            "_id": 1,
            "minCompatibleVersion": 5,
            "currentVersion": 6,
            "clusterId": ObjectId("56a7671754ad7a545f803203")
          }
          shards:
            {  "_id": "shard01",  "host": "shard01/ankit:27018,ankit:27019,ankit:27020" }
            {  "_id": "shard02",  "host": "shard02/ankit:27021,ankit:27022,ankit:27023" }
          balancer:
        	Currently enabled:  yes
        	Currently running:  no
        	Failed balancer rounds in last 5 attempts:  0
        	Migration Results for the last 24 hours: 
        		No recent migrations
          databases:
            {  "_id": "admin",  "partitioned": false,  "primary": "config" }
            {  "_id": "test",  "partitioned": false,  "primary": "shard01" }
            {  "_id": "mydb",  "partitioned": false,  "primary": "shard01" }
        

      3. Inserted a sample document into mydb.mycoll; find() returns it:

        ankit(mongos-3.0.8)[mongos] mydb> db.mycoll.find()
        {
          "_id": ObjectId("56a767ef88eb8c1294029482")
        }
        

      4. Explain output from mongos1 (notice the request going to shard01)

        ankit(mongos-3.0.8)[mongos] mydb1> db.mycoll1.find().explain()
        {
          "queryPlanner": {
            "mongosPlannerVersion": 1,
            "winningPlan": {
              "stage": "SINGLE_SHARD",
              "shards": [
                {
                  "shardName": "shard01",
                  "connectionString": "shard01/ankit:27018,ankit:27019,ankit:27020",
                  "serverInfo": {
                    "host": "ankit",
                    "port": 27018,
                    "version": "3.0.8",
                    "gitVersion": "83d8cc25e00e42856924d84e220fbe4a839e605d"
                  },
                  "plannerVersion": 1,
                  "namespace": "mydb.mycoll",
        

      5. Explain output from mongos2 (the request also correctly going to shard01)

        ankit:27025(mongos-3.0.8)[mongos] mydb> db.mycoll.find().explain()
        {
          "queryPlanner": {
            "mongosPlannerVersion": 1,
            "winningPlan": {
              "stage": "SINGLE_SHARD",
              "shards": [
                {
                  "shardName": "shard01",
                  "connectionString": "shard01/ankit:27018,ankit:27019,ankit:27020",
                  "serverInfo": {
                    "host": "ankit",
                    "port": 27018,
                    "version": "3.0.8",
                    "gitVersion": "83d8cc25e00e42856924d84e220fbe4a839e605d"
                  },
                  "plannerVersion": 1,
                  "namespace": "mydb.mycoll",
        

      6. Used the movePrimary command to move mydb to shard02.
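      The exact invocation is not shown in the ticket; issued against the admin database of mongos1 it would look roughly like this (shard name taken from the sh.status output above):

        db.adminCommand({ movePrimary: "mydb", to: "shard02" })
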
      7. sh.status output from mongos1

          databases:
            {  "_id": "admin",  "partitioned": false,  "primary": "config" }
            {  "_id": "test",  "partitioned": false,  "primary": "shard01" }
            {  "_id": "mydb",  "partitioned": false,  "primary": "shard02" }
        

      8. sh.status output from mongos2

          databases:
            {  "_id": "admin",  "partitioned": false,  "primary": "config" }
            {  "_id": "test",  "partitioned": false,  "primary": "shard01" }
            {  "_id": "mydb",  "partitioned": false,  "primary": "shard02" }
        

      9. Explain output from mongos1 (the request now correctly going to shard02)

        ankit(mongos-3.0.8)[mongos] mydb> db.mycoll.find().explain()
        {
          "queryPlanner": {
            "mongosPlannerVersion": 1,
            "winningPlan": {
              "stage": "SINGLE_SHARD",
              "shards": [
                {
                  "shardName": "shard02",
                  "connectionString": "shard02/ankit:27021,ankit:27022,ankit:27023",
                  "serverInfo": {
                    "host": "ankit",
                    "port": 27021,
                    "version": "3.0.8",
                    "gitVersion": "83d8cc25e00e42856924d84e220fbe4a839e605d"
                  },
                  "plannerVersion": 1,
                  "namespace": "mydb.mycoll",
        

      10. Explain output from mongos2 (the request STILL going to shard01)

        ankit:27025(mongos-3.0.8)[mongos] mydb> db.mycoll.find().explain()
        {
          "queryPlanner": {
            "mongosPlannerVersion": 1,
            "winningPlan": {
              "stage": "SINGLE_SHARD",
              "shards": [
                {
                  "shardName": "shard01",
                  "connectionString": "shard01/ankit:27018,ankit:27019,ankit:27020",
                  "serverInfo": {
                    "host": "ankit",
                    "port": 27018,
                    "version": "3.0.8",
                    "gitVersion": "83d8cc25e00e42856924d84e220fbe4a839e605d"
                  },
                  "plannerVersion": 1,
                  "namespace": "mydb.mycoll",
        

        Also note that the find() on mongos2 returned no results.

      11. Further, I manually created the mydb database (and inserted a sample document) directly on a mongod of shard01. This emulates the case where two databases with the same name live on different shards, i.e. show dbs returns mydb on both shards.

        ankit:27018(mongod-3.0.8)[PRIMARY:shard01] mydb1> use mydb
        switched to db mydb
        ankit:27018(mongod-3.0.8)[PRIMARY:shard01] mydb> db.mycoll.insert({})
        Inserted 1 record(s) in 139ms
        

      12. Running the find query on mongos2 again routed to shard01, but now returned the document created in the step above.

      Further, I was able to fix this by clearing the metadata cache on mongos2 with the flushRouterConfig command:

      db.adminCommand("flushRouterConfig")
      

      Alternatively, restarting the mongos also fixes the issue.
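      Putting the steps together, the stale-cache behaviour and the workaround can be summarised with the following shell sketch (shard names and namespaces as in the output above):

        // mongos1: create mydb on shard01 (the first write creates the database)
        use mydb
        db.mycoll.insert({})

        // mongos2: warm its routing cache while shard01 is still the primary
        db.mycoll.find().explain()        // SINGLE_SHARD -> shard01

        // mongos1: move the primary shard
        db.adminCommand({ movePrimary: "mydb", to: "shard02" })

        // mongos2: still routes to shard01 and returns no documents...
        db.mycoll.find().explain()
        // ...until its cache is flushed (or the mongos is restarted)
        db.adminCommand("flushRouterConfig")
        db.mycoll.find().explain()        // now SINGLE_SHARD -> shard02
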


People

    • Assignee: backlog-server-sharding [DO NOT USE] Backlog - Sharding Team
    • Reporter: Ankit Kakkar (ankit.kakkar@mongodb.com)
    • Votes: 0
    • Watchers: 7
