Core Server / SERVER-25147

"not master" error from execCommand through mongos


    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Cannot Reproduce
    • Affects Version/s: 3.0.8
    • Fix Version/s: None
    • Component/s: Sharding
    • Labels:
      None
    • Operating System:
      ALL
    • Sprint:
      Sharding 2016-10-10, Sharding 2016-11-21, Sharding 2016-12-12

      Description

      We encounter a "not master" error even though the command was executed against a mongos and the primaries of all (both) shards are up and running.

      This happens, for example, when trying to drop a database:

      mongos> db.dropDatabase()
      {
      	"ok" : 0,
      	"errmsg" : "{ note: \"from execCommand\", ok: 0.0, errmsg: \"not master\" }"
      }
      

      However, the command may succeed a few seconds later in the same mongo shell, over the same connection:

      mongos> db.dropDatabase()
      { "dropped" : "loadtest", "ok" : 1 }
      mongos>
      
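      Since the error is transient on our side, a client-side workaround (not a fix) is to detect it and retry. A minimal sketch in plain JavaScript; runCommand is a hypothetical placeholder for whatever issues the command, and isNotMaster matches both the direct errmsg and the nested "from execCommand" string form shown above:

```javascript
// Detect a "not master" result, either as a direct errmsg or wrapped in
// the stringified { note: "from execCommand", ... } form mongos returns.
function isNotMaster(res) {
  if (!res || res.ok) return false;
  return typeof res.errmsg === "string" && res.errmsg.indexOf("not master") !== -1;
}

// Retry a command a bounded number of times while it transiently fails
// with "not master". runCommand is a hypothetical zero-argument function
// that issues the command and returns its result document.
function runWithRetry(runCommand, attempts) {
  var res;
  for (var i = 0; i < attempts; i++) {
    res = runCommand();
    if (!isNotMaster(res)) return res; // success or a different error
  }
  return res; // still failing after all attempts
}
```

This only papers over the symptom at the application layer; it does not explain why mongos routes to a non-primary in the first place.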

      The error may also happen when executing mongorestore against the router:

      mongorestore -h mongo-router-01.hotel02.pro00.eu.idealo.com --port 27017 --db kay /data/reise/tmp/dump/import_20160713/
      2016-07-19T15:41:16.055+0200	building a list of collections to restore from /data/reise/tmp/dump/import_20160713/ dir
      2016-07-19T15:41:16.060+0200	Failed: restore error: kay.additional: error reading database: error creating initial database config information :: caused by :: call to listDatabases on reise_shard01/mongo-035.ipx:27017,mongo-036.ipx:27017 failed: { note: "from execCommand", ok: 0.0, errmsg: "not master" }
      

      It may also happen in the middle of a mongorestore run:

      admin@sx724:~$ mongorestore -h mongo-router-01.hotel02.pro00.eu.idealo.com --port 27017 --db kay2 /data/reise/tmp/dump/import_20160713/
      2016-07-19T17:07:04.430+0200	building a list of collections to restore from /data/reise/tmp/dump/import_20160713/ dir
      2016-07-19T17:07:04.731+0200	reading metadata file from /data/reise/tmp/dump/import_20160713/additional.metadata.json
      2016-07-19T17:07:04.731+0200	restoring kay2.additional from file /data/reise/tmp/dump/import_20160713/additional.bson
      2016-07-19T17:07:04.732+0200	reading metadata file from /data/reise/tmp/dump/import_20160713/catalog.metadata.json
      2016-07-19T17:07:04.733+0200	restoring kay2.catalog from file /data/reise/tmp/dump/import_20160713/catalog.bson
      2016-07-19T17:07:04.734+0200	reading metadata file from /data/reise/tmp/dump/import_20160713/offer.metadata.json
      2016-07-19T17:07:04.734+0200	restoring kay2.offer from file /data/reise/tmp/dump/import_20160713/offer.bson
      2016-07-19T17:07:04.735+0200	reading metadata file from /data/reise/tmp/dump/import_20160713/pages.metadata.json
      2016-07-19T17:07:04.736+0200	restoring kay2.pages from file /data/reise/tmp/dump/import_20160713/pages.bson
      2016-07-19T17:07:07.431+0200	[........................]  kay2.additional   32.0 MB/13.5 GB  (0.2%)
      2016-07-19T17:07:07.431+0200	[........................]     kay2.catalog    32.0 MB/3.8 GB  (0.8%)
      2016-07-19T17:07:07.431+0200	[........................]       kay2.offer     8.1 MB/1.8 GB  (0.4%)
      2016-07-19T17:07:07.431+0200	[........................]       kay2.pages  32.0 MB/945.6 MB  (3.4%)
      2016-07-19T17:07:07.431+0200
      2016-07-19T17:07:10.431+0200	[........................]  kay2.additional   32.0 MB/13.5 GB  (0.2%)
      2016-07-19T17:07:10.431+0200	[........................]     kay2.catalog    32.0 MB/3.8 GB  (0.8%)
      2016-07-19T17:07:10.431+0200	[........................]       kay2.offer     8.1 MB/1.8 GB  (0.4%)
      2016-07-19T17:07:10.431+0200	[........................]       kay2.pages  32.0 MB/945.6 MB  (3.4%)
      2016-07-19T17:07:10.431+0200
      2016-07-19T17:07:13.431+0200	[........................]  kay2.additional   32.0 MB/13.5 GB  (0.2%)
      2016-07-19T17:07:13.431+0200	[........................]     kay2.catalog    32.0 MB/3.8 GB  (0.8%)
      2016-07-19T17:07:13.431+0200	[........................]       kay2.offer     8.1 MB/1.8 GB  (0.4%)
      2016-07-19T17:07:13.431+0200	[........................]       kay2.pages  32.0 MB/945.6 MB  (3.4%)
      2016-07-19T17:07:13.431+0200
      2016-07-19T17:07:16.421+0200	error: not master
      2016-07-19T17:07:16.431+0200	[........................]  kay2.additional   32.0 MB/13.5 GB  (0.2%)
      2016-07-19T17:07:16.431+0200	[........................]     kay2.catalog    32.0 MB/3.8 GB  (0.8%)
      2016-07-19T17:07:16.431+0200	[........................]       kay2.offer    17.9 MB/1.8 GB  (1.0%)
      2016-07-19T17:07:16.431+0200	[........................]       kay2.pages  32.0 MB/945.6 MB  (3.4%)
      2016-07-19T17:07:16.431+0200
      2016-07-19T17:07:16.787+0200	error: not master
      2016-07-19T17:07:17.635+0200	error: not master
      2016-07-19T17:07:18.541+0200	error: not master
      2016-07-19T17:07:18.556+0200	error: not master
      2016-07-19T17:07:19.037+0200	error: not master
      2016-07-19T17:07:19.431+0200	[........................]  kay2.additional   95.9 MB/13.5 GB   (0.7%)
      ...
      

      If the "not master" error occurred during the restore, some documents are missing afterwards.
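      One way to check whether documents were lost is to compare per-collection document counts between the dump and the restored database. A minimal sketch (plain JavaScript; the two count maps are hypothetical inputs, e.g. built from db.getSiblingDB("kay2").getCollection(name).count() on each side):

```javascript
// Compare per-collection document counts from the dump (expected) against
// the restored database (actual); return only the collections that came
// up short, mapped to the number of missing documents.
function findShortfalls(expected, actual) {
  var shortfalls = {};
  Object.keys(expected).forEach(function (coll) {
    var have = actual[coll] || 0; // collection may be missing entirely
    if (have < expected[coll]) {
      shortfalls[coll] = expected[coll] - have;
    }
  });
  return shortfalls;
}
```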

      However, executing the failed listDatabases command directly against the replica set, using the same connection string the router reported, works without problems:

      $ mongo --host reise_shard01/mongo-035.ipx:27017,mongo-036.ipx:27017
      MongoDB shell version: 3.0.2
      connecting to: reise_shard01/mongo-035.ipx:27017,mongo-036.ipx:27017/test
      2016-07-19T15:58:52.973+0200 I NETWORK  starting new replica set monitor for replica set reise_shard01 with seeds mongo-035.ipx:27017,mongo-036.ipx:27017
      2016-07-19T15:58:52.973+0200 I NETWORK  [ReplicaSetMonitorWatcher] starting
      reise_shard01:PRIMARY> db.adminCommand({listDatabases:1})
      {
      	"databases" : [
      		{
      			"name" : "loadtest",
      			"sizeOnDisk" : 1466814464,
      			"empty" : false
      		},
      ...
      
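      While diagnosing this, it can help to record which member each connection actually considers writable. A tiny helper that interprets an isMaster-style reply (the ismaster and primary fields are as returned by MongoDB 3.0); how the reply is obtained is left open:

```javascript
// Interpret an isMaster-style reply: is the queried member itself
// writable, and which host does it believe is the current primary?
function primaryView(reply) {
  return {
    writable: reply.ismaster === true,
    believedPrimary: reply.primary || null
  };
}
```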

      These are our shards:

      mongos> sh.status()
      --- Sharding Status ---
        sharding version: {
      	"_id" : 1,
      	"minCompatibleVersion" : 5,
      	"currentVersion" : 6,
      	"clusterId" : ObjectId("5661b33ff5f2bfb6a339840e")
      }
        shards:
      	{  "_id" : "reise_shard01",  "host" : "reise_shard01/mongo-035.ipx:27017,mongo-036.ipx:27017" }
      	{  "_id" : "reise_shard02",  "host" : "reise_shard02/mongo-037.ipx:27017,mongo-038.ipx:27017" }
        active mongoses:
      	"3.0.8" : 3
      	....
      

      These are the replica set statuses.
      First shard:

      reise_shard01:PRIMARY> rs.status()
      {
      	"set" : "reise_shard01",
      	"date" : ISODate("2016-07-19T16:17:23.053Z"),
      	"myState" : 1,
      	"members" : [
      		{
      			"_id" : 1,
      			"name" : "mongo-036.ipx:27017",
      			"health" : 1,
      			"state" : 2,
      			"stateStr" : "SECONDARY",
      			"uptime" : 1063103,
      			"optime" : Timestamp(1468945028, 20),
      			"optimeDate" : ISODate("2016-07-19T16:17:08Z"),
      			"lastHeartbeat" : ISODate("2016-07-19T16:17:21.469Z"),
      			"lastHeartbeatRecv" : ISODate("2016-07-19T16:17:22.490Z"),
      			"pingMs" : 0,
      			"syncingTo" : "mongo-035.ipx:27017",
      			"configVersion" : 7
      		},
      		{
      			"_id" : 2,
      			"name" : "mongo-router-01.hotel02.pro00.eu.idealo.com:27018",
      			"health" : 1,
      			"state" : 7,
      			"stateStr" : "ARBITER",
      			"uptime" : 15576582,
      			"lastHeartbeat" : ISODate("2016-07-19T16:17:21.590Z"),
      			"lastHeartbeatRecv" : ISODate("2016-07-19T16:17:21.662Z"),
      			"pingMs" : 0,
      			"configVersion" : 7
      		},
      		{
      			"_id" : 3,
      			"name" : "mongo-035.ipx:27017",
      			"health" : 1,
      			"state" : 1,
      			"stateStr" : "PRIMARY",
      			"uptime" : 15576941,
      			"optime" : Timestamp(1468945028, 20),
      			"optimeDate" : ISODate("2016-07-19T16:17:08Z"),
      			"electionTime" : Timestamp(1453368132, 1),
      			"electionDate" : ISODate("2016-01-21T09:22:12Z"),
      			"configVersion" : 7,
      			"self" : true
      		}
      	],
      	"ok" : 1,
      	"$gleStats" : {
      		"lastOpTime" : Timestamp(0, 0),
      		"electionId" : ObjectId("56a0a344b5288f9897220e23")
      	}
      }
      

      Second shard:

      reise_shard02:PRIMARY> rs.status()
      {
      	"set" : "reise_shard02",
      	"date" : ISODate("2016-07-19T16:18:28.306Z"),
      	"myState" : 1,
      	"members" : [
      		{
      			"_id" : 0,
      			"name" : "mongo-037.ipx:27017",
      			"health" : 1,
      			"state" : 2,
      			"stateStr" : "SECONDARY",
      			"uptime" : 3527123,
      			"optime" : Timestamp(1468945106, 551),
      			"optimeDate" : ISODate("2016-07-19T16:18:26Z"),
      			"lastHeartbeat" : ISODate("2016-07-19T16:18:26.489Z"),
      			"lastHeartbeatRecv" : ISODate("2016-07-19T16:18:27.153Z"),
      			"pingMs" : 0,
      			"syncingTo" : "mongo-038.ipx:27017",
      			"configVersion" : 4
      		},
      		{
      			"_id" : 1,
      			"name" : "mongo-038.ipx:27017",
      			"health" : 1,
      			"state" : 1,
      			"stateStr" : "PRIMARY",
      			"uptime" : 15576995,
      			"optime" : Timestamp(1468945108, 132),
      			"optimeDate" : ISODate("2016-07-19T16:18:28Z"),
      			"electionTime" : Timestamp(1453368159, 1),
      			"electionDate" : ISODate("2016-01-21T09:22:39Z"),
      			"configVersion" : 4,
      			"self" : true
      		},
      		{
      			"_id" : 2,
      			"name" : "mongo-router-02.hotel02.pro00.eu.idealo.com:27018",
      			"health" : 1,
      			"state" : 7,
      			"stateStr" : "ARBITER",
      			"uptime" : 3557101,
      			"lastHeartbeat" : ISODate("2016-07-19T16:18:27.972Z"),
      			"lastHeartbeatRecv" : ISODate("2016-07-19T16:18:26.996Z"),
      			"pingMs" : 0,
      			"configVersion" : 4
      		}
      	],
      	"ok" : 1,
      	"$gleStats" : {
      		"lastOpTime" : Timestamp(0, 0),
      		"electionId" : ObjectId("56a0a35f00d0725cbecc2f18")
      	}
      }
      

        Attachments

        1. mongo-035_1707.log
          30 kB
        2. mongo-035.log
          161 kB
        3. mongo-036_1707.log
          6 kB
        4. mongo-036.log
          41 kB
        5. router_1707.log
          13 kB
        6. router.log
          39 kB

            People

            • Votes: 0
            • Watchers: 8