Core Server / SERVER-25629

Accidental "host:port/replicaSetName"-format mongos --configdb arg should be rejected asap.

    • Type: Improvement
    • Resolution: Done
    • Priority: Trivial - P5
    • Fix Version/s: 3.4.0-rc0
    • Affects Version/s: 3.3.11
    • Component/s: Sharding
    • Labels: None
    • Backwards Compatibility: Fully Compatible
    • Sprint: Sharding 2016-08-29, Sharding 2016-09-19, Sharding 2016-10-10

      When I start a mongos for a 3.3.11 test cluster, I run the following command.

      akira:~$ /usr/local/bin/mongodb-linux-x86_64-3.3.11/bin/mongos --fork --logpath data/mongos.log --configdb akira-macbookpro:27019 --port 30000
      BadValue: configdb supports only replica set connection string
      try '/usr/local/bin/mongodb-linux-x86_64-3.3.11/bin/mongos --help' for more information
      

      So the now-compulsory replica-set-style config server format has come into play, and I can't use a plain host list anymore. That is fine.

      It's also good that the error message points to --help, which shows the correct syntax:

      Sharding options:
        --configdb arg                   Connection string for communicating with 
                                         config servers:
                                         <config replset name>/<host1:port>,<host2:po
                                         rt>,[...]
      

      ... but I did it backwards! I.e. I wrote "--configdb <host>[,<host>]*/<replsetName>" instead of the correct "<replsetName>/<host>[,<host>]*", probably going from my memory of MongoDB connection URIs.

      When I do this the mongos starts, implicitly affirming I got the configdb argument format right.

      akira:~$ mongos --fork --logpath mongos2.log --configdb akira-macbookpro:27019/cfgrs --port 30000
      2016-08-16T14:25:19.460+1000 W SHARDING [main] Running a sharded cluster with fewer than 3 config servers should only be done for testing purposes and is not recommended for production.
      about to fork child process, waiting until server is ready for connections.
      forked process: 4539
      <hangs there>
      

      But it hangs without ever finishing the fork. Meanwhile, in the log file, the problem is reported as follows:

      2016-08-16T14:25:19.464+1000 I SHARDING [mongosMain] mongos version v3.3.11-30-gc96009e
      2016-08-16T14:25:19.464+1000 I CONTROL  [mongosMain] git version: c96009ecd439bbd960ae1c01d6379e64ecdb5eeb
      2016-08-16T14:25:19.464+1000 I CONTROL  [mongosMain] allocator: tcmalloc
      2016-08-16T14:25:19.464+1000 I CONTROL  [mongosMain] modules: none
      2016-08-16T14:25:19.464+1000 I CONTROL  [mongosMain] build environment:
      2016-08-16T14:25:19.464+1000 I CONTROL  [mongosMain]     distarch: x86_64
      2016-08-16T14:25:19.464+1000 I CONTROL  [mongosMain]     target_arch: x86_64
      2016-08-16T14:25:19.464+1000 I CONTROL  [mongosMain] db version v3.3.11-30-gc96009e
      2016-08-16T14:25:19.464+1000 I CONTROL  [mongosMain] git version: c96009ecd439bbd960ae1c01d6379e64ecdb5eeb
      2016-08-16T14:25:19.464+1000 I CONTROL  [mongosMain] allocator: tcmalloc
      2016-08-16T14:25:19.464+1000 I CONTROL  [mongosMain] modules: none
      2016-08-16T14:25:19.464+1000 I CONTROL  [mongosMain] build environment:
      2016-08-16T14:25:19.464+1000 I CONTROL  [mongosMain]     distarch: x86_64
      2016-08-16T14:25:19.464+1000 I CONTROL  [mongosMain]     target_arch: x86_64
      2016-08-16T14:25:19.464+1000 I CONTROL  [mongosMain] options: { net: { port: 30000 }, processManagement: { fork: true }, sharding: { configDB: "akira-macbookpro:27019/cfgrs" }, systemLog: { destination: "file", path: "/tmp/mongos2.log" } }
      2016-08-16T14:25:19.485+1000 I NETWORK  [mongosMain] Starting new replica set monitor for akira-macbookpro:27019/cfgrs:27019
      2016-08-16T14:25:19.485+1000 I SHARDING [thread1] creating distributed lock ping thread for process akira-macbookpro:30000:1471321519:7538175379671250174 (sleeping for 30000ms)
      2016-08-16T14:25:19.490+1000 I NETWORK  [ReplicaSetMonitor-TaskExecutor-0] getaddrinfo("cfgrs") failed: Name or service not known
      2016-08-16T14:25:19.490+1000 W NETWORK  [ReplicaSetMonitor-TaskExecutor-0] No primary detected for set akira-macbookpro:27019
      2016-08-16T14:25:19.490+1000 I NETWORK  [ReplicaSetMonitor-TaskExecutor-0] All nodes for set akira-macbookpro:27019 are down. This has happened for 1 checks in a row.
      2016-08-16T14:25:19.994+1000 I NETWORK  [replSetDistLockPinger] getaddrinfo("cfgrs") failed: Name or service not known
      2016-08-16T14:25:19.994+1000 W NETWORK  [replSetDistLockPinger] No primary detected for set akira-macbookpro:27019
      2016-08-16T14:25:19.994+1000 I NETWORK  [replSetDistLockPinger] All nodes for set akira-macbookpro:27019 are down. This has happened for 2 checks in a row.
      2016-08-16T14:25:20.498+1000 I NETWORK  [mongosMain] getaddrinfo("cfgrs") failed: Name or service not known
      2016-08-16T14:25:20.498+1000 W NETWORK  [mongosMain] No primary detected for set akira-macbookpro:27019
      2016-08-16T14:25:20.498+1000 I NETWORK  [mongosMain] All nodes for set akira-macbookpro:27019 are down. This has happened for 3 checks in a row.
      2016-08-16T14:25:21.002+1000 I NETWORK  [mongosMain] getaddrinfo("cfgrs") failed: Name or service not known
      2016-08-16T14:25:21.002+1000 W NETWORK  [mongosMain] No primary detected for set akira-macbookpro:27019
      

      The line "No primary detected for set akira-macbookpro:27019" is the first one that caught my eye, and I took it to mean something such as the replica set on my config server had not been initialized, or had lost a majority, etc.

      That the "<host>:<port>" string had been misinterpreted as a replicaset name didn't occur to me for quite a while.

      So I request some up-front sanity-checking of the --configdb argument at mongos startup.
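
      A check along these lines would have caught my mistake immediately. The validateConfigdbArg helper below is purely hypothetical (it is not the existing mongos option-parsing code) and assumes that a replica set name may never contain ':' or ',':

      #include <iostream>
      #include <string>

      // Hypothetical helper: return a non-empty error message when a --configdb
      // argument looks like "host:port/setName" instead of the correct
      // "setName/host1:port,host2:port,...".
      std::string validateConfigdbArg(const std::string& arg) {
          const std::string::size_type slash = arg.find('/');
          if (slash == std::string::npos)
              return "configdb must be <replSetName>/<host1:port>,<host2:port>,...";

          const std::string setName = arg.substr(0, slash);
          // A replica set name cannot contain ':' or ',', so either character
          // before the '/' strongly suggests the two parts have been swapped.
          if (setName.find(':') != std::string::npos || setName.find(',') != std::string::npos)
              return "'" + setName + "' before the '/' looks like a host list, not a "
                     "replica set name; did you swap the two parts of --configdb?";

          return "";  // empty string: the argument passed the sanity check
      }

      int main() {
          // The backwards argument from this report gets a pointed error...
          std::cout << validateConfigdbArg("akira-macbookpro:27019/cfgrs") << "\n";
          // ...while the correct form (set name first) passes and prints nothing.
          std::cout << validateConfigdbArg("cfgrs/akira-macbookpro:27019") << "\n";
          return 0;
      }

      Even just a "did you swap the two parts?" hint at startup would have saved me the trip through the log file.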

            Assignee: Andy Schwerin (schwerin@mongodb.com)
            Reporter: Akira Kurogane (akira.kurogane)
            Votes: 0
            Watchers: 7
