  1. Core Server
  2. SERVER-15879

Better error reporting when the mongos configdb string does not match the cached one



    • Type: Improvement
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Won't Fix
    • Affects Version/s: 2.6.0
    • Fix Version/s: None
    • Component/s: Diagnostics, Sharding
    • Labels:
    • Backwards Compatibility:
      Fully Compatible
    • Sprint:
      Sharding 2 04/24/15, Sharding 3 05/15/15, Sharding 4 06/05/15, Sharding 5 06/26/16, Sharding 6 07/17/15, Sharding 7 08/10/15, Sharding 8 08/28/15, Sharding 9 (09/18/15), Sharding A (10/09/15), Sharding B (10/30/15)


      When the config db string specified by mongos does not match the one cached in the config db, no errors or warnings appear in the logs at the default log level (in MongoDB 2.6 onwards). As a result, a partially impaired cluster goes undetected until a metadata write operation starts failing on the cluster. The following log snippet, taken at the default log level, shows a very vague error that can point the investigation in the wrong direction:

      21T22:00:21.460+0000 [conn2] addshard request { addShard: "testshard/examplehost" } failed: could not verify config servers were active and reachable before write
      2014-10-21T22:02:07.275+0000 [conn2] end connection (0 connections now open)
      2014-10-21T22:40:05.079+0000 [mongosMain] connection accepted from #3 (1 connection now open)

      Even when the impact is noticed, its cause is not clear. The cause only becomes visible when a higher log level is enabled for mongos. The following logs were captured with a higher log level on mongos (logLevel = 2):

      2014-10-25T04:46:59.871+0000 [Balancer] calling onCreate auth for host1.cluster1.example.com:27019 (
      2014-10-25T04:46:59.872+0000 [Balancer] initializing shard connection to host1.cluster1.example.com:27019 (
      2014-10-25T04:46:59.872+0000 [Balancer] initial sharding settings : { setShardVersion: "", init: true, configdb: "host1.cluster1.example.com:27019,host2.cluster1.example.com:27019,host3.cluster1.example.com:27019", serverID: ObjectId('544a936356680b7002b9a101'), authoritative: true }
      2014-10-25T04:46:59.872+0000 [Balancer] initial sharding result : { configdb: { stored: "cfg1.tmp.example.com:27019,cfg2.tmp.example.com:27019,cfg3.tmp.example.com:27019", given: "host1.cluster1.example.com:27019,host2.cluster1.example.com:27019,host3.cluster1.example.com:27019" }, ok: 0.0, errmsg: "mongos specified a different config database string : stored : cfg1.tmp.example.com:27019,cfg2.tmp.example.com:27019,cfg3.tmp.example.com:27019 vs given ..." }
      2014-10-25T04:46:59.872+0000 [Balancer] User Assertion: 15907:could not initialize sharding on connection host1.cluster1.example.com:27019 ( :: caused by :: mongos specified a different config database string : stored : cfg1.tmp.example.com:27019,cfg2.tmp.example.com:27019,cfg3.tmp.example.com:27019 vs given : host1.cluster1.example.com:27019,host2.cluster1.example.com:27019,host3.cluster1.example.com:27019
      2014-10-25T04:46:59.872+0000 [Balancer] creating new connection to:host2.cluster1.example.com:27019
      2014-10-25T04:46:59.872+0000 [ConnectBG] BackgroundJob starting: ConnectBG
      2014-10-25T04:46:59.873+0000 [Balancer] connected to server host2.cluster1.example.com:27019 (
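
      Until the message is surfaced at the default log level, one way to spot the problem is to scan verbose mongos logs for the assertion text. The following is a minimal sketch in Python, assuming the message format matches the "User Assertion" line in the logLevel 2 output above; the function name `find_configdb_mismatches` is hypothetical and not part of any MongoDB tooling.

```python
import re

# Pattern derived from the assertion text in the logLevel 2 output above.
# It assumes both config strings contain no whitespace.
MISMATCH_RE = re.compile(
    r"mongos specified a different config database string : "
    r"stored : (?P<stored>\S+) vs given : (?P<given>\S+)"
)


def find_configdb_mismatches(lines):
    """Yield (stored, given) for each log line reporting a configdb mismatch."""
    for line in lines:
        match = MISMATCH_RE.search(line)
        if match:
            yield match.group("stored"), match.group("given")
```

      Lines where the message is truncated (such as the "initial sharding result" line, which ends in "vs given ...") are deliberately not matched, so each mismatch is reported from the line that carries both full strings.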

      Since this impacts the operation of the cluster, the error / assertion should be logged at the default log level.
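
      To illustrate the requested behavior, here is a small Python sketch of a check that reports a mismatch as a warning rather than as verbose-only output. It is not MongoDB's implementation: the function `configdb_matches` and the logger name are hypothetical, and the plain string comparison is an assumption, since the report does not say how the server compares the two strings.

```python
import logging

# Hypothetical logger name, used only for this illustration.
logger = logging.getLogger("configdb_check")


def configdb_matches(stored: str, given: str) -> bool:
    """Return True if the strings are equal; otherwise warn and return False."""
    # Assumption: exact string equality, with no normalization of host order.
    if stored == given:
        return True
    # WARNING passes Python's default logging threshold, so the message
    # is visible without raising verbosity.
    logger.warning(
        "mongos specified a different config database string: "
        "stored: %s vs given: %s",
        stored,
        given,
    )
    return False
```

      With a check of this kind, the mismatch would be reported as soon as it is detected instead of only surfacing later as a failed metadata write.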





              kevin.pulo Kevin Pulo
              anil.kumar Anil Kumar