Core Server / SERVER-15499

Confusion with error message "Assertion: 13110:HostAndPort: host is empty" when all data bearing nodes in shard are down

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor - P4
    • Resolution: Cannot Reproduce
    • Affects Version/s: 2.4.10
    • Fix Version/s: None
    • Component/s: Sharding
    • Labels:
      None
    • Environment:
      OS: Linux ip 3.2.0-4-amd64 #1 SMP Debian 3.2.60-1+deb7u3 x86_64 GNU/Linux
      MongoDB: 2.4.10
      EC2 m3.medium for all MongoDB instances except arbiter services, which ran on t1.micro
    • Operating System:
      Linux
    • Steps To Reproduce:
      Set up 2 shards across EC2 N. Virginia (US-East) and Ireland (EU-West):

      EU-West: 1 primary and 1 arbiter (replica set default-s1), 1 secondary (replica set foo-s1)
      EU-West also ran 1 config server and 1 mongos
      US-East: 1 primary and 1 arbiter (replica set foo-s1), 1 secondary (replica set default-s1)
      MongoDB: 2.4.10
      OS: Linux ip 3.2.0-4-amd64 #1 SMP Debian 3.2.60-1+deb7u3 x86_64 GNU/Linux

      In EU-West I killed the following with SIGTERM:
      the primary of replica set default-s1
      In US-East I killed the following with SIGTERM:
      the secondary of replica set default-s1
      With both down, only the arbiter of default-s1 is still running.

      Then, on the mongos, enter the following in the mongo shell:

      mongos> db.test.insert({"name":"frank"})
      HostAndPort: host is empty
      mongos> db.test.insert({"name":"frank"})
      socket exception [CONNECT_ERROR] for default-s1/ec2-54-242-126-41.compute-1.amazonaws.com:27017,ec2-54-74-94-63.eu-west-1.compute.amazonaws.com:27017

      Description

      Hi All,

      There is some confusion with the error message:

      ] Assertion: 13110:HostAndPort: host is empty
      0xa8d781 0xa5385b 0x69310f 0x6ad4b4 0x69b911 0x9f93c4 0x9f27b5 0x9f584a 0x9a743a 0x9c939f 0x9cbf19 0x99a9c1 0x666844 0xa79d4e 0x7fd53eac4b50 0x7fd53de67e6d
       mongos(_ZN5mongo15printStackTraceERSo+0x21) [0xa8d781]
       mongos(_ZN5mongo11msgassertedEiPKc+0x9b) [0xa5385b]
       mongos(_ZN5mongo16ConnectionString12_fillServersESs+0x3df) [0x69310f]
       mongos(_ZN5mongo16ConnectionStringC1ENS0_14ConnectionTypeERKSsS3_+0x84) [0x6ad4b4]
       mongos(_ZN5mongo16ConnectionString5parseERKSsRSs+0x2b1) [0x69b911]
       mongos(_ZN5mongo17WriteBackListener4initERNS_12DBClientBaseE+0x1e4) [0x9f93c4]
       mongos(_ZN5mongo17checkShardVersionEPNS_12DBClientBaseERKSsN5boost10shared_ptrIKNS_12ChunkManagerEEEbi+0x45) [0x9f27b5]
       mongos(_ZN5mongo14VersionManager19checkShardVersionCBEPNS_15ShardConnectionEbi+0x6a) [0x9f584a]
       mongos(_ZN5mongo15ShardConnection11_finishInitEv+0xfa) [0x9a743a]
       mongos(_ZN5mongo13ShardStrategy7_insertERKSsRNS_9DbMessageEiRNS_7RequestE+0x45f) [0x9c939f]
       mongos(_ZN5mongo13ShardStrategy7writeOpEiRNS_7RequestE+0x509) [0x9cbf19]
       mongos(_ZN5mongo7Request7processEi+0xd1) [0x99a9c1]
       mongos(_ZN5mongo21ShardedMessageHandler7processERNS_7MessageEPNS_21AbstractMessagingPortEPNS_9LastErrorE+0x74) [0x666844]
       mongos(_ZN5mongo17PortMessageServer17handleIncomingMsgEPv+0x42e) [0xa79d4e]
       /lib/x86_64-linux-gnu/libpthread.so.0(+0x6b50) [0x7fd53eac4b50]
       /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d) [0x7fd53de67e6d]
      

      When all of the data-bearing nodes within a specific shard are down, this assertion message can cause confusion, since it says nothing about the nodes being unavailable. Would it be possible to change the error message from "HostAndPort: host is empty" to something more descriptive for cases where all data-bearing nodes in a replica set are down or unavailable?
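
      To illustrate the kind of message I have in mind, here is a minimal sketch in JavaScript. It is not the mongos code: the function name parseReplicaSetString, the return shape, and the error text are all hypothetical, and it assumes the assertion fires because the host list handed to the connection-string parser is empty.

```javascript
// Hypothetical sketch, NOT the mongos implementation. The function name,
// return shape, and error text are illustrative assumptions.
// Parses a replica-set connection string of the form
// "setName/host1:port,host2:port".
function parseReplicaSetString(connString) {
  const slash = connString.indexOf("/");
  const setName = slash === -1 ? "" : connString.slice(0, slash);
  const hostList = slash === -1 ? connString : connString.slice(slash + 1);

  // Drop empty entries, e.g. from "default-s1/" or "a:1,,b:2".
  const hosts = hostList
    .split(",")
    .map((h) => h.trim())
    .filter((h) => h.length > 0);

  if (hosts.length === 0) {
    // Name the replica set and the likely cause instead of reporting
    // only that a host string was empty.
    throw new Error(
      "no hosts available for replica set '" + setName +
      "'; all data-bearing members may be down or unreachable"
    );
  }
  return { setName: setName, hosts: hosts };
}
```

      With this sketch, parseReplicaSetString("default-s1/") throws an error that names default-s1 and hints at the cause, rather than only stating that a host is empty.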

      Thanks!
      Eoin

              People

              Assignee: backlog-server-sharding (Backlog - Sharding Team)
              Reporter: eoin.brazil (Eoin Brazil)
              Votes: 0
              Watchers: 4
