Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-16184

repeated WriteBackListener error on out-of-order upgrade

    • Type: Icon: Bug Bug
    • Resolution: Done
    • Priority: Icon: Major - P3 Major - P3
    • 2.8.0-rc2
    • Affects Version/s: 2.8.0-rc0
    • Component/s: Logging, Sharding
    • Labels:
      None
    • Linux

      When upgrading a cluster out-of-order, old v2.6 mongoses do not handle the lack of a WBL command in v2.8 mongods gracefully. There should be as clear an error as possible with information about how to proceed (maybe a link to the docs?).

      Original Description:
      We have a sharded cluster running mongodb v2.6.4 with authentication. Each of the 7 shards has 3 replicaset members. We added one replicaset member running v2.8-rc0 to 3 of the 7 shards without any problem. These 3 v2.8-rc0 replicaset members (called s571, s572, s573) used wiredTiger storage engine, compression set to zlib.
      Besides the 3 confiservers, we also have 3 routers, each running v2.6.4.

      Out of the sky the routers logged every millisecond the same WriteBackListener error. The logged ObjectId was always the same.
      It begun with one of the v2.8 servers (s571). Round about 20 minutes later another v2.8 server was concerned (s572). Another 7 minutes later, also the third v2.8 server (s576) was logged (see logs below).
      Why different shard members can be logged with the same ObjectId?

      Besides these errors, the whole cluster worked well. One of the v2.8 servers was even elected primary.

      The problem was that all routers wrote so many log lines (multiple GB's) that the disk was saturated after a few hours.

      Since we didn't know yet the source of the problem, we removed all three v2.8 replicaset members from the three shards which stopped immediately the massive logging of the routers.

      In mongodb's documentation we've read that it's possible to have a mixed replicaset consisting of v2.6 and v2.8 replSet members. Is that not true for sharding?

      2014-11-16T21:28:30.393+0100 [conn8766077] end connection 172.16.64.13:44839 (3165 connections now open)
      2014-11-16T21:28:30.415+0100 [conn8766043] end connection 172.16.64.13:44737 (3164 connections now open)
      2014-11-16T21:28:30.457+0100 [mongosMain] connection accepted from 172.16.64.233:55931 #8766473 (3165 connections now open)
      2014-11-16T21:28:30.460+0100 [conn8766473]  authenticate db: offerStore { authenticate: 1, user: "pl-shopImp", nonce: "xxx", key: "xxx" }
      2014-11-16T21:28:30.635+0100 [conn8766058] end connection 172.16.64.13:44771 (3164 connections now open)
      2014-11-16T21:28:30.654+0100 [conn8766030] end connection 172.16.64.13:44703 (3163 connections now open)
      2014-11-16T21:28:30.765+0100 [conn8766032] end connection 172.16.64.13:44710 (3162 connections now open)
      2014-11-16T21:28:30.799+0100 [conn8766054] end connection 172.16.64.13:44756 (3161 connections now open)
      2014-11-16T21:28:30.810+0100 [conn8766025] end connection 172.16.64.13:44678 (3160 connections now open)
      2014-11-16T21:28:30.865+0100 [conn8766039] end connection 172.16.64.13:44731 (3159 connections now open)
      2014-11-16T21:28:30.869+0100 [conn8766067] end connection 172.16.64.13:44800 (3158 connections now open)
      2014-11-16T21:28:30.926+0100 [conn8766028] end connection 172.16.64.13:44691 (3157 connections now open)
      2014-11-16T21:28:31.079+0100 [conn8766056] end connection 172.16.64.13:44765 (3156 connections now open)
      2014-11-16T21:28:31.090+0100 [conn8766064] end connection 172.16.64.13:44789 (3155 connections now open)
      2014-11-16T21:28:31.157+0100 [conn8766034] end connection 172.16.64.13:44716 (3154 connections now open)
      2014-11-16T21:28:31.202+0100 [conn8766026] end connection 172.16.64.13:44684 (3153 connections now open)
      2014-11-16T21:28:31.554+0100 [mongosMain] connection accepted from 172.16.65.106:49683 #8766474 (3154 connections now open)
      2014-11-16T21:28:31.557+0100 [conn8766474]  authenticate db: offerStore { authenticate: 1, user: "pl-shopImp", nonce: "xxx", key: "xxx" }
      2014-11-16T21:28:31.563+0100 [conn8765935] end connection 172.16.64.245:52933 (3153 connections now open)
      2014-11-16T21:28:31.851+0100 [conn8766078] end connection 172.16.64.13:44841 (3152 connections now open)
      2014-11-16T21:28:31.922+0100 [WriteBackListener-s571:27017] writebacklisten command failed!  { ok: 0.0, errmsg: "no such cmd: writebacklisten", code: 59, bad cmd: { writebacklisten: ObjectId('5436835a68ed2a107cf9b3cf') } }
      2014-11-16T21:28:31.922+0100 [WriteBackListener-s571:27017] writebacklisten command failed!  { ok: 0.0, errmsg: "no such cmd: writebacklisten", code: 59, bad cmd: { writebacklisten: ObjectId('5436835a68ed2a107cf9b3cf') } }
      2014-11-16T21:28:31.923+0100 [WriteBackListener-s571:27017] writebacklisten command failed!  { ok: 0.0, errmsg: "no such cmd: writebacklisten", code: 59, bad cmd: { writebacklisten: ObjectId('5436835a68ed2a107cf9b3cf') } }
      2014-11-16T21:28:31.924+0100 [WriteBackListener-s571:27017] writebacklisten command failed!  { ok: 0.0, errmsg: "no such cmd: writebacklisten", code: 59, bad cmd: { writebacklisten: ObjectId('5436835a68ed2a107cf9b3cf') } }
      2014-11-16T21:28:31.924+0100 [WriteBackListener-s571:27017] writebacklisten command failed!  { ok: 0.0, errmsg: "no such cmd: writebacklisten", code: 59, bad cmd: { writebacklisten: ObjectId('5436835a68ed2a107cf9b3cf') } }
      2014-11-16T21:28:31.925+0100 [WriteBackListener-s571:27017] writebacklisten command failed!  { ok: 0.0, errmsg: "no such cmd: writebacklisten", code: 59, bad cmd: { writebacklisten: ObjectId('5436835a68ed2a107cf9b3cf') } }
      2014-11-16T21:28:31.926+0100 [WriteBackListener-s571:27017] writebacklisten command failed!  { ok: 0.0, errmsg: "no such cmd: writebacklisten", code: 59, bad cmd: { writebacklisten: ObjectId('5436835a68ed2a107cf9b3cf') } }
      2014-11-16T21:28:31.926+0100 [WriteBackListener-s571:27017] writebacklisten command failed!  { ok: 0.0, errmsg: "no such cmd: writebacklisten", code: 59, bad cmd: { writebacklisten: ObjectId('5436835a68ed2a107cf9b3cf') } }
      ...
      2014-11-16T21:47:07.477+0100 [WriteBackListener-s571:27017] writebacklisten command failed!  { ok: 0.0, errmsg: "no such cmd: writebacklisten", code: 59, bad cmd: { writebacklisten: ObjectId('5436835a68ed2a107cf9b3cf') } }
      2014-11-16T21:47:07.478+0100 [WriteBackListener-s571:27017] writebacklisten command failed!  { ok: 0.0, errmsg: "no such cmd: writebacklisten", code: 59, bad cmd: { writebacklisten: ObjectId('5436835a68ed2a107cf9b3cf') } }
      2014-11-16T21:47:07.479+0100 [WriteBackListener-s571:27017] writebacklisten command failed!  { ok: 0.0, errmsg: "no such cmd: writebacklisten", code: 59, bad cmd: { writebacklisten: ObjectId('5436835a68ed2a107cf9b3cf') } }
      2014-11-16T21:47:07.480+0100 [WriteBackListener-s571:27017] writebacklisten command failed!  { ok: 0.0, errmsg: "no such cmd: writebacklisten", code: 59, bad cmd: { writebacklisten: ObjectId('5436835a68ed2a107cf9b3cf') } }
      2014-11-16T21:47:07.480+0100 [WriteBackListener-s571:27017] writebacklisten command failed!  { ok: 0.0, errmsg: "no such cmd: writebacklisten", code: 59, bad cmd: { writebacklisten: ObjectId('5436835a68ed2a107cf9b3cf') } }
      2014-11-16T21:47:07.481+0100 [WriteBackListener-s572:27017] writebacklisten command failed!  { ok: 0.0, errmsg: "no such cmd: writebacklisten", code: 59, bad cmd: { writebacklisten: ObjectId('5436835a68ed2a107cf9b3cf') } }
      2014-11-16T21:47:07.481+0100 [WriteBackListener-s571:27017] writebacklisten command failed!  { ok: 0.0, errmsg: "no such cmd: writebacklisten", code: 59, bad cmd: { writebacklisten: ObjectId('5436835a68ed2a107cf9b3cf') } }
      2014-11-16T21:47:07.481+0100 [WriteBackListener-s572:27017] writebacklisten command failed!  { ok: 0.0, errmsg: "no such cmd: writebacklisten", code: 59, bad cmd: { writebacklisten: ObjectId('5436835a68ed2a107cf9b3cf') } }
      2014-11-16T21:47:07.482+0100 [WriteBackListener-s572:27017] writebacklisten command failed!  { ok: 0.0, errmsg: "no such cmd: writebacklisten", code: 59, bad cmd: { writebacklisten: ObjectId('5436835a68ed2a107cf9b3cf') } }
      2014-11-16T21:47:07.482+0100 [WriteBackListener-s572:27017] writebacklisten command failed!  { ok: 0.0, errmsg: "no such cmd: writebacklisten", code: 59, bad cmd: { writebacklisten: ObjectId('5436835a68ed2a107cf9b3cf') } }
      2014-11-16T21:47:07.482+0100 [WriteBackListener-s572:27017] writebacklisten command failed!  { ok: 0.0, errmsg: "no such cmd: writebacklisten", code: 59, bad cmd: { writebacklisten: ObjectId('5436835a68ed2a107cf9b3cf') } }
      2014-11-16T21:47:07.483+0100 [WriteBackListener-s571:27017] writebacklisten command failed!  { ok: 0.0, errmsg: "no such cmd: writebacklisten", code: 59, bad cmd: { writebacklisten: ObjectId('5436835a68ed2a107cf9b3cf') } }
      ...
      2014-11-16T21:54:18.923+0100 [WriteBackListener-s573:27017] writebacklisten command failed!  { ok: 0.0, errmsg: "no such cmd: writebacklisten", code: 59, bad cmd: { writebacklisten: ObjectId('5436835a68ed2a107cf9b3cf') } }
      2014-11-16T21:54:18.923+0100 [WriteBackListener-s572:27017] writebacklisten command failed!  { ok: 0.0, errmsg: "no such cmd: writebacklisten", code: 59, bad cmd: { writebacklisten: ObjectId('5436835a68ed2a107cf9b3cf') } }
      2014-11-16T21:54:18.923+0100 [WriteBackListener-s571:27017] writebacklisten command failed!  { ok: 0.0, errmsg: "no such cmd: writebacklisten", code: 59, bad cmd: { writebacklisten: ObjectId('5436835a68ed2a107cf9b3cf') } }
      2014-11-16T21:54:18.923+0100 [WriteBackListener-s573:27017] writebacklisten command failed!  { ok: 0.0, errmsg: "no such cmd: writebacklisten", code: 59, bad cmd: { writebacklisten: ObjectId('5436835a68ed2a107cf9b3cf') } }
      

            Assignee:
            randolph@mongodb.com Randolph Tan
            Reporter:
            kay.agahd@idealo.de Kay Agahd
            Votes:
            0 Vote for this issue
            Watchers:
            7 Start watching this issue

              Created:
              Updated:
              Resolved: