2.0.6 server crashed when movechunk failed because a config server was down


    • Type: Bug
    • Resolution: Done
    • Priority: Minor - P4
    • Affects Version/s: 2.0.6
    • Component/s: Internal Code
    • Environment:
      Ubuntu 10.4, mongo 2.0.6. 8 single replica set servers, 3 config servers, multiple mongos

      We have 8 servers, each running as a single replica set. That works for us because we can lose the data at any time and it's okay; we just start over. It's a caching system.

      3 config servers
      8 single replica sets
      multiple mongos (some were 1.8.5, which have since been upgraded to 2.0.7).

      We were moving a config server to another location.
      The move occurred in the middle of a movechunk.
      The movechunk failed.
      One of the data replica set servers crashed because of it.

      Right around the crash, we had lots of these because of the config server that was offline.

      Wed Sep 5 13:52:31 [conn31392] waiting till out of critical section
      Wed Sep 5 13:52:31 [conn31392] waiting till out of critical section
      Wed Sep 5 13:52:31 [conn31392] waiting till out of critical section

      Then

      Wed Sep 5 13:52:37 [conn31375] waiting till out of critical section
      Wed Sep 5 13:52:37 [conn31383] waiting till out of critical section
      Wed Sep 5 13:52:37 [conn27754] ERROR: moveChunk commit failed: version is at32299|1 instead of 32300|1
      Wed Sep 5 13:52:37 [conn27754] ERROR: TERMINATING
      Wed Sep 5 13:52:37 dbexit:
      Wed Sep 5 13:52:37 [conn27754] shutdown: going to close listening sockets...
      Wed Sep 5 13:52:37 [conn27754] closing listening socket: 6
      Wed Sep 5 13:52:37 [conn27754] closing listening socket: 7
      Wed Sep 5 13:52:37 [conn27754] closing listening socket: 9
      Wed Sep 5 13:52:37 [conn27754] removing socket file: /tmp/mongodb-27017.sock
      Wed Sep 5 13:52:37 [conn27754] shutdown: going to flush diaglog...
      Wed Sep 5 13:52:37 [conn27754] shutdown: going to close sockets...
      Wed Sep 5 13:52:37 [conn27754] shutdown: waiting for fs preallocator...
      Wed Sep 5 13:52:37 [conn31369] waiting till out of critical section
      Wed Sep 5 13:52:37 [conn1] end connection 127.0.0.1:54322
      Wed Sep 5 13:52:37 [conn244] end connection 10.5.5.165:40494
      Wed Sep 5 13:52:37 [conn243] end connection 10.5.5.165:40493
      Wed Sep 5 13:52:37 [conn31337] waiting till out of critical section
      Wed Sep 5 13:52:37 [conn31375] waiting till out of critical section
      Wed Sep 5 13:52:37 [conn31362] end connection 10.5.5.121:39824
      Wed Sep 5 13:52:37 [conn31337] waiting till out of critical section
      Wed Sep 5 13:52:37 [initandlisten] now exiting
      Wed Sep 5 13:52:37 dbexit: ; exiting immediately
      Wed Sep 5 13:52:37 [conn30251] end connection 10.5.5.40:54631
      Wed Sep 5 13:52:37 [conn27754] shutdown: lock for final commit...

      ***** SERVER RESTARTED *****

      Wed Sep 5 14:07:45 [initandlisten] MongoDB starting : pid=17294 port=27017 dbpath=/var/lib/mongodb 64-bit host=jeroshard08
      Wed Sep 5 14:07:45 [initandlisten] db version v2.0.6, pdfile version 4.5
      Wed Sep 5 14:07:45 [initandlisten] git version: e1c0cbc25863f6356aa4e31375add7bb49fb05bc
      Wed Sep 5 14:07:45 [initandlisten] build info: Linux ip-10-110-9-236 2.6.21.7-2.ec2.v1.2.fc8xen #1 SMP Fri Nov 20 17:48:28 EST 2009 x86_64 BOOST_LIB_VERSION=1_41
      Wed Sep 5 14:07:45 [initandlisten] options: { config: "/etc/mongodb.conf", dbpath: "/var/lib/mongodb", directoryperdb: "true", journal: "true", logappend: "true", logpath: "/var/log/mongodb/mongodb.log", replSet: "j-h", rest: "true" }

      Wed Sep 5 14:07:45 [initandlisten] journal dir=/var/lib/mongodb/journal
      Wed Sep 5 14:07:45 [initandlisten] recover begin

      Journal recovery then ran and completed without problems.
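
      For anyone triaging a similar log, here is a minimal sketch of how the two relevant message types above could be pulled out of a mongod log. It is not part of MongoDB; the function name `summarize` and the regular expressions are assumptions written only against the lines quoted in this report, so other versions may word these messages differently.

```python
import re
from collections import Counter

# Patterns are assumptions derived from the log excerpt in this report.
# Note the original message has no space between "at" and the version.
COMMIT_FAILED = re.compile(
    r"\[(?P<conn>\w+)\] ERROR: moveChunk commit failed: "
    r"version is at\s*(?P<actual>\d+\|\d+) instead of (?P<expected>\d+\|\d+)"
)
WAITING = re.compile(r"\[(?P<conn>\w+)\] waiting till out of critical section")


def summarize(lines):
    """Return (critical-section waits per connection, list of commit failures)."""
    waits = Counter()
    failures = []
    for line in lines:
        m = WAITING.search(line)
        if m:
            waits[m.group("conn")] += 1
            continue
        m = COMMIT_FAILED.search(line)
        if m:
            failures.append(
                (m.group("conn"), m.group("actual"), m.group("expected"))
            )
    return waits, failures


# Sample lines copied from the excerpt above.
sample = [
    "Wed Sep 5 13:52:31 [conn31392] waiting till out of critical section",
    "Wed Sep 5 13:52:31 [conn31392] waiting till out of critical section",
    "Wed Sep 5 13:52:31 [conn31392] waiting till out of critical section",
    "Wed Sep 5 13:52:37 [conn31375] waiting till out of critical section",
    "Wed Sep 5 13:52:37 [conn27754] ERROR: moveChunk commit failed: "
    "version is at32299|1 instead of 32300|1",
]

waits, failures = summarize(sample)
for conn, actual, expected in failures:
    print(f"{conn}: commit failed, version {actual}, expected {expected}")
print(f"connections waiting on critical section: {len(waits)}")
```

      Running the same function over the full log file (for example with `summarize(open(path))`) would show how many connections were blocked in the critical section before the commit failure.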

            Assignee:
            Spencer Brody (Inactive)
            Reporter:
            Mark N
            Votes:
            0
            Watchers:
            3
