  1. Core Server
  2. SERVER-15265

passes >= maxPasses (capped collection)

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Major - P3
    • Affects Version/s: 2.4.8
    • Component/s: Replication
    • ALL

      I'm unable to reproduce

      We have a 5-node MongoDB replica set. After a network outage, all of our apps flushed their data to the master. The master stayed up and replicated to the other nodes, but at a specific point 3 of the replica set members crashed, and we were unable to recover them from that state. Fortunately, we were able to restore the crashed nodes from the remaining nodes.

      I've attached log files from the master.

      A bit of context on what happened when we started the mongo node up: the journal recovered, but then I/O went through the roof, and when running with -vvvv we saw bgsync not keeping up. The fatal exception always occurs on the same query against the same capped collection, which I assume the node is trying to replay from the master. I confirmed that the other nodes break on the same query. There is nothing special about the query: it is a small blob with very few fields, and the capped collection has many just like it.

      The OS is Ubuntu 10.04.3 LTS. The filesystem is XFS. Memory is 22 GB.

            Assignee:
            Unassigned
            Reporter:
            Duncan Phillips (duncanphillips)
            Votes:
            0
            Watchers:
            5

              Created:
              Updated:
              Resolved: