  1. Core Server
  2. SERVER-44345

MongoS crash with "BufBuilder attempted to grow()" above 64MB while restarting/upgrading a secondary from 3.4 to 3.6


    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Duplicate
    • Affects Version/s: 3.4.17, 3.6.14
    • Fix Version/s: None
    • Component/s: Upgrade/Downgrade
    • Labels:
      None
    • Operating System:
      ALL
    • Steps To Reproduce:

      (assuming the upgrade was related)
      1. Have a 3.4 cluster.

      2. Upgrade the config servers.

      3. Start upgrading the replica sets.

      4. Eventually, one of the secondary restarts in the replica sets triggers this exception.


      Description

      We were in the middle of a 3.4.17-evg1 to 3.6.14 upgrade when one of the mongos servers crashed.

      The most specific message line is this:

      Assertion: 13548:BufBuilder attempted to grow() to 1751919127 bytes, past the 64MB limit. src/mongo/bson/util/builder.h 326
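
      To put the size in that message in perspective, here is a minimal sketch that parses the assertion line and compares the requested size to the limit. It assumes that "64MB" means 64 * 1024 * 1024 bytes; the regular expression and the helper name parse_grow_assertion are hypothetical and written only for this message format. Under that assumption, the requested 1751919127 bytes is about 1.63 GiB, roughly 26 times the limit.

```python
import re

# The assertion line quoted in this report.
LINE = (
    "Assertion: 13548:BufBuilder attempted to grow() to 1751919127 bytes, "
    "past the 64MB limit. src/mongo/bson/util/builder.h 326"
)

# Assumption: "64MB" in the message means 64 * 1024 * 1024 bytes.
LIMIT_BYTES = 64 * 1024 * 1024

# Hypothetical pattern, written for this report's message format only.
PATTERN = re.compile(
    r"Assertion: (\d+):BufBuilder attempted to grow\(\) to (\d+) bytes"
)


def parse_grow_assertion(line):
    """Return (assertion_code, requested_bytes), or None if the line does not match."""
    match = PATTERN.search(line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


code, requested = parse_grow_assertion(LINE)
ratio = requested / LIMIT_BYTES

print(f"assertion code:  {code}")
print(f"requested bytes: {requested}")
print(f"limit bytes:     {LIMIT_BYTES}")
print(f"requested/limit: {ratio:.1f}x")
```

      A request that far past the limit seems unlikely to be a legitimately sized buffer, though the report itself does not establish where the number came from.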

       

      It happened right after a log line where one of our nodes (in the trace attached I've renamed it to <SHARD9_SECONDARY1>) was just starting to shut down as a part of the upgrade process.

       

      This happened less than an hour before I hit submit on this report, so if there are any transient logs or debug output you want me to provide, let me know!

       

      FYI: in 3.4.17-evg1, the "-evg1" suffix just marks our custom patched build, which carries the 3 logging changes described in the description of this bug: https://jira.mongodb.org/browse/SERVER-43021

       

      Note that a few months ago, our 3.6 cluster (a different, less high-impact cluster that is already at 3.6) had an issue where something tried to write more than 16MB, and it crashed all of our mongoS servers in succession.  That bug is here: https://jira.mongodb.org/browse/SERVER-43021 just in case it's helpful.  We never resolved that bug, but we also never saw the issue again (luckily).

              People

              Assignee:
              daniel.hatcher Danny Hatcher
              Reporter:
              glajchs Scott Glajch
              Votes:
              1
              Watchers:
              12
