[SERVER-44345] MongoS crash with "BufBuilder attempted to grow()" above 64MB while restarting/upgrading a secondary from 3.4 to 3.6 Created: 31/Oct/19 Updated: 11/Dec/19 Resolved: 05/Nov/19 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Upgrade/Downgrade |
| Affects Version/s: | 3.4.17, 3.6.14 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Scott Glajch | Assignee: | Danny Hatcher (Inactive) |
| Resolution: | Duplicate | Votes: | 1 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Attachments: |
|
| Issue Links: |
|
| Operating System: | ALL | ||||||||
| Steps To Reproduce: | (assuming the upgrade was related) 2. Upgrade the config servers 3. Start upgrading the replica sets 4. Eventually one of the secondary restarts on the replica sets triggers this exception. |
| Participants: | |||||||||
| Description |
|
We were in the middle of a 3.4.17-evg1 to 3.6.14 upgrade when one of the mongos servers crashed. The most specific message line is this: Assertion: 13548:BufBuilder attempted to grow() to 1751919127 bytes, past the 64MB limit. src/mongo/bson/util/builder.h 326
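To put the number in that assertion in perspective, here is a minimal Python sketch that parses the quoted message and compares the requested size with the stated limit. The regular expression and the helper name `parse_bufbuilder_assertion` are illustrative assumptions based only on the single line quoted above, not on a documented log format, and the sketch assumes "MB" means 1024 * 1024 bytes.

```python
import re

# Pattern derived only from the single assertion line quoted in this report;
# other server versions may word the message differently (assumption).
ASSERTION_RE = re.compile(
    r"Assertion: (?P<code>\d+):BufBuilder attempted to grow\(\) to "
    r"(?P<size>\d+) bytes, past the (?P<limit_mb>\d+)MB limit"
)


def parse_bufbuilder_assertion(line):
    """Return (code, requested_bytes, limit_bytes), or None if no match.

    Hypothetical helper for illustration; not part of any MongoDB tooling.
    """
    match = ASSERTION_RE.search(line)
    if match is None:
        return None
    # Assumption: the limit in the message is expressed in units of 1024 * 1024 bytes.
    limit_bytes = int(match.group("limit_mb")) * 1024 * 1024
    return int(match.group("code")), int(match.group("size")), limit_bytes


line = (
    "Assertion: 13548:BufBuilder attempted to grow() to 1751919127 bytes, "
    "past the 64MB limit."
)
code, requested, limit = parse_bufbuilder_assertion(line)
print(f"code={code} requested={requested} limit={limit} ratio={requested / limit:.1f}x")
```

Under those assumptions the requested size is roughly 26 times the limit, far larger than any legitimate single message or document could be.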
It happened right after a log line where one of our nodes (in the trace attached I've renamed it to <SHARD9_SECONDARY1>) was just starting to shut down as a part of the upgrade process.
This happened less than an hour before I hit submit on this report, so if there are any transient logs or debug output you want me to provide, let me know!
FYI: in 3.4.17-evg1, the "-evg1" suffix just denotes our custom patched build, with 3 logging changes described in the description of this bug: https://jira.mongodb.org/browse/SERVER-43021
Note that a few months ago, our 3.6 cluster (a different, less high-impact cluster that is already on 3.6) had an issue where something tried to write more than 16MB, and it crashed all of our mongoS servers in succession. That bug is at https://jira.mongodb.org/browse/SERVER-43021 in case it's helpful. We never resolved that bug, but we also never saw the issue again (luckily). |
| Comments |
| Comment by Githook User [ 11/Dec/19 ] |
|
Author: {'name': 'Billy Donahue', 'email': 'billy.donahue@mongodb.com', 'username': 'BillyDonahue'} Message: (typo: that git commit should have said |
| Comment by Danny Hatcher (Inactive) [ 05/Nov/19 ] |
|
I believe this is being worked on in |
| Comment by Scott Glajch [ 01/Nov/19 ] |
|
This happened again today, with the same-looking stack trace. We are still in the process of restarting our mongod nodes, and again it coincided with one of them coming down for upgrade. |