[SERVER-37565] mongod continuously restarting after upgrade to 4.0.2 Created: 11/Oct/18 Updated: 29/Oct/23 Resolved: 20/Dec/18 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Security |
| Affects Version/s: | 4.0.2, 4.0.3 |
| Fix Version/s: | 4.0.6, 4.1.7 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Andrada Nastasie | Assignee: | Patrick Freed |
| Resolution: | Fixed | Votes: | 1 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Attachments: |
|
| Issue Links: |
|
| Backwards Compatibility: | Fully Compatible |
| Operating System: | ALL |
| Backport Requested: |
v4.0
|
| Steps To Reproduce: |
This is a trace from the mongodb log, maybe it helps:
|
| Sprint: | Security 2018-12-03, Security 2018-12-17, Security 2018-12-31 |
| Participants: |
| Case: | (copied to CRM) |
| Description |
|
Hello, I am trying to upgrade a replica set from version 3.6.7 to 4.0.2. I started with one of the secondaries. Everything was fine for the first 2 minutes after the upgrade, but since then the server has been restarting every 2 or 3 minutes. All the members of this replica set have a lot of connections (~1000), but after the server restarts it gets to ~600-700 connections and then crashes again. I have upgraded a different replica set that doesn't have this many connections and it worked fine, so I think the number of connections is the issue here. |
| Comments |
| Comment by Githook User [ 22/Jan/19 ] |
|
Author: {'email': 'patrick.freed@mongodb.com', 'name': 'Patrick Freed', 'username': 'patrickfreed'} Message: This fixes a bug where the server would crash if a large number of parallel connections occurred at once (cherry picked from commit 916a5553a2db8ae7553fea7c3703ef8fef75b055) |
| Comment by Githook User [ 20/Dec/18 ] |
|
Author: {'username': 'patrickfreed', 'email': 'patrick.freed@mongodb.com', 'name': 'Patrick Freed'} Message: This fixes a bug where the server would crash if a large number of parallel connections occurred at once |
| Comment by Kelsey Schubert [ 14/Nov/18 ] |
|
Hi andrada, Glad to hear the issue was resolved by reconfiguring your ulimits. There were some changes to how secure memory is allocated during authentication in 4.0, which caused you to stumble into this issue. We're still discussing the best approach to remedy this behavior, and we'll continue using this ticket to track this issue until we determine the next steps to improve performance in this space. Thanks, |
| Comment by Andrada Nastasie [ 24/Oct/18 ] |
|
Hi Kelsey, Thank you very much for the recommendation. I did not have the memlock ulimit set appropriately; after I set it correctly, the upgrade worked very well. One question, though: did something change in MongoDB 4.0? MongoDB 3.6 and earlier worked very well with the default ulimits (64 KB). Thank you, |
| Comment by Kelsey Schubert [ 19/Oct/18 ] |
|
Hi andrada, Would you please ensure that your ulimits are set appropriately? In particular, I'd like to confirm that your memlock is unlimited. Please review https://docs.mongodb.com/manual/reference/ulimit/index.html#unix-ulimit-settings for additional details regarding these configurations. Thank you, |
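A minimal sketch of how the memlock limit mentioned above could be inspected, assuming a Linux or other Unix host with Python 3 available. It uses only the standard-library `resource` module; the helper names (`describe_limit`, `memlock_limits`, `memlock_is_unlimited`) are hypothetical and chosen for illustration. The values reported are those of the process running the script, so they match mongod's limits only if the script is started under the same user and service configuration as mongod, which is an assumption here.

```python
import resource


def describe_limit(value):
    """Render an rlimit value (in bytes) as a readable string."""
    if value == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{value // 1024} KB"


def memlock_limits():
    """Return the (soft, hard) RLIMIT_MEMLOCK values for this process."""
    return resource.getrlimit(resource.RLIMIT_MEMLOCK)


def memlock_is_unlimited():
    """True when the soft memlock limit of this process is unlimited."""
    soft, _hard = memlock_limits()
    return soft == resource.RLIM_INFINITY


if __name__ == "__main__":
    soft, hard = memlock_limits()
    print(f"memlock soft limit: {describe_limit(soft)}")
    print(f"memlock hard limit: {describe_limit(hard)}")
    if not memlock_is_unlimited():
        # Illustrative message only; see the ulimit reference linked above
        # for the recommended settings.
        print("memlock is not unlimited for this process")
```

On Linux, the limits of an already running process are also listed in `/proc/<pid>/limits` under "Max locked memory", which avoids guessing what environment the server was started in.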
| Comment by Andrada Nastasie [ 19/Oct/18 ] |
|
Hello, Do you have any news regarding my question? Thank you, |
| Comment by Andrada Nastasie [ 12/Oct/18 ] |
|
Hello again, I wanted to let you know that I just upgraded another replica set with ~200 connections per member and everything went just fine. Thank you, |
| Comment by Andrada Nastasie [ 12/Oct/18 ] |
|
Hello Dan and Ramon, Thank you for your quick answers. I have attached the diagnostic.data, and here is a bigger chunk of the log:
|
| Comment by Ramon Fernandez Marina [ 11/Oct/18 ] |
|
andrada, as Dan points out, we'll need the full logs: the backtrace above is incomplete and I was unable to symbolize it. If you can, please also upload the contents of the diagnostic.data directory for one of the upgraded members, as that may provide additional information. Thanks, |
| Comment by Daniel Pasette (Inactive) [ 11/Oct/18 ] |
|
Can you attach the complete log, or at least include the lines above the backtrace? |