[SERVER-54752] Version 4.4.4 fails to validate existing certificateKeyFile, refuses to start
Created: 24/Feb/21  Updated: 12/Jul/21  Resolved: 12/Jul/21

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: 4.4.4
Fix Version/s: None

Type: Bug
Priority: Major - P3
Reporter: Vlad Lasky
Assignee: Varun Ravichandran
Resolution: Cannot Reproduce
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Operating System: ALL
Sprint: Security 2021-03-22, Security 2021-04-05, Security 2021-04-19, Security 2021-05-03, Security 2021-05-17, Security 2021-05-31, Security 2021-06-14, Security 2021-06-28, Security 2021-07-12
Participants:

 Description   

Hello, I am running MongoDB under CentOS 7, installed from the official mongodb-org-4.4 yum repo.

I ran yum today and it updated mongodb-org.x86_64 0:4.4.3-1.el7 to mongodb-org.x86_64 0:4.4.4-1.el7.

Mongod then failed to restart. It gave me the following startup error, indicating a problem with the existing certificateKeyFile that I know for certain is valid.

Downgrading MongoDB back to mongodb-org.x86_64 0:4.4.3-1.el7 got things working again.

Here are the error messages from the log:

{"t":{"$date":"2021-02-24T22:03:15.822+11:00"},"s":"I", "c":"CONTROL", "id":20698, "ctx":"main","msg":"***** SERVER RESTARTED *****"}
{"t":{"$date":"2021-02-24T22:03:15.825+11:00"},"s":"E", "c":"NETWORK", "id":23252, "ctx":"main","msg":"Cannot use PEM key file","attr":{"keyFile":"/etc/mongod.pem","error":"error:0B080074:x509 certificate routines:X509_check_private_key:key values mismatch"}}
{"t":{"$date":"2021-02-24T22:03:15.826+11:00"},"s":"F", "c":"CONTROL", "id":20574, "ctx":"main","msg":"Error during global initialization","attr":{"error":{"code":140,"codeName":"InvalidSSLConfiguration","errmsg":"Can not set up PEM key file."}}}
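The `key values mismatch` text in the NETWORK entry is OpenSSL's `X509_check_private_key` reporting that the private key and the certificate it was compared with do not share a public key. One way to test the same pairing outside mongod is sketched below. It assumes the `openssl` CLI is installed and that the PEM file holds an unencrypted private key alongside the certificate; the function name `pem_key_matches` is made up for illustration. It compares the key with the first certificate in the file only, which is not necessarily the certificate mongod pairs with the key.

```shell
#!/bin/sh
# Hypothetical helper: exit 0 when the private key in a combined PEM file
# matches the FIRST certificate in that file, 1 when it does not, and 2 when
# either part cannot be read.
# Assumes the openssl CLI is available and the key has no passphrase.
pem_key_matches() {
    pem_file=$1
    # Public key embedded in the first certificate found in the file.
    cert_pub=$(openssl x509 -in "$pem_file" -noout -pubkey 2>/dev/null) || return 2
    # Public key derived from the private key found in the file.
    key_pub=$(openssl pkey -in "$pem_file" -pubout 2>/dev/null) || return 2
    # Both commands print a PEM "PUBLIC KEY" block; equal text means same key.
    [ "$cert_pub" = "$key_pub" ]
}
```

Calling `pem_key_matches /etc/mongod.pem` and inspecting `$?` gives 0 for a match, 1 for a mismatch, and 2 if the certificate or key could not be parsed.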



 Comments   
Comment by Varun Ravichandran [ 12/Jul/21 ]

I'm closing this ticket as we were unable to reproduce this issue with the information the reporter provided. vlasky@remotelaboratory.com, if the steps outlined in the previous comments still did not resolve your issue, then please reopen the ticket and provide verbose logs and the subject names of the certificates in the chain in your tls.certificateKeyFile. Thanks!

Comment by Varun Ravichandran [ 29/Jun/21 ]

Hi vlasky@remotelaboratory.com,

I'm just following up to see whether you had a chance to try those steps that I outlined above. I'd greatly appreciate it if you could respond with an update! 

Comment by Varun Ravichandran [ 21/May/21 ]

Hi vlasky@remotelaboratory.com,
I apologize for the delay in getting back to you! I’ve taken a look, and I’d like to get some clarification on a few issues.
It appears that the error message in the logs comes from OpenSSL detecting a mismatch between the private key and the public key certificate. Since you have said that this issue has persisted for 2 different certificates, and that they have worked normally on 4.4.3 but not on 4.4.4, I am assuming that the certs/keys have not been corrupted.
I took a look through the changes in our codebase pertaining to our handling of tlsCertificateKeyFile during server standup, and there do not appear to be any major changes there that would logically cause a sudden failure like the one you are describing. I was also unable to reproduce this on a RHEL 7.0 instance, even with the configs you provided.
Based on this, I recommend trying the following steps:
1. Try upgrading the server to the latest version of 4.4 (now 4.4.6) and see if the issue persists. Some of the changes in the latest version of 4.4 have improved logging in the TLS subsystem, which could be useful.

2. Your configuration file, interestingly enough, does not include the security.keyFile option. Starting in 4.4.4, users must provide a path to a keyfile using this option if they have authorization enabled on a replica set (which it appears you do). If you don’t already have that option set, you should create a keyfile and populate that field. See the documentation on internal authentication, and the related ticket for more information about the change itself.

3. If setting that option and upgrading to 4.4.6 still does not resolve your issue, then please enable verbose logging in your config file and provide a full copy of the verbose logs so I can take a deeper dive into those. I would also appreciate it if you could provide the subject names of the certificates that appear in tls.certificateKeyFile, in the order they appear. Feel free to redact the subject name of the leaf (your server’s) certificate if you wish.
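Steps 2 and 3 can be sketched in shell as follows. This is an illustration under stated assumptions, not an official procedure: the helper names `make_keyfile` and `list_cert_subjects` are hypothetical, the `openssl` CLI is assumed to be installed, and the keyfile recipe (`openssl rand -base64 756` plus owner-only read permission) is the one commonly shown in MongoDB's keyfile documentation. The second helper walks the PEM file block by block so that the subjects print in file order.

```shell
#!/bin/sh
# Hypothetical helper for step 2: write a random keyfile for internal
# authentication and restrict it to owner read-only.
make_keyfile() {
    keyfile_path=$1
    openssl rand -base64 756 > "$keyfile_path" || return 1
    chmod 400 "$keyfile_path"
}

# Hypothetical helper for step 3: print the subject of every certificate in a
# PEM file, in the order the certificates appear. Non-certificate blocks
# (such as the private key) are skipped.
list_cert_subjects() {
    in_cert=0
    block=""
    while IFS= read -r line || [ -n "$line" ]; do
        case $line in
            "-----BEGIN CERTIFICATE-----"*)
                in_cert=1
                block=$line
                ;;
            "-----END CERTIFICATE-----"*)
                if [ "$in_cert" -eq 1 ]; then
                    # Hand one complete certificate to openssl at a time.
                    printf '%s\n%s\n' "$block" "$line" | openssl x509 -noout -subject
                fi
                in_cert=0
                block=""
                ;;
            *)
                if [ "$in_cert" -eq 1 ]; then
                    block="$block
$line"
                fi
                ;;
        esac
    done < "$1"
}
```

For example, `list_cert_subjects /etc/mongod.pem` prints one `subject=` line per certificate, leaf first if the file is ordered leaf first. The keyfile path passed to `make_keyfile` is whatever path you then set in security.keyFile, and the file must be readable by the user that runs mongod. For the verbose logs, `systemLog.verbosity` in the config file is the relevant setting.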

I hope these steps are useful, and apologies again for the delay!

Comment by Vlad Lasky [ 31/Mar/21 ]

Hello Varun,

1. These were the configuration options in /etc/mongod.conf:

 

# mongod.conf
# for documentation of all options, see:
# http://docs.mongodb.org/manual/reference/configuration-options/
# where to write logging data.
systemLog:
  destination: file
  logAppend: true
  path: /var/log/mongodb/mongod.log
# Where and how to store data.
storage:
  dbPath: /var/lib/mongo
  journal:
    enabled: true
#  engine:
#  wiredTiger:
# how the process runs
processManagement:
  fork: true  # fork and run in background
  pidFilePath: /var/run/mongodb/mongod.pid  # location of pidfile
  timeZoneInfo: /usr/share/zoneinfo
# network interfaces
net:
  port: 27017
#  bindIp: 127.0.0.1  # Enter 0.0.0.0,:: to bind to all IPv4 and IPv6 addresses or, alternatively, use the net.bindIpAll setting.
  bindIp: localhost,beta.XXXXXXXX.com.au
  unixDomainSocket:
    filePermissions: 0770
  tls:
    mode: requireTLS
    certificateKeyFile: /etc/mongod.pem
#    allowConnectionsWithoutCertificates: true
#    allowInvalidHostnames: true
security:
  authorization: "enabled"
#operationProfiling:
replication:
  oplogSizeMB: 10
  replSetName: rs0
#sharding:
## Enterprise-Only Options
#auditLog:
#snmp:

2. It did not crash, so there was no core dump.

3. I'd prefer not to post the server's identifying details on a public forum. I confirm that I tested two different certificates, including one generated by the Let's Encrypt certbot utility. Both certificates worked in version 4.4.3 under my CentOS 7 environment, but not in version 4.4.4.

Comment by Varun Ravichandran [ 26/Mar/21 ]

Hi vlasky@remotelaboratory.com,
I’m just following up to check whether you saw my previous comment requesting additional information to help me reproduce the problem you encountered. I’d greatly appreciate it if you could respond whenever you get a chance!
Thank you!

Comment by Varun Ravichandran [ 10/Mar/21 ]

Hi vlasky@remotelaboratory.com,

Thank you for filing this ticket! I am taking a look into this, but would like some more information. Would you mind providing the following information?

  • The configuration options with which you started the server
  • A core dump if the server crashed on restart
  • A copy of your public X.509 certificate. If you are uncomfortable sharing this, then output of `openssl x509 -text -in <serverCertificateFile.pem>` would also be useful.
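If the full `-text` dump contains more than you want to share, a narrower summary may be enough for a first look. The sketch below prints only the subject, issuer, and validity dates of the first certificate in a file. `cert_summary` is a hypothetical helper name; the flags are standard `openssl x509` options.

```shell
#!/bin/sh
# Hypothetical helper: print subject, issuer, and validity dates for the first
# certificate in a PEM file, without dumping the whole certificate.
cert_summary() {
    openssl x509 -in "$1" -noout -subject -issuer -dates
}
```

For example, `cert_summary serverCertificateFile.pem` prints a `subject=` line, an `issuer=` line, and the `notBefore=` and `notAfter=` dates, any of which can be redacted before posting.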

Thank you!
