[CDRIVER-2846] OpenSSL 1.1.1 compatibility Created: 07/Oct/18  Updated: 28/Oct/23  Resolved: 16/Nov/18

Status: Closed
Project: C Driver
Component/s: tls
Affects Version/s: None
Fix Version/s: 1.14.0

Type: New Feature Priority: Major - P3
Reporter: A. Jesse Jiryu Davis Assignee: A. Jesse Jiryu Davis
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
related to CDRIVER-2853 Support TLSv1.3 with LibreSSL Backlog
Epic Link: FY2019Q4 Quick Wins (C and CXX)

 Description   

Our Evergreen Archlinux images just upgraded to OpenSSL 1.1.1, revealing at least 3 issues:

  • Our ASN1_STRING_get0_data config check in CMake is broken and doesn't detect that the new function is available.
  • Our test certs are signed with SHA1. I've heard this is now prohibited but can't find OpenSSL docs that say so; anyway we must regenerate them with SHA256.
  • The mock server tests fail instantly now, even with new certs. TLS handshake succeeds and the client can send isMaster to the mock server, but it gets some error reading the mock server's reply.

Update: The test certs are OK for now. The original report that OpenSSL 1.1.1 bans SHA1 signatures was wrong; it is actually a Debian policy that bans them. I've opened DRIVERS-575 to regenerate them, but that's not the bug here.

Some clues about the mock server test failures. OpenSSL 1.1.1 supports TLS 1.3:

"TLSv1.3 sends more non-application data records after the handshake is finished. At least the session ticket and possibly a key update is send after the finished message. With TLSv1.2 it happened in case of renegotiation. SSL_read() has always documented that it can return SSL_ERROR_WANT_READ after processing non-application data, even when there is still data that can be read. When SSL_MODE_AUTO_RETRY is set using SSL_CTX_set_mode() OpenSSL will try to process the next record, and so not return SSL_ERROR_WANT_READ while it still has data available. Because many applications did not handle this properly, SSL_MODE_AUTO_RETRY has been made the default. If the application is using blocking sockets and SSL_MODE_AUTO_RETRY is enabled, and select() is used to check if a socket is readable this results in SSL_read() processing the non-application data records, but then try to read an application data record which might not be available and hang."

The C Driver isn't hanging with blocking sockets; it's getting an error with non-blocking sockets. Still, the problem must be in this vicinity, because disabling TLS 1.3 in _mongoc_openssl_ctx_new with "SSL_CTX_set_max_proto_version (ctx, TLS1_2_VERSION)" makes the bug disappear.

 



 Comments   
Comment by Githook User [ 16/Nov/18 ]

Author: A. Jesse Jiryu Davis <jesse@mongodb.com> (ajdavis)

Message: CDRIVER-2846 OpenSSL 1.1.1 compatibility

OpenSSL 1.1.1 supports TLSv1.3, and the docs say "TLSv1.3 sends more
non-application data records after the handshake is finished." When the async
loop detects a socket is readable, the available data might be non-application
data and BIO_read returns 0. Update the async code to keep trying to read in
this scenario.

Also fix the CMake config check for ASN1_STRING_get0_data, and avoid
deprecation warnings with OpenSSL 1.1.0+ from the old ASN1_STRING_data.

Finally, if we test build an old OpenSSL locally on a system where 1.1.1 is the
default, we must keep the old OpenSSL out of CMake's own LD path or it will
fail to start.
Branch: master
https://github.com/mongodb/mongo-c-driver/commit/7c416045fce45cbdfc46acc9d0def98294b35549
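The read behavior described in the commit message can be illustrated with a small model that does not use OpenSSL at all. This is only a sketch under stated assumptions, not the driver's actual code: record_t, model_stream_t, model_read, read_once, and read_with_retry are hypothetical names invented for this illustration, and model_read merely stands in for a read that returns 0 after consuming a non-application record. The max_attempts bound is also an assumption, added so that the retry cannot spin forever on a stream that never yields application data.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical record kinds: a stand-in for what a TLS layer sees on the wire. */
typedef enum { REC_NON_APPLICATION, REC_APPLICATION } record_kind_t;

typedef struct {
   record_kind_t kind;
   const char *payload; /* only meaningful for REC_APPLICATION */
} record_t;

/* Hypothetical in-memory "connection": records already received from the peer. */
typedef struct {
   const record_t *records;
   size_t n_records;
   size_t next; /* index of the next unprocessed record */
} model_stream_t;

/* Models one read on a non-blocking stream. Consumes exactly one record.
 * Returns the number of bytes copied for an application record, 0 if the
 * record carried no application data, or -1 if nothing is buffered. */
int
model_read (model_stream_t *s, char *buf, size_t buflen)
{
   const record_t *r;
   size_t len;

   if (s->next >= s->n_records) {
      return -1; /* would block */
   }

   r = &s->records[s->next++];
   if (r->kind == REC_NON_APPLICATION) {
      return 0; /* e.g. a session ticket: processed, but no bytes for the caller */
   }

   len = strlen (r->payload);
   if (len > buflen) {
      len = buflen;
   }
   memcpy (buf, r->payload, len);
   return (int) len;
}

/* Naive behavior: one read per "socket is readable" event, and a
 * zero-byte result is treated as a failure. */
int
read_once (model_stream_t *s, char *buf, size_t buflen)
{
   int n = model_read (s, buf, buflen);
   return n > 0 ? n : -1;
}

/* Retrying behavior: keep reading while the stream reports zero bytes,
 * up to max_attempts (an assumed bound, not taken from the driver). */
int
read_with_retry (model_stream_t *s, char *buf, size_t buflen, int max_attempts)
{
   int i;

   for (i = 0; i < max_attempts; i++) {
      int n = model_read (s, buf, buflen);
      if (n != 0) {
         return n; /* application data, or -1 because nothing is buffered */
      }
   }
   return -1; /* gave up: only non-application records were seen */
}
```

For a stream holding two non-application records followed by one application record with payload "reply", read_once fails on its first call, while read_with_retry with a bound of 8 skips both leading records and returns the 5 bytes of the reply.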

Comment by A. Jesse Jiryu Davis [ 14/Nov/18 ]

... on repeated testing these discrepancies are noise. There's a 100-millisecond timeout within the mock server "accept()" loop; I bet that causes a race that can sometimes add 100 ms to the test duration, but I observe that both with and without my OpenSSL 1.1.1 fix in place. The tests run within the same range of durations with and without this fix.

Comment by A. Jesse Jiryu Davis [ 13/Nov/18 ]

So far, the Ubuntu 14.04 "/Client*" tests are the same speed as the Archlinux TLS v1.3 tests except for:

Test                                      Ubuntu duration   Archlinux duration
/Client/rs_seeds_no_connect/pooled        0.298137          0.397505
/Client/rs_seeds_connect/single           0.299025          0.398155
/Client/mongos_seeds_no_connect/pooled    0.29807           0.39731
/Client/mongos_seeds_connect/pooled       0.299938          0.398329

Comment by A. Jesse Jiryu Davis [ 12/Nov/18 ]

Done - with a server built against OpenSSL 1.1.1 the old driver fails to connect, as expected, and the new one succeeds. I want to compare the runtimes of the "/Client*" tests with and without TLS v1.3 and the patch: this change risks trying too hard to read from a closed or failed connection, and comparing test durations seems like a good place to find new bugs in that area.

Comment by A. Jesse Jiryu Davis [ 07/Nov/18 ]

The code review is rough, but it seems to have approximately the right solution for OpenSSL 1.1.1. So far I think LibreSSL, SChannel, and Secure Transport don't support TLS v1.3 and don't need the analogous fix yet. We don't have a server with TLS v1.3 support available in Evergreen yet, so before I close this ticket I should build one on Arch Linux and test the old and new driver code against it.
