[SERVER-35125] null pointer read access violation in SSLHandshakeManager::doServerHandshake Created: 21/May/18  Updated: 29/Oct/23  Resolved: 12/Jul/18

Status: Closed
Project: Core Server
Component/s: Security
Affects Version/s: None
Fix Version/s: 4.0.1

Type: Bug Priority: Major - P3
Reporter: A. Jesse Jiryu Davis Assignee: Mark Benvenuto
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: Text File mongod.log    
Issue Links:
Backports
Depends
depends on CDRIVER-2658 Save minidump file if mongod crashes ... Closed
Duplicate
is duplicated by SERVER-36012 Crash in doServerHandshake Closed
is duplicated by SERVER-36088 Replica set connection strings trigge... Closed
Problem/Incident
causes SERVER-36088 Replica set connection strings trigge... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v4.0
Participants:

 Description   

Seen in Evergreen C Driver tests. Standalone MongoDB v3.7.9-343-g6ab1592260 on Windows, the C Driver is built with TLS support from Microsoft's SChannel. It's attempting to send an isMaster command to test some details of the client's async SSL implementation. It crashes the server with:

2018-05-21T16:23:21.853+0000 I CONTROL  [conn9] *** unhandled exception (access violation) at 0x000007FEFB9ECBA0, terminating
2018-05-21T16:23:21.853+0000 I CONTROL  [conn9] *** access violation was a read from 0x0
2018-05-21T16:23:21.853+0000 I CONTROL  [conn9] *** stack trace for unhandled exception:
2018-05-21T16:23:22.131+0000 I - [conn9] VCRUNTIME140.dll memcpy+0x336x
mongod.exe ...\src\mongo\util\net\ssl\detail\impl\schannel.ipp(330) asio::ssl::detail::SSLHandshakeManager::doServerHandshake+0x589x
mongod.exe ...\src\mongo\util\net\ssl\detail\impl\schannel.ipp(97) asio::ssl::detail::SSLHandshakeManager::nextHandshake+0x256x
mongod.exe ...\src\mongo\util\net\ssl\detail\impl\engine_schannel.ipp(96) asio::ssl::detail::engine::handshake+0x52x
mongod.exe ...\src\mongo\util\net\ssl\detail\buffered_handshake_op.hpp(63) asio::ssl::detail::buffered_handshake_op<asio::mutable_buffers_1>::process<asio::mutable_buffer const * __ptr64>+0x57x
mongod.exe ...\src\mongo\util\net\ssl\detail\io.hpp(34) asio::ssl::detail::io<asio::basic_stream_socket<asio::generic::stream_protocol>,asio::ssl::detail::buffered_handshake_op<asio::mutable_buffers_1> >+0x96x
mongod.exe ...\src\mongo\transport\session_asio.h(595) <lambda_f6f2e1bffb89ba5b7d46b59118d1310c>::operator()+0x112x
mongod.exe ...\src\mongo\transport\session_asio.h(601) mongo::transport::TransportLayerASIO::ASIOSession::maybeHandshakeSSLForIngress<asio::mutable_buffers_1>+0x486x
mongod.exe ...\src\mongo\transport\session_asio.h(388) <lambda_6ebb069d1bfd5721769cd5bb5f598637>::operator()+0x39x
mongod.exe ...\src\mongo\util\future.h(103) mongo::future_details::callVoidOrStatus<<lambda_6ebb069d1bfd5721769cd5bb5f598637> & __ptr64>+0x31x
mongod.exe ...\src\mongo\util\future.h(124) mongo::future_details::call<<lambda_6ebb069d1bfd5721769cd5bb5f598637> & __ptr64>+0x32x
mongod.exe ...\src\mongo\util\future.h(194) mongo::future_details::throwingCall<<lambda_6ebb069d1bfd5721769cd5bb5f598637> & __ptr64,mongo::future_details::FakeVoid,mongo::future_details::Future<bool>,void>+0x26x
mongod.exe ...\src\mongo\util\future.h(863) <lambda_3a26267831b8c0e29fc9fbe284e77d06>::operator()+0x59x
mongod.exe ...\src\mongo\util\future.h(1067) mongo::future_details::Future<mongo::future_details::FakeVoid>::generalImpl<<lambda_3a26267831b8c0e29fc9fbe284e77d06>,<lambda_5f4bdc0e7c9c03f7610f77bb2780f4ea>,<lambda_4c2f342b8bbe34f98907388364936ebe> >+0x38x
mongod.exe ...\src\mongo\util\future.h(859) mongo::future_details::Future<mongo::future_details::FakeVoid>::then<<lambda_6ebb069d1bfd5721769cd5bb5f598637>,mongo::future_details::Future<bool>,void,bool>+0x44x
mongod.exe ...\src\mongo\util\future.h(1219) mongo::future_details::Future<void>::then<<lambda_6ebb069d1bfd5721769cd5bb5f598637> >+0x14x
mongod.exe ...\src\mongo\transport\session_asio.h(385) mongo::transport::TransportLayerASIO::ASIOSession::read<asio::mutable_buffers_1>+0x256x
mongod.exe ...\src\mongo\transport\session_asio.h(337) mongo::transport::TransportLayerASIO::ASIOSession::sourceMessageImpl+0x190x
mongod.exe ...\src\mongo\transport\session_asio.h(131) mongo::transport::TransportLayerASIO::ASIOSession::sourceMessage+0x85x
mongod.exe ...\src\mongo\transport\service_state_machine.cpp(247) <lambda_a9e5333abbb985a176b6aaeb009d3a08>::operator()+0x84x
mongod.exe ...\src\mongo\transport\service_state_machine.cpp(254) mongo::ServiceStateMachine::_sourceMessage+0x174x
mongod.exe ...\src\mongo\transport\service_state_machine.cpp(436) mongo::ServiceStateMachine::_runNextInGuard+0x242x
mongod.exe ...\src\mongo\transport\service_state_machine.cpp(479) <lambda_b23af5efc3b61ab25bff0c3bcd13382b>::operator()+0x109x
mongod.exe ...\src\mongo\transport\service_executor_synchronous.cpp(133) <lambda_472996f9e6b00ec91d31b43a6cde81f7>::operator()+0x217x
mongod.exe c:\program files (x86)\microsoft visual studio 14.0\vc\include\thr\xthread(247) std::_LaunchPad<std::unique_ptr<std::tuple<std::function<void __cdecl(void)> >,std::default_delete<std::tuple<std::function<void __cdecl(void)> > > > >::_Run+0x133x
mongod.exe c:\program files (x86)\microsoft visual studio 14.0\vc\include\thr\xthread(210) std::_Pad::_Call_func+0x9x
ucrtbase.DLL crt_at_quick_exit+0x125x
kernel32.dll BaseThreadInitThunk+0x13x
2018-05-21T16:23:22.131+0000 I CONTROL [conn9] writing minidump diagnostic file c:\mongodb\bin\mongod.2018-05-21T16-23-22.mdmp

The log file as attached. The minidump, unfortunately, is lost. The only notable thing about this test is that the client does not do SNI, whereas I believe it normally does with SChannel.

https://evergreen.mongodb.com/task/mongo_c_driver_windows_2015_test_latest_server_auth_sspi_winssl_d50598260706785d16469882416599e52d087828_18_05_21_15_59_43



 Comments   
Comment by Githook User [ 13/Jul/18 ]

Author:

{'username': 'markbenvenuto', 'name': 'Mark Benvenuto', 'email': 'mark.benvenuto@mongodb.com'}

Message: SERVER-35125 Add additional SECBUFFER_EXTRA validation

(cherry picked from commit dbd844703ad91562c9fe921b6911f57c29a3f0ea)
Branch: v4.0
https://github.com/mongodb/mongo/commit/876be28eee5230939773091871e24e6cdb599141

Comment by Githook User [ 12/Jul/18 ]

Author:

{'username': 'markbenvenuto', 'name': 'Mark Benvenuto', 'email': 'mark.benvenuto@mongodb.com'}

Message: SERVER-35125 Add additional SECBUFFER_EXTRA validation
Branch: master
https://github.com/mongodb/mongo/commit/dbd844703ad91562c9fe921b6911f57c29a3f0ea

Comment by Andrew Morrow (Inactive) [ 12/Jul/18 ]

Undocumented but potentially UB causing behavior in a crypto/TLS library feels like something we ought to make an effort to report upstream.

Comment by Mark Benvenuto [ 12/Jul/18 ]

I have no idea if this is a bug in Microsoft's SChannel or a simply unclear documentation. For an unknown reason, Schannel AcceptSecurityContext will leave the SECBUFFER_EXTRA with a pvBuffer as null but a cbBuffer >0.  I do not have a deterministic repro so it hard to figure out the criteria to trigger this issue. The C-Driver occasionally fails when doing a full test run so I have no idea how to make a test case. I had to commit mdmp/core dump collection to the C-Driver to help debug this.

Comment by Mark Benvenuto [ 11/Jul/18 ]

As discussed in an earlier comment, the issue is that inputBuffers[1] is set inconsistently. It has pvBuffer == nullptr but cbBuffer == 7. This was confirmed with minidumps from the crashes.

In this case, the original input to AcceptSecurityContext is 176 bytes in inputBuffers[0]. When AcceptSecurityContext completes, it leaves the input buffers in the following state:

0:031> ?? inputBufferDesc.pBuffers[0]
struct _SecBuffer
   +0x000 cbBuffer         : 0xb0
   +0x004 BufferType       : 2
   +0x008 pvBuffer         : 0x00000000`025dd710 Void
0:031> ?? inputBufferDesc.pBuffers[1]
struct _SecBuffer
   +0x000 cbBuffer         : 7
   +0x004 BufferType       : 5
   +0x008 pvBuffer         : (null) 

It is the second one that is confusing since all the data has been consumed by the first input buffer. The fix is to enhance the condition to ignore the SECBUFFER_EXTRA when pvBuffer is nullptr.

Comment by A. Jesse Jiryu Davis [ 09/Jul/18 ]

Here's another:

https://evergreen.mongodb.com/task/mongo_c_driver_windows_2010_32_test_4.0_server_auth_nosasl_winssl_3f9339663b556abcbeeeb616a7420283d794bb70_18_07_09_14_38_20

Comment by Andrew Morrow (Inactive) [ 09/Jul/18 ]

mark.benvenuto - Please note another instance of this in SERVER-36012.

Comment by Mark Benvenuto [ 25/May/18 ]

I do not understand how this could happen. I have tried to run the failing tests against release and debug builds of mongod with window heap checks enabled via gflags.exe but I have not been able to repro the original issue.

If I examine the assembly code, the failing code is this line:

        std::copy(reinterpret_cast<const std::uint8_t*>(data),
                  reinterpret_cast<const std::uint8_t*>(data) + length,
                  _buffer.get() + originalSize);

which is called by

        _pExtraEncryptedBuffer->append(inputBuffers[1].pvBuffer, inputBuffers[1].cbBuffer);

This std::copy gets translated into memmove in release builds, and this then failed on the instruction below:

0:001> u vcruntime140!memcpy+0n336
VCRUNTIME140!MoveSmall+0xb9 [f:\dd\vctools\crt\vcruntime\src\string\amd64\memcpy.asm @ 269]:
000007fe`fb6bcba0 8b0a            mov     ecx,dword ptr [rdx]
000007fe`fb6bcba2 440fb74204      movzx   r8d,word ptr [rdx+4]
000007fe`fb6bcba7 440fb64a06      movzx   r9d,byte ptr [rdx+6]
000007fe`fb6bcbac 8908            mov     dword ptr [rax],ecx
000007fe`fb6bcbae 6644894004      mov     word ptr [rax+4],r8w
000007fe`fb6bcbb3 44884806        mov     byte ptr [rax+6],r9b
000007fe`fb6bcbb7 c3              ret

This is the assembly code for moving 7 bytes. This means that the pointer to the extra buffer returned by SChannel was NULL in pvBuffer, but it still set cbBuffer to 7. This does not seem possible.

Comment by A. Jesse Jiryu Davis [ 24/May/18 ]

Right, controlled by environment variables. Set the following env vars:

export MONGOC_TEST_USER=bob
export MONGOC_TEST_PASSWORD=pwd123
export MONGOC_TEST_SSL_WEAK_CERT_VALIDATION=on
export MONGOC_TEST_SSL_PEM_FILE=src/libmongoc/tests/x509gen/client.pem
export MONGOC_TEST_SSL_CA_FILE=src/libmongoc/tests/x509gen/ca.pem
./src/libmongoc/Debug/test-libmongoc.exe -d -l "/Async/*"

Passing the "-l" argument to test-libmongoc.exe will select the tests that crashed the server.

See CONTRIBUTING.md in the C Driver repo for more info on running tests. Note that we rearranged the repo a couple days ago so the libmongoc source, tests, and certs are now under src/libmongoc/.

Comment by Mark Benvenuto [ 23/May/18 ]

jesse How do I run the test program locally such that I run the tests in the same way? Are these settings controlled via environment variables? I know that the test does not fail reliably but I want to run it locally anyway with more checks, etc.

From the test logs, this is a snippet of the relevant settings:

[2018/05/21 12:22:58.598]   "auth": { "user": "bob", "pass": "pwd123" },
[2018/05/21 12:22:58.598]   "addr": { "host": "localhost", "port": 27017, "uri": "" },
[2018/05/21 12:22:58.598]   "gssapi": { "host": "", "user": "" },
[2018/05/21 12:22:58.598]   "uds": "%2Ftmp%2Fmongodb-27017.sock",
[2018/05/21 12:22:58.598]   "compressors": "",
[2018/05/21 12:22:58.598]   "SSL": {
[2018/05/21 12:22:58.598]     "enabled": true,
[2018/05/21 12:22:58.598]     "weak_cert_validation": true,
[2018/05/21 12:22:58.598]     "pem_file": "src/libmongoc/tests/x509gen/client.pem",
[2018/05/21 12:22:58.598]     "pem_pwd": "",
[2018/05/21 12:22:58.598]     "ca_file": "src/libmongoc/tests/x509gen/ca.pem",
[2018/05/21 12:22:58.598]     "ca_dir": "",
[2018/05/21 12:22:58.598]     "crl_file": ""
[2018/05/21 12:22:58.598]   },

Comment by Spencer Jackson [ 22/May/18 ]

The crash happens in this code, where the call to append is on 330:

    if (inputBuffers[1].BufferType == SECBUFFER_EXTRA) {
        _pExtraEncryptedBuffer->reset();
        _pExtraEncryptedBuffer->append(inputBuffers[1].pvBuffer, inputBuffers[1].cbBuffer);
    }

reset was called successfully, which implies that _pExtraEncryptedBuffer points to valid memory. The crash took place in a memcpy, and was caused by a read on NULL. So either the ReusableBuffer that _pExtraEncryptedBuffer points at contained a NULL buffer, or inputBuffers[1].pvBuffer was NULL. ReusableBuffer::append performs two memory copy operations. First it resizes its buffer, with a memcpy_s to populate the new buffer, then uses std::copy to load the new bytes into the new buffer.

memcpy_s should execute a callback if either of its arguments are NULL, which may terminate the program. I'm not sure what the behavior of the default callback for our binaries does though. Asking around std::copy should be a no-op on the range from NULL to NULL.

The SECBUFFER_EXTRA in inputBuffers was processed by AcceptSecurityContext. Looking at the signature of the function, it claims the SECBUFFER_EXTRA can be emitted in the set of buffers in pOutput. SChannel looks like it's avoiding memory allocations by simply taking the SECBUFFER_TOKEN input buffer, incrementing its pointer by however many bytes it consumed, and changing the type. Perhaps AcceptSecurityContext didn't need to emit a SECBUFFER_EXTRA buffer, so set it to be an inconsistent state?

mark.benvenuto when you're back in and have gotten settled, can you take a look at this?

Generated at Thu Feb 08 04:38:54 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.