[SERVER-35125] null pointer read access violation in SSLHandshakeManager::doServerHandshake Created: 21/May/18 Updated: 29/Oct/23 Resolved: 12/Jul/18 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Security |
| Affects Version/s: | None |
| Fix Version/s: | 4.0.1 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | A. Jesse Jiryu Davis | Assignee: | Mark Benvenuto |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Attachments: |
|
||||||||||||||||||||||||||||||||
| Issue Links: |
|
||||||||||||||||||||||||||||||||
| Backwards Compatibility: | Fully Compatible | ||||||||||||||||||||||||||||||||
| Operating System: | ALL | ||||||||||||||||||||||||||||||||
| Backport Requested: |
v4.0
|
||||||||||||||||||||||||||||||||
| Participants: | |||||||||||||||||||||||||||||||||
| Description |
|
Seen in Evergreen C Driver tests. Standalone MongoDB v3.7.9-343-g6ab1592260 on Windows, the C Driver is built with TLS support from Microsoft's SChannel. It's attempting to send an isMaster command to test some details of the client's async SSL implementation. It crashes the server with:
The log file as attached. The minidump, unfortunately, is lost. The only notable thing about this test is that the client does not do SNI, whereas I believe it normally does with SChannel. |
| Comments |
| Comment by Githook User [ 13/Jul/18 ] | ||||||||||||||
|
Author: {'username': 'markbenvenuto', 'name': 'Mark Benvenuto', 'email': 'mark.benvenuto@mongodb.com'}Message: (cherry picked from commit dbd844703ad91562c9fe921b6911f57c29a3f0ea) | ||||||||||||||
| Comment by Githook User [ 12/Jul/18 ] | ||||||||||||||
|
Author: {'username': 'markbenvenuto', 'name': 'Mark Benvenuto', 'email': 'mark.benvenuto@mongodb.com'}Message: | ||||||||||||||
| Comment by Andrew Morrow (Inactive) [ 12/Jul/18 ] | ||||||||||||||
|
Undocumented but potentially UB causing behavior in a crypto/TLS library feels like something we ought to make an effort to report upstream. | ||||||||||||||
| Comment by Mark Benvenuto [ 12/Jul/18 ] | ||||||||||||||
|
I have no idea if this is a bug in Microsoft's SChannel or a simply unclear documentation. For an unknown reason, Schannel AcceptSecurityContext will leave the SECBUFFER_EXTRA with a pvBuffer as null but a cbBuffer >0. I do not have a deterministic repro so it hard to figure out the criteria to trigger this issue. The C-Driver occasionally fails when doing a full test run so I have no idea how to make a test case. I had to commit mdmp/core dump collection to the C-Driver to help debug this. | ||||||||||||||
| Comment by Mark Benvenuto [ 11/Jul/18 ] | ||||||||||||||
|
As discussed in an earlier comment, the issue is that inputBuffers[1] is set inconsistently. It has pvBuffer == nullptr but cbBuffer == 7. This was confirmed with minidumps from the crashes. In this case, the original input to AcceptSecurityContext is 176 bytes in inputBuffers[0]. When AcceptSecurityContext completes, it leaves the input buffers in the following state:
It is the second one that is confusing since all the data has been consumed by the first input buffer. The fix is to enhance the condition to ignore the SECBUFFER_EXTRA when pvBuffer is nullptr. | ||||||||||||||
| Comment by A. Jesse Jiryu Davis [ 09/Jul/18 ] | ||||||||||||||
|
Here's another: | ||||||||||||||
| Comment by Andrew Morrow (Inactive) [ 09/Jul/18 ] | ||||||||||||||
|
mark.benvenuto - Please note another instance of this in | ||||||||||||||
| Comment by Mark Benvenuto [ 25/May/18 ] | ||||||||||||||
|
I do not understand how this could happen. I have tried to run the failing tests against release and debug builds of mongod with window heap checks enabled via gflags.exe but I have not been able to repro the original issue. If I examine the assembly code, the failing code is this line:
which is called by
This std::copy gets translated into memmove in release builds, and this then failed on the instruction below:
This is the assembly code for moving 7 bytes. This means that the pointer to the extra buffer returned by SChannel was NULL in pvBuffer, but it still set cbBuffer to 7. This does not seem possible. | ||||||||||||||
| Comment by A. Jesse Jiryu Davis [ 24/May/18 ] | ||||||||||||||
|
Right, controlled by environment variables. Set the following env vars:
Passing the "-l" argument to test-libmongoc.exe will select the tests that crashed the server. See CONTRIBUTING.md in the C Driver repo for more info on running tests. Note that we rearranged the repo a couple days ago so the libmongoc source, tests, and certs are now under src/libmongoc/. | ||||||||||||||
| Comment by Mark Benvenuto [ 23/May/18 ] | ||||||||||||||
|
jesse How do I run the test program locally such that I run the tests in the same way? Are these settings controlled via environment variables? I know that the test does not fail reliably but I want to run it locally anyway with more checks, etc. From the test logs, this is a snippet of the relevant settings:
| ||||||||||||||
| Comment by Spencer Jackson [ 22/May/18 ] | ||||||||||||||
|
The crash happens in this code, where the call to append is on 330:
reset was called successfully, which implies that _pExtraEncryptedBuffer points to valid memory. The crash took place in a memcpy, and was caused by a read on NULL. So either the ReusableBuffer that _pExtraEncryptedBuffer points at contained a NULL buffer, or inputBuffers[1].pvBuffer was NULL. ReusableBuffer::append performs two memory copy operations. First it resizes its buffer, with a memcpy_s to populate the new buffer, then uses std::copy to load the new bytes into the new buffer. memcpy_s should execute a callback if either of its arguments are NULL, which may terminate the program. I'm not sure what the behavior of the default callback for our binaries does though. Asking around std::copy should be a no-op on the range from NULL to NULL. The SECBUFFER_EXTRA in inputBuffers was processed by AcceptSecurityContext. Looking at the signature of the function, it claims the SECBUFFER_EXTRA can be emitted in the set of buffers in pOutput. SChannel looks like it's avoiding memory allocations by simply taking the SECBUFFER_TOKEN input buffer, incrementing its pointer by however many bytes it consumed, and changing the type. Perhaps AcceptSecurityContext didn't need to emit a SECBUFFER_EXTRA buffer, so set it to be an inconsistent state? mark.benvenuto when you're back in and have gotten settled, can you take a look at this? |