[SERVER-30643] Performance regression with SSL Created: 14/Aug/17 Updated: 30/Oct/23 Resolved: 22/Aug/17 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Networking, Security |
| Affects Version/s: | 3.4.4 |
| Fix Version/s: | 3.4.9, 3.5.13 |
| Type: | Bug | Priority: | Critical - P2 |
| Reporter: | Bruce Lucas (Inactive) | Assignee: | Spencer Jackson |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | 3.4-SSL |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: | |
| Backwards Compatibility: | Fully Compatible |
| Operating System: | ALL |
| Backport Requested: | v3.4 |
| Sprint: | Platforms 2017-08-21, Platforms 2017-09-11 |
| Participants: | |
| Case: | (copied to CRM) |
| Linked BF Score: | 0 |
| Description |
|
Create a collection with a single 1 kB document, then start 1000 threads, each doing a findOne() on that collection. The observed query rate is ~55 k/s in 3.4.3 and drops to ~35 k/s in 3.4.4. |
| Comments |
| Comment by Githook User [ 22/Aug/17 ] |
|
Author: Spencer Jackson (spencerjackson, spencer.jackson@mongodb.com)
Message: (cherry picked from commit d40503d57d9fd27d1ad59e4d74aa827191d77564) |
| Comment by Githook User [ 22/Aug/17 ] |
|
Author: Spencer Jackson (spencerjackson, spencer.jackson@mongodb.com)
Message: |
| Comment by Bruce Lucas (Inactive) [ 16/Aug/17 ] |
|
The following change, partially reverting
|
| Comment by Spencer Jackson [ 16/Aug/17 ] |
|
SSL_read is expected to fail with SSL_ERROR_WANT_READ (as reported by SSL_get_error) on basically every read. When we see it, we have to recv from the socket and write the contents into our network BIO. When we call SSL_read again, it pulls that data out of the other half of that BIO. It may additionally report SSL_ERROR_WANT_WRITE, which forces us to send TLS overhead back to the client. Manually stepping across all calls to ERR_get_state during a single query for a document like the one in the description doesn't reveal any behavioral differences between 3.4.3 and 3.4.4. |
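The want-read cycle described above can be sketched with Python's standard `ssl` module, which exposes the same OpenSSL mechanism through in-memory BIOs. This is an illustration only, not the server's C++ code: the names `incoming`, `outgoing`, and `step` are made up for this sketch, and it uses a client-side handshake rather than SSL_read so that it needs no certificate or socket.

```python
import ssl

# Client-side context; certificate checks are disabled only because this
# sketch never talks to a real peer.
ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
ctx.check_hostname = False
ctx.verify_mode = ssl.CERT_NONE

incoming = ssl.MemoryBIO()  # bytes recv'd from the socket would be written here
outgoing = ssl.MemoryBIO()  # bytes the TLS engine wants sent are read from here
tls = ctx.wrap_bio(incoming, outgoing, server_side=False)


def step(operation):
    """Run one TLS operation and report what the engine is asking for."""
    try:
        operation()
        return "done"
    except ssl.SSLWantReadError:   # Python's view of SSL_ERROR_WANT_READ
        return "want_read"
    except ssl.SSLWantWriteError:  # Python's view of SSL_ERROR_WANT_WRITE
        return "want_write"


# Nothing has arrived from the peer yet, so the engine queues its own
# handshake bytes in `outgoing` and then asks for more input.
state = step(tls.do_handshake)
to_send = outgoing.read()  # bytes we would now send() on the socket
print(state, len(to_send) > 0)
```

In a real loop, a "want_read" result would be followed by writing whatever recv returned into `incoming` and calling the same operation again, which is why the want-read result is routine rather than an actual error.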
| Comment by Bruce Lucas (Inactive) [ 14/Aug/17 ] |
|
In 3.4.3 most of the threads are waiting here for the next request from a client:
Whereas in 3.4.4 most are waiting here:
Since there are no actual SSL errors logged, does this mean that for some reason SSL_read in 3.4.4 is returning SSL_ERROR_WANT_READ frequently, resulting in more calls to SSL_get_error and creating a bottleneck? |