[CDRIVER-3965] Version 1.17.5 adds a 4 seconds delay on each query Created: 26/Apr/21 Updated: 27/Oct/23 Resolved: 07/May/21 |
|
| Status: | Closed |
| Project: | C Driver |
| Component/s: | libmongoc |
| Affects Version/s: | 1.17.5 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Unknown |
| Reporter: | Olivier Bertrand | Assignee: | Kevin Albertson |
| Resolution: | Works as Designed | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Environment: |
Windows 10 64 bits |
||
| Issue Links: |
|
||||||||
| Description |
|
Since I switch to version the MongoDB C driver 1.17.5 (I was using 1.9) all use of this driver, that used to run in a few milliseconds, now takes more than 4 seconds to achieve. |
| Comments |
| Comment by Kevin Albertson [ 07/May/21 ] | ||||||||||||||||||||||||||||||||
|
Great, I am glad to hear bertrandop@free.fr! | ||||||||||||||||||||||||||||||||
| Comment by Olivier Bertrand [ 07/May/21 ] | ||||||||||||||||||||||||||||||||
|
Yes, I did it and there is no more delay, thanks. | ||||||||||||||||||||||||||||||||
| Comment by Kevin Albertson [ 07/May/21 ] | ||||||||||||||||||||||||||||||||
|
Thank you for the additional information.
My apologies, I intended to ask for "tlsDisableCertificateRevocationCheck=true". If it is easy to check, trying with this option will help rule out whether OCSP is involved in the additional delay. Though given that you are observing a .5 second delay now, I suspect OCSP is unrelated. For reference, the URI options related to TLS and OCSP are documented in Configuring TLS.
That is good to know. I agree the 4 second delay seems caused by CDRIVER-3639. I have another hypothesis for the observed .5 second delay. Is your application creating and destroying a client pool (mongoc_client_pool_t) on each request that is handling a translation from SQL? Based on the snippet that you provided, that may be the case. If so, this may be related to changes in multi-threaded monitoring ( The Connection Pooling has additional details describing the behavior differences of pooled and single-threaded modes. But typical use of a mongoc_client_pool_t is to keep the pool alive for the duration of a long lived process. The discovered state of MongoDB servers can be reused on each pooled client and monitoring can continue in background threads without blocking operations. However, if your application is handling short lived requests and creating and destroying a mongoc_client_pool_t on each request, it may be preferable to use a single-threaded mongoc_client_t. That would result in less resources used (no background monitoring threads) and may avoid the .5 second delay on shutdown. The .5 second delay in pooled monitoring has been an issue for drivers wrapping the C driver testing: As a workaround, would it be possible to try replacing the mongoc_client_pool_t with a single-threaded mongoc_client_t? | ||||||||||||||||||||||||||||||||
| Comment by Olivier Bertrand [ 29/Apr/21 ] | ||||||||||||||||||||||||||||||||
|
Starting the server by "mongod --ipv6" does not remove the delay but reduces it from 4.50 sec to 0.50 sec. So it seems that there are 2 delays, a 4 sec one for ipv6 and a 0.50 sec one for another cause. Of course, none of them for older C drivers (last used 1.9.2) BTW, adding the option "tlsDisableOCSPEndpointCheck=true" to the URI make all the queries fail:
| ||||||||||||||||||||||||||||||||
| Comment by Olivier Bertrand [ 29/Apr/21 ] | ||||||||||||||||||||||||||||||||
|
I am doing my testing using a local MongoDB server started by executing "mongod" with no parameters. The URI used by my program is simply "mongodb://localhost:27017". It was Ok with older releases of the C driver. Could you elaborate and show what I should try? | ||||||||||||||||||||||||||||||||
| Comment by Olivier Bertrand [ 28/Apr/21 ] | ||||||||||||||||||||||||||||||||
|
Indeed, only the first one has a delay. Subsequent operations have no delay. | ||||||||||||||||||||||||||||||||
| Comment by Kevin Albertson [ 28/Apr/21 ] | ||||||||||||||||||||||||||||||||
|
Thank you for the additional information bertrandop@free.fr. I have two theories: 1. This could be related to failed connection attempts on hostnames resolving to multiple IPs. CDRIVER-3639 describes a 1-2 second delay observed on Windows when the hostname resolved to both an IPv6 and IPv4 address, but the server was only listening on IPv4. To determine if that is the cause, can you try specifying an IP literal in the URI to connect to and see if that resolves the delay? 2. This could be related to failed OCSP validation of server certificates if the client is taking a long time to reach an OCSP server. CDRIVER-3788 describes timeouts causing delays on each first connection. Can you try including tlsDisableOCSPEndpointCheck=true to bypass OCSP checks to see if that resolves the delay?
To check, do operations after the first one still have a delay? I would imagine if CDRIVER-3639 or CDRIVER-3788 is the cause, subsequent operations after a connection is established would no longer have a delay. In practice, this would be the first operation on a server per mongoc_client_t. | ||||||||||||||||||||||||||||||||
| Comment by Olivier Bertrand [ 26/Apr/21 ] | ||||||||||||||||||||||||||||||||
|
On another query where I do not execute the mongo collection count function, I continue by:
and the delay occurs on the first execution of mongoc_cursor_next. | ||||||||||||||||||||||||||||||||
| Comment by Olivier Bertrand [ 26/Apr/21 ] | ||||||||||||||||||||||||||||||||
|
By the way, I had in the past installed versions 1.6.2, 1.6.3, 1.7.0 and 1.9.2 from tarball with no problem. But when downloading version 1.17.5, after extracting and trying to follow what it described in the site explaining how to install, all try to generate the visual studio with cmake failed. Unfortunately I didn't keep a log of all these. Finally, I cloned it from the git site and then the cmake process and compilation by Visual Studio 2019 succeeded. The 4 second delay I am describing occurs as well with the Debug, Release or Relwithdebinfo versions. | ||||||||||||||||||||||||||||||||
| Comment by Olivier Bertrand [ 26/Apr/21 ] | ||||||||||||||||||||||||||||||||
|
I'll try to do that but in fact I'm using this driver in a rather big application, the MariaDB Connect storage engine, in which I translate SQL queries in call to MongoDB through the C driver. For instance, reading the inventory collection, I execute in that order:
(The #if, #else, #endif are whether I use and old MongoDB version or the latest one) All is normal until I execute one of mongoc_collection_count function that takes more to 4/5 seconds to execute with the latest C Driver version when it takes almost no time with the old version. If I don't execute it, this 4 second delay occurs later but does occur anyway. | ||||||||||||||||||||||||||||||||
| Comment by Jeremy Mikola [ 26/Apr/21 ] | ||||||||||||||||||||||||||||||||
|
Can you clarify exactly what you mean by "all use of this driver"?
Some additional questions:
Capturing trace logs for a small application that reproduces the delay would also be helpful, as that will provide timestamps for various internal functions and should narrow down where this 4-second delay occurs. Trace logs may include sensitive information (e.g. byte dumps of socket IO), so you should be sure to redact anything sensitive within before sharing. |