[CDRIVER-2376] Performance loss with small reads and writes Created: 15/Nov/17  Updated: 28/Oct/23  Resolved: 20/Dec/17

Status: Closed
Project: C Driver
Component/s: libmongoc
Affects Version/s: 1.9.0
Fix Version/s: 1.9.0

Type: Bug Priority: Major - P3
Reporter: A. Jesse Jiryu Davis Assignee: A. Jesse Jiryu Davis
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
related to CDRIVER-2343 /BulkOperation/numerous tests are tim... Closed

 Description   

Somewhere between commits a3c8a76 and 470fb4175c we lost a lot of performance in TestRunCommand, TestSmallDocInsertOne, and TestFindOneByID. The tests that read and write larger messages have such noisy results it's possible we missed a regression there too.

https://evergreen.mongodb.com/task/mongo_c_driver_perf_c_driver_benchmark_mongo32_BenchMarkTests_76791d6e7dc01cd1b55d838eaf16fc4f2291d746_17_11_14_19_43_24

There are many changes between the benchmark runs, so it's hard to say whether the cause is the compression changes, the changes to the OP_QUERY path made while implementing OP_MSG, or a mistake in CDRIVER-2172.

The C Driver's performance benchmark suite is in a separate repo:

https://github.com/mongodb/mongo-c-driver-performance

Its Evergreen configuration is in the main repo, however:

https://github.com/mongodb/mongo-c-driver/blob/master/.evergreen/benchmark.yml

Instructions for running the benchmark suite are in that mongo-c-driver-performance README.

Some performance tests use JSON files. Download them with the download-test-data.py program in the mongo-c-driver-performance repo.

Make sure you've built and installed a release version of libbson and libmongoc for the correct version that you want to test. Build libbson something like:

git checkout SOME_GIT_HASH
mkdir cmake-build-release
cd cmake-build-release
cmake -DCMAKE_BUILD_TYPE=RELEASE .. && make && sudo make install

Same for libmongoc. Not all versions of libbson and libmongoc are compatible with each other (because of the history of this year's development).

Now in the mongo-c-driver-performance repo, build the mongo-c-performance program. Start a MongoDB 3.2 standalone on localhost. Run the tests:

./mongo-c-performance performance-testdata/ TestRunCommand

You can specify multiple test names on the command line.



 Comments   
Comment by Githook User [ 20/Dec/17 ]

Author: A. Jesse Jiryu Davis <jesse@mongodb.com> (ajdavis)

Message: CDRIVER-2376 fix performance loss

Starting in commit #9f9934dcd, the driver had incorrectly called "ping"
before nearly every command when in single-threaded mode. This was due
to the mongoc_topology_scanner_node_t.last_update field not being
updated, so the driver thought each server's socket had been idle and
needed to be checked with "ping", even when the socket had been used
recently.

In this commit, update last_used after every command so we don't call
"ping" unnecessarily.
Branch: r1.9
https://github.com/mongodb/mongo-c-driver/commit/80628aa2cbdc7c324cca456ff442bd2f54357666

Comment by Githook User [ 20/Dec/17 ]

Author: A. Jesse Jiryu Davis <jesse@mongodb.com> (ajdavis)

Message: CDRIVER-2376 fix performance loss

Starting in commit #9f9934dcd, the driver had incorrectly called "ping"
before nearly every command when in single-threaded mode. This was due
to the mongoc_topology_scanner_node_t.last_update field not being
updated, so the driver thought each server's socket had been idle and
needed to be checked with "ping", even when the socket had been used
recently.

In this commit, update last_used after every command so we don't call
"ping" unnecessarily.
Branch: master
https://github.com/mongodb/mongo-c-driver/commit/303282b729d22dc0fc31d2080fc148900e392d81

Comment by A. Jesse Jiryu Davis [ 20/Dec/17 ]

The bad commit appears to be mine, 9f9934dcd, "Check idle socket with ping", which I had suspected was the culprit. If this is truly the commit that hurt performance, it's probably because the driver is calling "ping" very often on the server, instead of calling "ping" only after 1 second of idleness.

In that commit I changed mongoc_cluster_check_interval from calling "ismaster" after a second of idleness to "ping", which is according to spec. But there's an unintended consequence: since I no longer process an ismaster response with mongoc_topology_scanner_ismaster_handler, I no longer update mongoc_topology_scanner_node_t.last_used. Therefore, every call to mongoc_cluster_check_interval believes the socket's been idle and it calls "ping" again.

The solution is to find an appropriate place to update last_used.

(I notice that mongoc_cluster_check_interval had a bug even before that commit. Since last_used was only updated by mongoc_topology_scanner_ismaster_handler, mongoc_cluster_check_interval checked a socket with "ismaster" a second after the previous check, no matter what, instead of after a second of idleness. But that wasn't a noticeable performance hit.)
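
To make the failure mode concrete, here is a minimal, self-contained sketch of the idle check. It is not the driver's actual code: node_t, check_interval, and simulate_commands are hypothetical names, and only last_used mirrors a field named in this ticket. The sketch assumes the check fires when at least one second has passed since last_used, and that the fix amounts to stamping last_used after every command.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* One second of idleness, in microseconds. */
#define CHECK_INTERVAL_USEC ((int64_t) 1000 * 1000)

/* Hypothetical stand-in for per-server scanner state. */
typedef struct {
   int64_t last_used; /* when the socket was last known to be active */
   int pings;         /* how many "ping" checks have been sent */
} node_t;

/* Idle check: send "ping" only if the socket looks idle for >= 1 second. */
static void
check_interval (node_t *node, int64_t now)
{
   if (now - node->last_used >= CHECK_INTERVAL_USEC) {
      node->pings++;
   }
}

/* Simulate n_commands commands spaced gap_usec apart and return the number
 * of pings sent. When update_last_used is false, last_used is never
 * refreshed (the regression); when true, it is stamped after every command
 * (the assumed shape of the fix). */
int
simulate_commands (int n_commands, int64_t gap_usec, bool update_last_used)
{
   node_t node = {0, 0};
   int64_t now = 2 * CHECK_INTERVAL_USEC; /* the socket starts out idle */
   int i;

   assert (n_commands >= 0);

   for (i = 0; i < n_commands; i++) {
      check_interval (&node, now); /* runs before each command */
      if (update_last_used) {
         node.last_used = now; /* the command itself is socket activity */
      }
      now += gap_usec;
   }

   return node.pings;
}
```

With commands one millisecond apart, simulate_commands (100, 1000, false) counts a ping before every command, while simulate_commands (100, 1000, true) counts only the initial one. A socket that really is idle between commands, for example with a two-second gap, is still checked each time.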
