[JAVA-2110] Performance regression between 2.13 and 3.2 drivers Created: 08/Feb/16  Updated: 20/Dec/19  Resolved: 20/Dec/19

Status: Closed
Project: Java Driver
Component/s: Performance, Query Operations
Affects Version/s: 3.2.0
Fix Version/s: None

Type: Improvement Priority: Major - P3
Reporter: Abbass Marouni Assignee: Unassigned
Resolution: Won't Do Votes: 3
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: PNG File Screen Shot 2018-11-15 at 10.52.26 AM.png     PNG File mongo_10M_bench.jpeg.png    
Issue Links:
Depends
depends on JAVA-2112 Avoid double decoding of command resp... Closed
Duplicate
is duplicated by JAVA-3077 Java driver upgrade causing slowness ... Closed
Related
is related to JAVA-3089 Support document sequences in OP_MSG ... Closed

 Description   

We did some benchmarks before upgrading from the 2.13.2 to 3.2.1 and we noticed an important performance gap between the 2.

For writing 1 Million documents we noticed a gap of around 5% in favor of the 2.13.2 driver (Using the old com.mongodb.BasicDBObject API or the new CRUD API that uses org.bson.Document).

For reading 10 Million documents we noticed a even bigger gap of around 143% using the old API and around 50% for the new API, always in favor of the 2.13.2 driver.

What do you think of the benchmarks and do you have your own benchmarks ? Or maybe best practices for the New API.



 Comments   
Comment by Jeffrey Yemin [ 20/Dec/19 ]

I've decided to close this issue. The 3.0 driver was released four years ago now and we have to consider it the new normal. If we find specific areas of improvement that we can make going forward, we can open new issues to address them.

Comment by Jeffrey Yemin [ 05/Dec/18 ]

manan@indeed.com We'd love to see any realistic benchmark numbers that you are able to share.

Thanks,
Jeff

Comment by Manan Shah [ 05/Dec/18 ]

Jeff, I just wanted to update that we saw different results in production after we tested the driver upgrade. The performance for some operations were little bad as compared from the old driver but most operations were fine. I still don't have the accurate data to share but I can get that data from one of our teams if you need.

Comment by Jeffrey Yemin [ 15/Nov/18 ]

The data you posted here is different than what you posted on JAVA-3077. There the average latency baseline is much higher than what you posted above.

Currently our benchmarks, based on YCSB show around a 30% drop in performance between the 2.13 and 3.8 drivers. We are still evaluating root causes, but we haven't seen degradation as severe as what you're reporting. If you want to proceed further with an analysis, we're going to need you to share your benchmark code.

Comment by Manan Shah [ 15/Nov/18 ]

You can find the answer to that and much more in JAVA-3077 which was closed as a dup.

Comment by Jeffrey Yemin [ 15/Nov/18 ]

manan@indeed.com what version of the driver did you upgrade from and to?

Comment by Manan Shah [ 15/Nov/18 ]

Thank you Ian but I would like to have some solution sooner than later as this issue is stalling our upgrades. Here's you can see the performance on the read operations 95th Percentile time when the java driver was upgraded on Wed Nov 14 around 3 am. The difference is so much (from around ~40 ms to > 500 ms) that it is likely going to affect our service SLOs for these operations in production.

 

Comment by Ian Whalen (Inactive) [ 08/Nov/18 ]

Just a heads up for watchers on this ticket that we've moved the investigation along to our Product Performance team so they can do a deep dive. We'll update this ticket with what comes as the result of that investigation and what any next steps might be.

Comment by Jeffrey Yemin [ 29/Feb/16 ]

OK, thanks Abbass. As we uncover further causes for performance degradation, we'll link them to this ticket.

Regards,
Jeff

Comment by Abbass Marouni [ 29/Feb/16 ]

Hi Jeff,

So I did rewrote the benchmark to use some "real-world" document, here's what I came up with :

{
    "_id" : ObjectId("56d4082ef2dc1420e653cbdd"),
    "LEVEL_1" : {
        "2tLtcrw3XK" : {
            "t_str_1SDjNQNYeV" : "QBGhW0Q4NT1qGbH0OM3DiTx55f48AX",
            "t_int_855FWxJWsf" : 17293
        },
        "hYUVhy0WzB" : {
            "t_date_AIRz8ezBNV" : ISODate("2009-10-16T14:54:54.525Z"),
            "rxxg8dpGLV" : {
                "t_double_ZvRnGjkRb0" : 304.952809709614
            }
        },
        "UP65U2EVcG" : {
            "ZnqPoGJAbF" : {
                "t_int_lxX72DPdbF" : 1
            }
        },
        "XuKeX6IxtQ" : {
            "t_list_mMD5g34Xvo" : [ 
                "gBUFvImUYx", 
                "t0He4IB6t8", 
                "iFLwzdJkzc"
            ]
        },
        "Tg7CCrH9Ep" : {
            "t_list_Bgv42Ybkd4" : [ 
                [ 
                    523, 
                    564564, 
                    4654325
                ]
            ]
        }
    }
}

And here's the new parsing code for the Legacy API and the new API :

Legacy API :

while (cursor_tMongoDBInput_1.hasNext()) {
            com.mongodb.DBObject o_tMongoDBInput_1 = cursor_tMongoDBInput_1.next();
 
            com.mongodb.DBObject emb_level_1 = (com.mongodb.DBObject) o_tMongoDBInput_1.get("LEVEL_1");
            com.mongodb.DBObject emb_2tLtcrw3XK = (com.mongodb.DBObject) emb_level_1.get("2tLtcrw3XK");
            com.mongodb.DBObject emb_hYUVhy0WzB = (com.mongodb.DBObject) emb_level_1.get("hYUVhy0WzB");
            com.mongodb.DBObject emb_rxxg8dpGLV = (com.mongodb.DBObject) emb_hYUVhy0WzB.get("rxxg8dpGLV");
            com.mongodb.DBObject emb_UP65U2EVcG = (com.mongodb.DBObject) emb_level_1.get("UP65U2EVcG");
            com.mongodb.DBObject emb_ZnqPoGJAbF = (com.mongodb.DBObject) emb_UP65U2EVcG.get("ZnqPoGJAbF");
            com.mongodb.DBObject emb_XuKeX6IxtQ = (com.mongodb.DBObject) emb_level_1.get("XuKeX6IxtQ");
            com.mongodb.DBObject emb_Tg7CCrH9Ep = (com.mongodb.DBObject) emb_level_1.get("Tg7CCrH9Ep");
 
            row1.t_str_1SDjNQNYeV = (String) emb_2tLtcrw3XK.get("t_str_1SDjNQNYeV");
            row1.t_int_855FWxJWsf = (Integer) emb_2tLtcrw3XK.get("t_int_855FWxJWsf");
            row1.t_date_AIRz8ezBNV = (Date) emb_hYUVhy0WzB.get("t_date_AIRz8ezBNV");
            row1.t_double_ZvRnGjkRb0 = (Double) emb_rxxg8dpGLV.get("t_double_ZvRnGjkRb0");
            row1.t_int_lxX72DPdbF = (Integer) emb_ZnqPoGJAbF.get("t_int_lxX72DPdbF");
            row1.t_list_Bgv42Ybkd4 = (List<Integer>) emb_XuKeX6IxtQ.get("t_list_mMD5g34Xvo");
            row1.t_list_mMD5g34Xvo = (List<String>) emb_Tg7CCrH9Ep.get("t_list_Bgv42Ybkd4");
 
            sum += row1.t_int_lxX72DPdbF;
        }

New API :

while (cursor_tMongoDBInput_NEWAPI_1.hasNext()) {
            org.bson.Document o_tMongoDBInput_NEWAPI_1 = cursor_tMongoDBInput_NEWAPI_1.next();
 
            org.bson.Document emb_level_1 = (org.bson.Document) o_tMongoDBInput_NEWAPI_1.get("LEVEL_1");
            org.bson.Document emb_2tLtcrw3XK = (org.bson.Document) emb_level_1.get("2tLtcrw3XK");
            org.bson.Document emb_hYUVhy0WzB = (org.bson.Document) emb_level_1.get("hYUVhy0WzB");
            org.bson.Document emb_rxxg8dpGLV = (org.bson.Document) emb_hYUVhy0WzB.get("rxxg8dpGLV");
            org.bson.Document emb_UP65U2EVcG = (org.bson.Document) emb_level_1.get("UP65U2EVcG");
            org.bson.Document emb_ZnqPoGJAbF = (org.bson.Document) emb_UP65U2EVcG.get("ZnqPoGJAbF");
            org.bson.Document emb_XuKeX6IxtQ = (org.bson.Document) emb_level_1.get("XuKeX6IxtQ");
            org.bson.Document emb_Tg7CCrH9Ep = (org.bson.Document) emb_level_1.get("Tg7CCrH9Ep");
 
            row1.t_str_1SDjNQNYeV = (String) emb_2tLtcrw3XK.get("t_str_1SDjNQNYeV");
            row1.t_int_855FWxJWsf = (Integer) emb_2tLtcrw3XK.get("t_int_855FWxJWsf");
            row1.t_date_AIRz8ezBNV = (Date) emb_hYUVhy0WzB.get("t_date_AIRz8ezBNV");
            row1.t_double_ZvRnGjkRb0 = (Double) emb_rxxg8dpGLV.get("t_double_ZvRnGjkRb0");
            row1.t_int_lxX72DPdbF = (Integer) emb_ZnqPoGJAbF.get("t_int_lxX72DPdbF");
            row1.t_list_Bgv42Ybkd4 = (List<Integer>) emb_XuKeX6IxtQ.get("t_list_mMD5g34Xvo");
            row1.t_list_mMD5g34Xvo = (List<String>) emb_Tg7CCrH9Ep.get("t_list_Bgv42Ybkd4");
 
            sum += row1.t_int_lxX72DPdbF;
        }

Here's also the bean class :

public class RowBean {
 
    String t_str_1SDjNQNYeV;
    Integer t_int_855FWxJWsf;
    Date t_date_AIRz8ezBNV;
    Double t_double_ZvRnGjkRb0;
    Integer t_int_lxX72DPdbF;
    List<String> t_list_mMD5g34Xvo;
    List<Integer> t_list_Bgv42Ybkd4;
 
 
    public RowBean() {
    }
 
    @Override
    public String toString() {
        return "RowBean{" +
                "t_str_1SDjNQNYeV='" + t_str_1SDjNQNYeV + '\'' +
                ", t_int_855FWxJWsf=" + t_int_855FWxJWsf +
                ", t_date_AIRz8ezBNV=" + t_date_AIRz8ezBNV +
                ", t_double_ZvRnGjkRb0=" + t_double_ZvRnGjkRb0 +
                ", t_int_lxX72DPdbF=" + t_int_lxX72DPdbF +
                ", t_list_mMD5g34Xvo=" + t_list_mMD5g34Xvo +
                ", t_list_Bgv42Ybkd4=" + t_list_Bgv42Ybkd4 +
                '}';
    }
}

Here are the results (time in ms for reading 10 million documents) :

2.13.2 LEGACY API : 49052
2.14.1 LEGACY API : 49063

3.1.1 New API : 53341
3.2.1 New API : 95281
3.2.2 New API : 53712

We got closer results this time but the 2.x drivers still perform better than 3.x (around 10%).
What's interesting is that there's a clear regression between 3.1.1 and 3.2.1.

I think that we'll stick to the 2.14.1 driver for the time being and we'll upgrade the driver later on.

Abbass,

Comment by Jeffrey Yemin [ 27/Feb/16 ]

Hi Abbass,

The 2.x BSON decoder has an optimization for single character ASCII strings, which is not present in the 3.x driver. Are single character field names what you use in actual applications, or just for this microbenchmark? If the former, please re-run the benchmark with documents that more closely represent ones you would use in a production application. Using multi-character strings won't eliminate all of the performance degradation you're seeing, but profiling shows that it is a significant factor.

Regards,
Jeff

Comment by Jeffrey Yemin [ 27/Feb/16 ]

Hi Abbass,

To help us get to the root cause, please provide a representative document from the collection, or more than one if there is a variety.

Regards,
Jeff

Comment by Jeffrey Yemin [ 26/Feb/16 ]

Hi Abbass,

Would you mind running your benchmark against the 3.1.1 version of the driver?

Regards,
Jeff

Comment by Abbass Marouni [ 26/Feb/16 ]

Hi Jeff,

I got the new 3.2.2 driver and reran my tests(same code as above).
We're still seeing performance regression between the 2.X and 3.X drivers, here are my new results (I added the graph as an attachement) :

2.13.2 LEGACY API : 9669
2.14.1 LEGACY API : 9864
3.2.1 LEGACY API : 29239
3.2.2 LEGACY API : 18813
3.2.1 New API : 27910
3.2.2 New API : 17753

The new 3.2.2 is still 2 times slower than the 2.X drivers.

I'm wondering if this is due to connection establishment between 2.X and 3.X or if our test data (10 million documents with simple types) is not enough.

Comment by Jeffrey Yemin [ 10/Feb/16 ]

Hi Abbass,

There is no fixed release date but it shouldn't be more than a couple of weeks and likely sooner.

Comment by Abbass Marouni [ 10/Feb/16 ]

Thanks Jeff,

We were hesitant to upgrade the driver so now we'll wait for the 3.2.2, and it's good to see the issue being acknowledged quickly.
Any release dates for the 3.2.2 driver ?

Comment by Jeffrey Yemin [ 10/Feb/16 ]

Since there may be multiple issues here, I separated out the first one as JAVA-2112.

Comment by Jeffrey Yemin [ 09/Feb/16 ]

Never mind, I think I see the problem.

Comment by Jeffrey Yemin [ 09/Feb/16 ]

Would you mind executing your benchmarks against the 3.0 server and posting the results here?

Regards,
Jeff

Comment by Abbass Marouni [ 09/Feb/16 ]

Test environment :
mongodb-linux-x86_64-enterprise-ubuntu1404-3.2.1 in a Docker container on a 16GB RAM Intel(R) Core(TM) i7-4810MQ CPU @ 2.80GHz with 500GB SSD.

Benchmark code for Reading 10 million documents :

New API with 3.2.1 driver

// Setup
        com.mongodb.MongoClientOptions clientOptions_tMongoDBConnection_NEWAPI_1 =
                new com.mongodb.MongoClientOptions.Builder().build();
 
        java.util.List<com.mongodb.MongoCredential> mongoCredentialList_tMongoDBConnection_NEWAPI_1 =
                new java.util.ArrayList<com.mongodb.MongoCredential>();
 
        com.mongodb.ServerAddress serverAddress_tMongoDBConnection_NEWAPI_1 =
                new com.mongodb.ServerAddress("mongodb.weave.local", 27017);
 
        com.mongodb.MongoClient mongo_tMongoDBConnection_NEWAPI_1 =
                new com.mongodb.MongoClient(serverAddress_tMongoDBConnection_NEWAPI_1,
                        mongoCredentialList_tMongoDBConnection_NEWAPI_1, clientOptions_tMongoDBConnection_NEWAPI_1);
 
        com.mongodb.client.MongoDatabase db_tMongoDBConnection_NEWAPI_1 =
                mongo_tMongoDBConnection_NEWAPI_1.getDatabase("default");
 
 
        com.mongodb.client.MongoCollection<org.bson.Document> coll_tMongoDBInput_NEWAPI_1 =
                db_tMongoDBConnection_NEWAPI_1.getCollection("benchM001");
 
 
        org.bson.Document myQuery_tMongoDBInput_NEWAPI_1 = org.bson.Document.parse("{}");
 
        org.bson.Document fields_tMongoDBInput_NEWAPI_1 = new org.bson.Document();
 
        com.mongodb.client.FindIterable<org.bson.Document> findIterable_tMongoDBInput_NEWAPI_1 =
                coll_tMongoDBInput_NEWAPI_1.find(myQuery_tMongoDBInput_NEWAPI_1);
 
        com.mongodb.client.MongoCursor<org.bson.Document> cursor_tMongoDBInput_NEWAPI_1 =
                findIterable_tMongoDBInput_NEWAPI_1.iterator();
 
        // RowBean to test parsing of values
        RowBean row1 = new RowBean();
 
        // Read verification
        long sum = 0;
 
        // Timer
        long start = System.currentTimeMillis();
        // Cursor
        while (cursor_tMongoDBInput_NEWAPI_1.hasNext()) {
            org.bson.Document o_tMongoDBInput_NEWAPI_1 = cursor_tMongoDBInput_NEWAPI_1.next();
            row1.a = (String) o_tMongoDBInput_NEWAPI_1.get("a");
            row1.b = (Date) o_tMongoDBInput_NEWAPI_1.get("b");
            row1.c = (Integer) o_tMongoDBInput_NEWAPI_1.get("c");
            row1.d = (Double) o_tMongoDBInput_NEWAPI_1.get("d");
            row1.e = (Integer) o_tMongoDBInput_NEWAPI_1.get("e");
            row1.f = (String) o_tMongoDBInput_NEWAPI_1.get("f");
            sum += row1.e;
        }
 
        System.out.println(System.currentTimeMillis() - start);
        System.out.println(sum);

Old API with 2.13.2 driver

// Setup
        com.mongodb.MongoClientOptions clientOptions_tMongoDBInput_1 =
                new com.mongodb.MongoClientOptions.Builder().build();
 
        java.util.List<com.mongodb.MongoCredential> mongoCredentialList_tMongoDBInput_1 =
                new java.util.ArrayList<com.mongodb.MongoCredential>();
 
        com.mongodb.ServerAddress serverAddress_tMongoDBInput_1 =
                new com.mongodb.ServerAddress("mongodb.weave.local", 27017);
 
        com.mongodb.Mongo mongo_tMongoDBInput_1 =
                new com.mongodb.MongoClient(serverAddress_tMongoDBInput_1, mongoCredentialList_tMongoDBInput_1,
                        clientOptions_tMongoDBInput_1);
 
        com.mongodb.DB db_tMongoDBInput_1 = mongo_tMongoDBInput_1.getDB("default");
 
        com.mongodb.DBCollection coll_tMongoDBInput_1 = db_tMongoDBInput_1.getCollection("benchM001");
 
        com.mongodb.DBObject myQuery_tMongoDBInput_1 = (com.mongodb.DBObject) com.mongodb.util.JSON.parse("{}");
 
        com.mongodb.DBObject fields_tMongoDBInput_1 = new com.mongodb.BasicDBObject();
 
        com.mongodb.DBCursor cursor_tMongoDBInput_1 =
                coll_tMongoDBInput_1.find(myQuery_tMongoDBInput_1,fields_tMongoDBInput_1);
 
        // RowBean to test parsing of values
        RowBean row1 = new RowBean();
 
        // Read verification
        long sum = 0;
 
        // Timer
        long start = System.currentTimeMillis();
 
        while (cursor_tMongoDBInput_1.hasNext()) {
            com.mongodb.DBObject o_tMongoDBInput_1 = cursor_tMongoDBInput_1.next();
            row1.a = (String) o_tMongoDBInput_1.get("a");
            row1.b = (Date) o_tMongoDBInput_1.get("b");
            row1.c = (Integer) o_tMongoDBInput_1.get("c");
            row1.d = (Double) o_tMongoDBInput_1.get("d");
            row1.e = (Integer) o_tMongoDBInput_1.get("e");
            row1.f = (String) o_tMongoDBInput_1.get("f");
            sum += row1.e;
        }
 
        System.out.println(System.currentTimeMillis() - start);
        System.out.println(sum);

The average of 3 runs :

New API with 3.2.1 driver : 27710 ms
Old API with 2.13.2 driver : 9753 ms

Please note that we got the same results when using the following parsing code with the new API + 3.2.1 driver :

...
            row1.a = o_tMongoDBInput_NEWAPI_1.getString("a");
            row1.b = o_tMongoDBInput_NEWAPI_1.getDate("b");
            row1.c = o_tMongoDBInput_NEWAPI_1.getInteger("c");
            row1.d = o_tMongoDBInput_NEWAPI_1.getDouble("d");
            row1.e = o_tMongoDBInput_NEWAPI_1.getInteger("e");
            row1.f = o_tMongoDBInput_NEWAPI_1.getString("f");
...

Best Regards,

Comment by Jeffrey Yemin [ 08/Feb/16 ]

Can you provide the benchmark code and the relevant details about the test environment?

Regards,
Jeff

Generated at Thu Feb 08 08:56:20 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.