[SERVER-9078] MongoDB2.4 cluster insert much more slow than 2.2. Created: 22/Mar/13  Updated: 10/Dec/14  Resolved: 12/Apr/13

Status: Closed
Project: Core Server
Component/s: Performance
Affects Version/s: 2.4.0
Fix Version/s: None

Type: Question Priority: Major - P3
Reporter: stronglee Assignee: James Wahlin
Resolution: Cannot Reproduce Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Linux hz166-97 2.6.38-2-amd64
Debian6.0
2 MongoDB test cluster. each cluster composed of 3 shards.
One cluster deployed MongoDB2.4 and the other 2.2.3.


Participants:

 Description   

Hi,
I installed MongoDB2.4 recently and found that the insert performance is much more
pool than it was in 2.2.3.

Please help to try to find the reason.

Test schema:
Load 330W records from a mysql table and insert them into mongodb cluster, mark down
the insert time.
2.4 cluster total insert time: 630 seconds
2.2 cluster total insert time: 110 seconds

For higher insert speed, I pre split the collection into 60 chunks, each node contains
20 chunks. The shard key is (second of timestamp) << 32 + timestamp.

And I found that the io load of 2.4 is higher than 2.2:

iostax -m result for the test period(of one node):
2.4 cluster:

Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sda             112.22         0.00         5.35          0       3371

2.2 cluster:

Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sda              75.69         0.00         9.55          0       2386

2.4 collection stats:

mongos> db.gm_item.stats()
{
        "sharded" : true,
        "ns" : "tw2_log_10148.gm_item",
        "count" : 3252213,
        "numExtents" : 48,
        "size" : 1907352352,
        "storageSize" : 2203701248,
        "totalIndexSize" : 974219456,
        "indexSizes" : {
                "_id_" : 105543984,
                "_sk_1" : 160118784,
                "dst_item_uuid_ex_1" : 82315968,
                "dst_role_1" : 150765440,
                "op_nuid_1" : 122656352,
                "src_item_uuid_ex_1" : 75987744,
                "src_role_1" : 186036704,
                "t_when_1" : 90794480
        },
        "avgObjSize" : 586.4783001605368,
        "nindexes" : 8,
        "nchunks" : 60,
        "shards" : {
                "mds1" : {
                        "ns" : "tw2_log_10148.gm_item",
                        "count" : 1085111,
                        "size" : 636530928,
                        "avgObjSize" : 586.6044377026866,
                        "storageSize" : 786976768,
                        "numExtents" : 16,
                        "nindexes" : 8,
                        "lastExtentSize" : 210907136,
                        "paddingFactor" : 1,
                        "systemFlags" : 1,
                        "userFlags" : 0,
                        "totalIndexSize" : 331463216,
                        "indexSizes" : {
                                "_id_" : 35214032,
                                "_sk_1" : 53389280,
                                "t_when_1" : 30292080,
                                "op_nuid_1" : 40863648,
                                "src_role_1" : 62121248,
                                "src_item_uuid_ex_1" : 25370128,
                                "dst_role_1" : 56864080,
                                "dst_item_uuid_ex_1" : 27348720
                        },
                        "ok" : 1
                },
                "mds2" : {
                        "ns" : "tw2_log_10148.gm_item",
                        "count" : 1081248,
                        "size" : 633920176,
                        "avgObjSize" : 586.2856402971381,
                        "storageSize" : 708362240,
                        "numExtents" : 16,
                        "nindexes" : 8,
                        "lastExtentSize" : 189837312,
                        "paddingFactor" : 1,
                        "systemFlags" : 1,
                        "userFlags" : 0,
                        "totalIndexSize" : 310622592,
                        "indexSizes" : {
                                "_id_" : 35091392,
                                "_sk_1" : 53242112,
                                "t_when_1" : 30185792,
                                "op_nuid_1" : 40814592,
                                "src_role_1" : 61826912,
                                "src_item_uuid_ex_1" : 25092144,
                                "dst_role_1" : 36988224,
                                "dst_item_uuid_ex_1" : 27381424
                        },
                        "ok" : 1
                },
                "mds3" : {
                        "ns" : "tw2_log_10148.gm_item",
                        "count" : 1085854,
                        "size" : 636901248,
                        "avgObjSize" : 586.5440915629541,
                        "storageSize" : 708362240,
                        "numExtents" : 16,
                        "nindexes" : 8,
                        "lastExtentSize" : 189837312,
                        "paddingFactor" : 1,
                        "systemFlags" : 1,
                        "userFlags" : 0,
                        "totalIndexSize" : 332133648,
                        "indexSizes" : {
                                "_id_" : 35238560,
                                "_sk_1" : 53487392,
                                "t_when_1" : 30316608,
                                "op_nuid_1" : 40978112,
                                "src_role_1" : 62088544,
                                "src_item_uuid_ex_1" : 25525472,
                                "dst_role_1" : 56913136,
                                "dst_item_uuid_ex_1" : 27585824
                        },
                        "ok" : 1
                }
        },
        "ok" : 1
}

2.2 collection stats:

mongos> db.gm_item.stats()
{
        "sharded" : true,
        "ns" : "tw2_log_10148.gm_item",
        "count" : 3252213,
        "numExtents" : 48,
        "size" : 1814262528,
        "storageSize" : 1888837632,
        "totalIndexSize" : 1123300640,
        "indexSizes" : {
                "_id_" : 105543984,
                "_sk_1" : 158892384,
                "dst_item_uuid_ex_1" : 103925136,
                "dst_role_1" : 170216144,
                "op_nuid_1" : 174484016,
                "src_item_uuid_ex_1" : 74916688,
                "src_role_1" : 186363744,
                "t_when_1" : 148958544
        },
        "avgObjSize" : 557.8547678150231,
        "nindexes" : 8,
        "nchunks" : 60,
        "shards" : {
                "mds1" : {
                        "ns" : "tw2_log_10148.gm_item",
                        "count" : 1085111,
                        "size" : 605454048,
                        "avgObjSize" : 557.9650819132789,
                        "storageSize" : 629612544,
                        "numExtents" : 16,
                        "nindexes" : 8,
                        "lastExtentSize" : 168730624,
                        "paddingFactor" : 1,
                        "systemFlags" : 1,
                        "userFlags" : 0,
                        "totalIndexSize" : 374951360,
                        "indexSizes" : {
                                "_id_" : 35214032,
                                "_sk_1" : 53397456,
                                "t_when_1" : 49775488,
                                "op_nuid_1" : 58082304,
                                "src_role_1" : 61974080,
                                "src_item_uuid_ex_1" : 25116672,
                                "dst_role_1" : 56815024,
                                "dst_item_uuid_ex_1" : 34576304
                        },
                        "ok" : 1
                },
                "mds3" : {
                        "ns" : "tw2_log_10148.gm_item",
                        "count" : 1081248,
                        "size" : 602956616,
                        "avgObjSize" : 557.6487688300926,
                        "storageSize" : 629612544,
                        "numExtents" : 16,
                        "nindexes" : 8,
                        "lastExtentSize" : 168730624,
                        "paddingFactor" : 1,
                        "systemFlags" : 1,
                        "userFlags" : 0,
                        "totalIndexSize" : 373079056,
                        "indexSizes" : {
                                "_id_" : 35091392,
                                "_sk_1" : 52023888,
                                "t_when_1" : 49513856,
                                "op_nuid_1" : 58433872,
                                "src_role_1" : 62137600,
                                "src_item_uuid_ex_1" : 24797808,
                                "dst_role_1" : 56528864,
                                "dst_item_uuid_ex_1" : 34551776
                        },
                        "ok" : 1
                },
                "mds4" : {
                        "ns" : "tw2_log_10148.gm_item",
                        "count" : 1085854,
                        "size" : 605851864,
                        "avgObjSize" : 557.9496543734241,
                        "storageSize" : 629612544,
                        "numExtents" : 16,
                        "nindexes" : 8,
                        "lastExtentSize" : 168730624,
                        "paddingFactor" : 1,
                        "systemFlags" : 1,
                        "userFlags" : 0,
                        "totalIndexSize" : 375270224,
                        "indexSizes" : {
                                "_id_" : 35238560,
                                "_sk_1" : 53471040,
                                "t_when_1" : 49669200,
                                "op_nuid_1" : 57967840,
                                "src_role_1" : 62252064,
                                "src_item_uuid_ex_1" : 25002208,
                                "dst_role_1" : 56872256,
                                "dst_item_uuid_ex_1" : 34797056
                        },
                        "ok" : 1
                }
        },
        "ok" : 1
}

I also tested the hashed shard plan. (the record insert timestamp is used to be a shard key). The insert performance is more terrible. It took 1100 seconds to insert same records.



 Comments   
Comment by James Wahlin [ 12/Apr/13 ]

No worries Strlee. Closing issue.

Comment by stronglee [ 12/Apr/13 ]

Hi James,

To eliminate individual difference between testing environments, I used one machine to do this test again on 2.2.4/2.4.1.
And the result is that both version's performance are the same as you mentioned.
I am sorry to bother you on this issue, please close this issue.

I will do further tests base on a cluster as I mentioned at first, and create an other issue to report if I still get problems.

Thanks,
Strlee

Comment by James Wahlin [ 11/Apr/13 ]

I performed a test between 2.2.3 and 2.4.1 where I took your example document, removed the _id field and inserted 3,000,000 times. I performed this insertion on an empty database several times and in each case 2.4.1 performed as well if not better than 2.2.3. My test was done in AWS on an m1.large instance with 4 EBS volumes in RAID 10.

To help trouble shoot further, can you provide:

Also, would you be willing to share to code you are using to perform the insertions?

Thanks,
James

Comment by stronglee [ 11/Apr/13 ]

Hi,

To simplify the test, I used a single mongod instance to do the same insert test and the result is the same.

Test collection fields:
'_id': ObjectID
't_when': ISODate
'op_multi': true/false
'op_nuid': NumberLong
'op_code': String
'op_detail': String
'src_role': String
'src_ip': NumberLong
'src_money': Int
'src_item_uuid_ex': Binary of uuid
'src_item_id': Int
'src_item_name': String
'src_item_many': Int
'dst_role': String
'dst_ip': NumberLong
'dst_money': Int
'dst_item_uuid_ex': Binary of uuid
'dst_item_id': Int
'dst_item_name': String
'dst_item_many': Int

A typical row of this collection:
{
"_id" : ObjectId("514bf360c07a1a7ae52e0dff"),
"_sk" : 540000,
"src_item_name" : "金玉富贵",
"src_item_id" : 4367,
"op_multi" : true,
"src_item_uuid_ex" : BinData(0,"EeKStbaKf7iAQzRAtYB8DA=="),
"op_nuid" : NumberLong("5858043329898798175"),
"src_role" : "最後的凨誑",
"src_item_many" : 1,
"src_ip" : 1999893602,
"t_when" : ISODate("2013-03-22T14:00:00Z"),
"dst_role" : "#系统#",
"src_item_detail" : BinData(0,"gAJ9cQEoVQh2b2ljZV9pZHECVQCFcQNVA3JwcnEESwBVBHN0YXJxBUsDVQR1dWlkcQZVELaKf7iStRHigEM0QLWAfAxxB1UDbnRycQhLClUEZmJldHEJSxBVD2JhcmdhaW5fdmlzaWJsZXEKSwFVCWJhbl9zcGFjZXELSwBLAIZxDFUGbm9zZWxscQ1LAVULbmVlZF90YXJnZXRxDksBVQZtYXJrZXJxD1UAVQZ1dXRpbWVxEEpY80tRVQJpZHERTQ8RVQVjd3JhcHESSwF1Lg=="),
"op_code" : "开宝箱",
"op_detail" : "消耗 0 16"
}

I deployed a mongod2.2.3 and a mongod2.4 on two machines with same hardware and system.
Then load 3000,000 rows of data from mysql and insert them into echo mongodb.

And here's the test result:

ver2.4
options:

{ auth: "true", config: "/home/xx/conf/mongodb/mongodb.conf", cpu: "true", dbpath: "/home/xx/mongodb/db", keyFile: "/home/xx/conf/mongodb/keyfile", logappend: "true", nohttpinterface: "true", noobjcheck: "true", pidfilepath: "/var/run/mongodb/mongodb.pid", port: 27017, replSet: "mds2", syslog: "true" }

index: t_when
insert time: 213 seconds.

index: t_when, op_nuid, src_role, src_item_uuid_ex, dst_role, dst_item_uuid_ex
insert time: 461 seconds.

ver2.2.3
options: options:

{ config: "/home/xx/conf/mongodb/mongodb.conf", dbpath: "/home/xx/mongodb/db", logpath: "/home/xx/log/mongodb/mongodb.log", pidfilepath: "/var/run/mongodb/mongodb.pid", port: 27017, replSet: "mds1" }

index: t_when
insert time: 115 seconds.

index: t_when, op_nuid, src_role, src_item_uuid_ex, dst_role, dst_item_uuid_ex
insert time: 361 seconds.

I didn't setup mms on these two test nodes.
But I think it's very easy to reproduce the problem: just insert large data and record the insert time, they will be different.

Comment by James Wahlin [ 10/Apr/13 ]

Let me know if you need further help on this.

Thanks,
James

Comment by James Wahlin [ 02/Apr/13 ]

Can you clarify your last statement? I am not sure what you mean by "more none sequential indexes" and "insert via mongos". Did you change your cluster configuration as well between your 2.2.3 and 2.4.1 test?

Also, if you would be willing to share your test code and cluster setup I can try to reproduce your results. Alternately if you setup your test cluster in MMS with Munin-Node and re-ran your tests, it could help us in understanding where the bottleneck lies.

Thanks,
James

Comment by stronglee [ 02/Apr/13 ]

Hi James Wahlin,

I am so sorry that I didn't reply to you because I was quite busy last days.
I will give more test results ASAP.

BTW, I have tested the insert speed on flashcache machines and the performance is also poor in 2.4.1 than in 2.2.3.

I think the key point is "more none sequential indexes" and "insert via mongos".

Comment by James Wahlin [ 01/Apr/13 ]

Hi stronglee,

Are you still encountering slowness in 2.4? If so can your provide the mongostat and top output requested by Eliot above?

Thanks,
James

Comment by Eliot Horowitz (Inactive) [ 26/Mar/13 ]

Can you send mongostat and top output from 2.2 and 2.4?

Comment by stronglee [ 23/Mar/13 ]

PS again, I have set noobjcheck = true.

Comment by stronglee [ 23/Mar/13 ]

PS.

Load 330W records from a mysql table and insert them into mongodb cluster, mark down
the insert time.

330W means 3300,000 records here.

Generated at Thu Feb 08 03:19:19 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.