[SERVER-10851] Time series data updates performance Created: 23/Sep/13  Updated: 11/Apr/23  Resolved: 11/Apr/23

Status: Closed
Project: Core Server
Component/s: Performance, Write Ops
Affects Version/s: 2.2.6, 2.4.5, 2.5.2
Fix Version/s: None

Type: Improvement Priority: Major - P3
Reporter: Alexander Komyagin Assignee: Backlog - Storage Execution Team
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Server: RAID0 on 2 SSD, hi1.4xlarge, nojournal, syncDelay=15, single-node replica set
Client: m3.2xlarge


Attachments: File loadon.js     File loadon2.js     File loadup.js    
Issue Links:
Related
Assigned Teams:
Storage Execution
Participants:

 Description   

Use case: storing per-minute aggregation data for each day in one collection, with manual padding. Documents are updated with new per-minute data by setting the appropriate fields.

Plain update rate, measured with mongostat: ~1,700-2,000 updates/s (about 1,700 in the sample below)

insert  query update delete getmore command flushes mapped  vsize    res faults  locked db idx miss %     qr|qw   ar|aw  netIn netOut  conn set repl       time
    *0     *0   1694     *0       0     1|0       0   378g   378g  5.46g      0 test:95.1%          0       8|0     0|1     1m     2k    10 rs0  PRI   21:42:26
    *0     *0   1690     *0       0     1|0       0   378g   378g  5.46g      0 test:95.1%          0       8|0     0|1     1m     2k    10 rs0  PRI   21:42:27
    *0     *0   1700     *0       0     1|0       0   378g   378g  5.46g      0 test:95.1%          0       8|0     0|1     1m     2k    10 rs0  PRI   21:42:28
    *0     *0   1702     *0       0     1|0       0   378g   378g  5.46g      1 test:95.2%          0       8|0     0|1     1m     2k    10 rs0  PRI   21:42:29
    *0     *0   1685     *0       0     1|0       0   378g   378g  5.47g      1 test:95.3%          0       8|0     0|1     1m     2k    10 rs0  PRI   21:42:30
    *0     *0   1725     *0       0     1|0       0   378g   378g  5.47g      0 test:95.0%          0       8|0     0|1     1m     2k    10 rs0  PRI   21:42:31
    *0     *0   1701     *0       0     1|0       0   378g   378g  5.47g      1 test:95.1%          0       8|0     0|1     1m     2k    10 rs0  PRI   21:42:32

Sample doc:

{
        "_id" : ObjectId("5240805f14f235bd45c80120"),
        "key" : {
                "a" : 0,
                "b" : 0
        },
        "mn" : {
                "0" : {
                        "0" : 0,
                        "1" : 0,
                        "2" : 0,
                        ...
                        },
                ...
                "23" : {
                        "1380" : 0,
                        "1381" : 0,
                        "1382" : 0,
                        "1383" : 0,
                        "1384" : 0,
                        "1385" : 0,
                        "1386" : 0,
                        ...
                        "1436" : 0,
                        "1437" : 0,
                        "1438" : 0,
                        "1439" : 0
                }
        }
}

These results suggest that each update operation takes roughly 560 microseconds (1000*0.95*1000/1700), which is very slow for such powerful instances (note that neither disk nor network was a bottleneck).
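
The parenthesized estimate above can be reproduced with a short calculation. This is only a sketch: the function name lockedMicrosPerUpdate is made up for illustration, and it assumes the "locked db" percentage from mongostat is a fair proxy for the time spent in updates, using the rounded figures 95% and 1700 updates/s from the description.

```javascript
// Hypothetical helper (not part of the attached scripts): estimate the time
// spent holding the database lock per update, from mongostat's "locked db"
// percentage and the observed update rate.
function lockedMicrosPerUpdate(lockedPercent, updatesPerSec) {
  // Microseconds of lock time accumulated in one second of wall-clock time.
  const lockedMicrosPerSec = (lockedPercent / 100) * 1e6;
  // Spread that lock time evenly across the updates done in that second.
  return lockedMicrosPerSec / updatesPerSec;
}

// Rounded inputs taken from the description: ~95% locked, ~1700 updates/s.
const estimate = lockedMicrosPerUpdate(95, 1700);
console.log(estimate.toFixed(1) + " microseconds per update");
```

With these inputs the result falls between 500 and 600 microseconds per update, in line with the figure quoted in the description.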

Scripts attached:

  • loadup.js - script to load the sample data into the database
  • loadon.js - script that simulates the updates with full replacement of the "mn" field (gives 6k updates/s)
  • loadon2.js - script that simulates the updates with positional updates (the original use case, gives 1.7-2k updates/s)
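
The attached scripts are not reproduced in this ticket, so the following is only a sketch of what the two update styles might look like, based on the sample document above. All function names are hypothetical, and it assumes that the positional variant sets a single nested field by its dotted path, with minute-of-day keys grouped into 24 hour buckets.

```javascript
// Hypothetical helpers; these are NOT the contents of loadon.js / loadon2.js.

// Build a pre-padded "mn" subdocument shaped like the sample document:
// 24 hour buckets, each holding 60 minute-of-day keys (0..1439) set to 0.
function buildPaddedMn() {
  const mn = {};
  for (let hour = 0; hour < 24; hour++) {
    mn[hour] = {};
    for (let m = 0; m < 60; m++) {
      mn[hour][hour * 60 + m] = 0;
    }
  }
  return mn;
}

// Style 1 (assumed to resemble loadon.js): change one value in a copy of
// "mn" and replace the whole "mn" field in a single $set.
function fullReplaceUpdate(mn, minuteOfDay, value) {
  const hour = Math.floor(minuteOfDay / 60);
  const copy = JSON.parse(JSON.stringify(mn)); // deep copy of plain data
  copy[hour][minuteOfDay] = value;
  return { $set: { mn: copy } };
}

// Style 2 (assumed to resemble loadon2.js): set only the one nested field,
// addressed by its dotted path, e.g. "mn.23.1439".
function singleFieldUpdate(minuteOfDay, value) {
  const hour = Math.floor(minuteOfDay / 60);
  return { $set: { ["mn." + hour + "." + minuteOfDay]: value } };
}

console.log(JSON.stringify(singleFieldUpdate(1439, 7)));
```

In the mongo shell, either update document could be passed as the second argument to db.collection.update() together with a query on "key"; the collection name and query used by the real scripts are not given in this ticket.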


 Comments   
Comment by Dianna Hohensee (Inactive) [ 11/Apr/23 ]

This ticket was filed long before MDB supported Time-series collections. Closing.
