[SERVER-53870] Improve view creation performance over time Created: 19/Jan/21  Updated: 29/Oct/23  Resolved: 20/Apr/22

Status: Closed
Project: Core Server
Component/s: Performance
Affects Version/s: 4.4.3
Fix Version/s: 6.1.0-rc0, 6.0.5

Type: Improvement Priority: Major - P3
Reporter: Dmitry Agranat Assignee: Shin Yee Tan
Resolution: Fixed Votes: 0
Labels: newgrad, perf-escapes
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: Screenshot 2021-01-19 at 15.54.01.png, diagnostic.data_Jan21_perf_gdb.tar.gz, gdb_views.tar.gz, mongodb_Jan21_perf_gdb.log.tar.gz, perf_views.tar.gz
Issue Links:
Backports
Duplicate
is duplicated by SERVER-53925 view creation performance decreases w... Closed
Problem/Incident
causes SERVER-66355 Pass dbName to validateViewDefinition... Closed
Related
related to SERVER-63696 Replace generic DurableViewCatalog::o... Backlog
related to SERVER-77572 Create workload to test create and dr... Backlog
is related to SERVER-78615 Poor view drop performance leads to r... Closed
Backwards Compatibility: Fully Compatible
Backport Requested:
v6.0, v5.0, v4.4
Sprint: Execution Team 2021-12-13, Execution Team 2021-12-27, Execution Team 2022-01-10, Execution Team 2022-01-24, Execution Team 2022-02-07, Execution Team 2022-02-21, Execution Team 2022-03-07, Execution Team 2022-03-21, Execution Team 2022-04-04, Execution Team 2022-04-18, Execution Team 2022-05-02
Participants:
Case:
Linked BF Score: 105

 Description   

Even though an individual view creation is rather fast, view creation degrades over time: in my experiment it took 100 seconds to create 5000 views. Currently, we take a collection MODE_X lock and need to reload the view catalog during each view creation.

The script I used creates views from a single thread. Running the same script against different collections in parallel does not improve the total execution time, which suggests that reloading the view catalog is the bottleneck.
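To illustrate why a per-creation catalog reload would make the total time grow faster than linearly, here is a toy cost model. It is only a sketch and not server code: the function name `totalDefinitionsRead` and the assumption that a reload re-reads every existing view definition once are mine, chosen for illustration.

```javascript
// Toy cost model (NOT server code): count how many view definitions are
// read in total while creating `n` views under two strategies.
// Assumption: a catalog reload re-reads every view definition once.
function totalDefinitionsRead(n, reloadOnEachCreate) {
    let total = 0;
    for (let existing = 0; existing < n; existing++) {
        // With a full reload, all existing definitions plus the new one are re-read.
        // Without it, only the newly created definition is processed.
        total += reloadOnEachCreate ? existing + 1 : 1;
    }
    return total;
}
```

Under this model, creating 5000 views reads 12,502,500 definitions with a reload on every creation, versus 5,000 without it. That quadratic growth is consistent with creation getting slower as more views exist, although the real cost per definition is not modeled here.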

Repro:

function main() {
    var views_to_create = 5000;
    print("Starting...");
    db = db.getSiblingDB("views_db");
    db.dropDatabase();
    print("DB Dropped");
    var coll_name = "my_collection";
    db.createCollection(coll_name);
    print("DB/Collection Created");
    for (var i = 0; i < views_to_create; i = i + 1) {
        // Each view projects '$a' into a randomly named field.
        var field = 'field_' + Math.floor(Math.random() * 10000).toString();
        var pipeline = [{'$project': {}}];
        pipeline[0]['$project'][field] = '$a';
        // Time a single create command.
        var start = new Date();
        db.runCommand({"create": "view_" + i,
                       "viewOn": coll_name,
                       "pipeline": pipeline});
        var end = new Date();
        // Report the latency of every 100th creation.
        if (i % 100 == 0) {
            print((end - start) + " ms taken for creating view " + i);
        }
    }
}
main();
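
The repro prints only one sample per 100 views, which can be noisy. If every latency were collected into an array instead, a small helper like the one below could average them per bucket to make the trend easier to read. This is a hypothetical sketch: `averagePerBucket` and its parameters are not part of the original script.

```javascript
// Hypothetical helper: average creation latency (ms) over consecutive buckets.
// `latenciesMs` holds one latency per created view, in creation order.
function averagePerBucket(latenciesMs, bucketSize) {
    const averages = [];
    for (let start = 0; start < latenciesMs.length; start += bucketSize) {
        // The last bucket may be shorter than bucketSize.
        const bucket = latenciesMs.slice(start, start + bucketSize);
        const sum = bucket.reduce((acc, ms) => acc + ms, 0);
        averages.push(sum / bucket.length);
    }
    return averages;
}
```

With a bucket size of 100, steadily increasing averages would indicate that each creation is getting more expensive as the number of existing views grows.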



 Comments   
Comment by Githook User [ 25/Jan/23 ]

Author:

{'name': 'Shin Yee Tan', 'email': 'shinyee.tan@mongodb.com', 'username': 'shinyeet'}

Message: SERVER-53870 Improve view creation performance over time by avoiding reloading views from disk

(cherry picked from commit be752f7877f795faa42432be79039faf2b968660)
Branch: v6.0
https://github.com/mongodb/mongo/commit/c58f4997c05d987a90a77c7870c0e2d742dda86a

Comment by Bruce Lucas (Inactive) [ 23/Nov/22 ]

Opening backport requests to formally ask, per the request in HELP-39822.

Comment by Githook User [ 20/Apr/22 ]

Author:

{'name': 'Shin Yee Tan', 'email': 'shinyee.tan@mongodb.com', 'username': 'shinyeet'}

Message: SERVER-53870 Improve view creation performance over time by avoiding reloading views from disk
Branch: master
https://github.com/mongodb/mongo/commit/be752f7877f795faa42432be79039faf2b968660

Generated at Thu Feb 08 05:32:04 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.