[SERVER-19304] Reduce Document Validation Overhead for mmapv1 Storage Engine Created: 07/Jul/15  Updated: 06/Dec/22  Resolved: 01/Feb/16

Status: Closed
Project: Core Server
Component/s: Storage
Affects Version/s: 3.1.5
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Chung-yen Chang Assignee: Backlog - Storage Execution Team
Resolution: Won't Fix Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
Assigned Teams:
Storage Execution
Operating System: ALL
Steps To Reproduce:

The following code excerpts are from the mongo-perf JS tests that were in use when the difference in overhead between the storage engines was observed. The baseline case measures the throughput of inserting documents that have 20 integer fields. The comparison case does the same, but with a validation filter set up to ensure that all 20 fields are present and are integers before a document is inserted.

tests.push( {   name: "Insert.DocValidation.TwentyInt.Baseline", 
                tags: ['insert', 'baseline'], 
                pre: function( collection) {
                    collection.drop();
                },
                ops: [ {
                    op: "insert",
                    doc: {
                        a: {"#RAND_INT": [0, 10000]},
                        b: {"#RAND_INT": [0, 10000]},
                        c: {"#RAND_INT": [0, 10000]},
                        d: {"#RAND_INT": [0, 10000]},
                        e: {"#RAND_INT": [0, 10000]},
                        f: {"#RAND_INT": [0, 10000]},
                        g: {"#RAND_INT": [0, 10000]},
                        h: {"#RAND_INT": [0, 10000]},
                        i: {"#RAND_INT": [0, 10000]},
                        j: {"#RAND_INT": [0, 10000]},
                        k: {"#RAND_INT": [0, 10000]},
                        l: {"#RAND_INT": [0, 10000]},
                        m: {"#RAND_INT": [0, 10000]},
                        n: {"#RAND_INT": [0, 10000]},
                        o: {"#RAND_INT": [0, 10000]},
                        p: {"#RAND_INT": [0, 10000]},
                        q: {"#RAND_INT": [0, 10000]},
                        r: {"#RAND_INT": [0, 10000]},
                        s: {"#RAND_INT": [0, 10000]},
                        t: {"#RAND_INT": [0, 10000]}
                    } }
]});
 
 
tests.push( {   name: "Insert.DocValidation.TwentyInt", 
                tags: ['insert', 'baseline'], 
                pre: function( collection) {
                    collection.drop();
                    collection.runCommand("create", {"validator": {
                        $and: [
                            {a: {$exists: true}},
                            {a: {$type: 16}},
                            {b: {$exists: true}},
                            {b: {$type: 16}},
                            {c: {$exists: true}},
                            {c: {$type: 16}},
                            {d: {$exists: true}},
                            {d: {$type: 16}},
                            {e: {$exists: true}},
                            {e: {$type: 16}},
                            {f: {$exists: true}},
                            {f: {$type: 16}},
                            {g: {$exists: true}},
                            {g: {$type: 16}},
                            {h: {$exists: true}},
                            {h: {$type: 16}},
                            {a: {$exists: true}},
                            {a: {$type: 16}},
                            {i: {$exists: true}},
                            {i: {$type: 16}},
                            {j: {$exists: true}},
                            {j: {$type: 16}},
                            {k: {$exists: true}},
                            {k: {$type: 16}},
                            {l: {$exists: true}},
                            {l: {$type: 16}},
                            {m: {$exists: true}},
                            {m: {$type: 16}},
                            {n: {$exists: true}},
                            {n: {$type: 16}},
                            {o: {$exists: true}},
                            {o: {$type: 16}},
                            {p: {$exists: true}},
                            {p: {$type: 16}},
                            {q: {$exists: true}},
                            {q: {$type: 16}},
                            {r: {$exists: true}},
                            {r: {$type: 16}},
                            {s: {$exists: true}},
                            {s: {$type: 16}},
                            {t: {$exists: true}},
                            {t: {$type: 16}},
                        ] }});
                },
                ops: [ {
                    op: "insert",
                    doc: {
                        a: {"#RAND_INT": [0, 10000]},
                        b: {"#RAND_INT": [0, 10000]},
                        c: {"#RAND_INT": [0, 10000]},
                        d: {"#RAND_INT": [0, 10000]},
                        e: {"#RAND_INT": [0, 10000]},
                        f: {"#RAND_INT": [0, 10000]},
                        g: {"#RAND_INT": [0, 10000]},
                        h: {"#RAND_INT": [0, 10000]},
                        i: {"#RAND_INT": [0, 10000]},
                        j: {"#RAND_INT": [0, 10000]},
                        k: {"#RAND_INT": [0, 10000]},
                        l: {"#RAND_INT": [0, 10000]},
                        m: {"#RAND_INT": [0, 10000]},
                        n: {"#RAND_INT": [0, 10000]},
                        o: {"#RAND_INT": [0, 10000]},
                        p: {"#RAND_INT": [0, 10000]},
                        q: {"#RAND_INT": [0, 10000]},
                        r: {"#RAND_INT": [0, 10000]},
                        s: {"#RAND_INT": [0, 10000]},
                        t: {"#RAND_INT": [0, 10000]}
                    } }
]});
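
For reference, the validator above could also be generated rather than written out by hand. The following is only a sketch: buildIntValidator is a hypothetical helper name, not part of mongo-perf, and it assumes each field should get exactly one $exists clause and one $type clause ($type 16 is the BSON 32-bit integer type).

```javascript
// Hypothetical helper (not part of mongo-perf): builds a validator that
// requires every named field to exist and to be a 32-bit integer.
function buildIntValidator(fieldNames) {
    var clauses = [];
    fieldNames.forEach(function (name) {
        var existsClause = {};
        existsClause[name] = {$exists: true};
        var typeClause = {};
        typeClause[name] = {$type: 16};  // 16 = BSON 32-bit integer
        clauses.push(existsClause, typeClause);
    });
    return {$and: clauses};
}

// Field names a through t, matching the 20 fields in the test documents.
var fieldNames = "abcdefghijklmnopqrst".split("");
var validator = buildIntValidator(fieldNames);
```

The resulting object could be passed as the "validator" option in the pre function. Note that it contains each field once, whereas the hand-written list above repeats the clauses for field a, so the two are not clause-for-clause identical.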

Participants:

 Description   

While measuring the overhead of document validation, we noticed that the overhead is higher with the mmapv1 storage engine than with wiredTiger. When inserting documents that have 20 integer fields, adding a validator on all 20 fields reduced throughput by ~10% on wiredTiger but by ~25% on mmapv1. We should look at ways to reduce the overhead for mmapv1. The higher overhead for mmapv1 was observed in both 3.1.4 and 3.1.5.
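
The percentages above are relative throughput drops between the baseline test and the validated test. A minimal sketch of that arithmetic follows; throughputDropPercent is a hypothetical helper, and the ops/sec figures are placeholders chosen only to illustrate the calculation, not measured results.

```javascript
// Hypothetical helper: relative throughput drop, in percent, of the
// validated run compared with the baseline run.
function throughputDropPercent(baselineOpsPerSec, validatedOpsPerSec) {
    if (baselineOpsPerSec <= 0) {
        throw new Error("baseline throughput must be positive");
    }
    return (baselineOpsPerSec - validatedOpsPerSec) * 100 / baselineOpsPerSec;
}

// Placeholder numbers for illustration only (not measurements).
var exampleSmallDrop = throughputDropPercent(1000, 900);
var exampleLargeDrop = throughputDropPercent(1000, 750);
```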

