  1. Core Server
  2. SERVER-19304

Reduce Document Validation Overhead for mmapv1 Storage Engine

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Won't Fix
    • Affects Version/s: 3.1.5
    • Fix Version/s: None
    • Component/s: Storage
    • Labels:
      None
    • Operating System:
      ALL
    • Steps To Reproduce:
      The following are code excerpts from the mongo-perf JS tests that were in use when the difference in overhead between the storage engines was observed. The baseline case measures the throughput of inserting documents that have 20 integer fields. The comparison case does the same, but with a validation filter set up to ensure that all 20 fields are present and are integers before a document is inserted.

      tests.push( {   name: "Insert.DocValidation.TwentyInt.Baseline", 
                      tags: ['insert', 'baseline'], 
                      pre: function( collection) {
                          collection.drop();
                      },
                      ops: [ {
                          op: "insert",
                          doc: {
                              a: {"#RAND_INT": [0, 10000]},
                              b: {"#RAND_INT": [0, 10000]},
                              c: {"#RAND_INT": [0, 10000]},
                              d: {"#RAND_INT": [0, 10000]},
                              e: {"#RAND_INT": [0, 10000]},
                              f: {"#RAND_INT": [0, 10000]},
                              g: {"#RAND_INT": [0, 10000]},
                              h: {"#RAND_INT": [0, 10000]},
                              i: {"#RAND_INT": [0, 10000]},
                              j: {"#RAND_INT": [0, 10000]},
                              k: {"#RAND_INT": [0, 10000]},
                              l: {"#RAND_INT": [0, 10000]},
                              m: {"#RAND_INT": [0, 10000]},
                              n: {"#RAND_INT": [0, 10000]},
                              o: {"#RAND_INT": [0, 10000]},
                              p: {"#RAND_INT": [0, 10000]},
                              q: {"#RAND_INT": [0, 10000]},
                              r: {"#RAND_INT": [0, 10000]},
                              s: {"#RAND_INT": [0, 10000]},
                              t: {"#RAND_INT": [0, 10000]}
                          } }
      ]});
       
       
      tests.push( {   name: "Insert.DocValidation.TwentyInt", 
                      tags: ['insert', 'baseline'], 
                      pre: function( collection) {
                          collection.drop();
                          collection.runCommand("create", {"validator": {
                              $and: [
                                  {a: {$exists: true}},
                                  {a: {$type: 16}},
                                  {b: {$exists: true}},
                                  {b: {$type: 16}},
                                  {c: {$exists: true}},
                                  {c: {$type: 16}},
                                  {d: {$exists: true}},
                                  {d: {$type: 16}},
                                  {e: {$exists: true}},
                                  {e: {$type: 16}},
                                  {f: {$exists: true}},
                                  {f: {$type: 16}},
                                  {g: {$exists: true}},
                                  {g: {$type: 16}},
                                  {h: {$exists: true}},
                                  {h: {$type: 16}},
                                  {a: {$exists: true}},
                                  {a: {$type: 16}},
                                  {i: {$exists: true}},
                                  {i: {$type: 16}},
                                  {j: {$exists: true}},
                                  {j: {$type: 16}},
                                  {k: {$exists: true}},
                                  {k: {$type: 16}},
                                  {l: {$exists: true}},
                                  {l: {$type: 16}},
                                  {m: {$exists: true}},
                                  {m: {$type: 16}},
                                  {n: {$exists: true}},
                                  {n: {$type: 16}},
                                  {o: {$exists: true}},
                                  {o: {$type: 16}},
                                  {p: {$exists: true}},
                                  {p: {$type: 16}},
                                  {q: {$exists: true}},
                                  {q: {$type: 16}},
                                  {r: {$exists: true}},
                                  {r: {$type: 16}},
                                  {s: {$exists: true}},
                                  {s: {$type: 16}},
                                  {t: {$exists: true}},
                                  {t: {$type: 16}},
                              ] }});
                      },
                      ops: [ {
                          op: "insert",
                          doc: {
                              a: {"#RAND_INT": [0, 10000]},
                              b: {"#RAND_INT": [0, 10000]},
                              c: {"#RAND_INT": [0, 10000]},
                              d: {"#RAND_INT": [0, 10000]},
                              e: {"#RAND_INT": [0, 10000]},
                              f: {"#RAND_INT": [0, 10000]},
                              g: {"#RAND_INT": [0, 10000]},
                              h: {"#RAND_INT": [0, 10000]},
                              i: {"#RAND_INT": [0, 10000]},
                              j: {"#RAND_INT": [0, 10000]},
                              k: {"#RAND_INT": [0, 10000]},
                              l: {"#RAND_INT": [0, 10000]},
                              m: {"#RAND_INT": [0, 10000]},
                              n: {"#RAND_INT": [0, 10000]},
                              o: {"#RAND_INT": [0, 10000]},
                              p: {"#RAND_INT": [0, 10000]},
                              q: {"#RAND_INT": [0, 10000]},
                              r: {"#RAND_INT": [0, 10000]},
                              s: {"#RAND_INT": [0, 10000]},
                              t: {"#RAND_INT": [0, 10000]}
                          } }
      ]});
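
The validator in the comparison case follows a regular pattern: one `$exists` clause and one `$type: 16` clause per field. As a rough sketch only, the same filter could be generated rather than written out by hand. The helper name `buildIntValidator` is hypothetical and is not part of mongo-perf. Note that the excerpt above lists the two clauses for `a` twice (42 clauses in total), whereas this sketch emits each pair once (40 clauses).

```javascript
// Hypothetical helper (not part of mongo-perf): builds an $and validator
// requiring every named field to exist and to be a 32-bit integer.
function buildIntValidator(fieldNames) {
    var clauses = [];
    fieldNames.forEach(function (name) {
        var existsClause = {};
        existsClause[name] = {$exists: true};
        var typeClause = {};
        typeClause[name] = {$type: 16};  // BSON type 16 is a 32-bit integer
        clauses.push(existsClause, typeClause);
    });
    return {$and: clauses};
}

// Field names a..t, matching the 20 fields in the test documents.
var fields = "abcdefghijklmnopqrst".split("");
var validator = buildIntValidator(fields);

// In the mongo shell, the result could be passed to the create command
// the same way the excerpt does:
//   collection.runCommand("create", {validator: validator});
```

Generating the clauses keeps the field list in one place, which makes it easier to vary the number of validated fields when comparing storage engines.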
      
      


      Description

      While measuring the overhead of document validation, we noticed that the overhead is higher with the mmapv1 storage engine than with wiredTiger. When inserting documents that have 20 integer fields, adding a validator on all 20 fields reduced throughput by ~10% with wiredTiger but by ~25% with mmapv1. We should look at ways to reduce the overhead for mmapv1. The larger mmapv1 overhead was observed in both 3.1.4 and 3.1.5.
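
For clarity, the percentages above are assumed here to mean the relative drop in insert throughput between the baseline test and the validated test. A minimal sketch of that calculation follows. The function name and the throughput figures are made up for illustration and are not measurements from this ticket.

```javascript
// Relative throughput drop, in percent, between a baseline run and a run
// with the validator enabled. Inputs are operations per second.
function throughputDropPercent(baselineOps, validatedOps) {
    return (baselineOps - validatedOps) * 100 / baselineOps;
}

// Made-up throughput values, chosen only so that the results line up with
// the approximate drops quoted in the description.
var wiredTigerDrop = throughputDropPercent(1000, 900);
var mmapv1Drop = throughputDropPercent(1000, 750);
```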
