[SERVER-83671] Implement lookup tries for IDL parse in C++ Created: 28/Nov/23  Updated: 05/Feb/24

Status: In Progress
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Major - P3
Reporter: Billy Donahue Assignee: Billy Donahue
Resolution: Unresolved Votes: 0
Labels: perf-tiger
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: Text File trie.txt    
Issue Links:
Depends
depends on SERVER-85497 IDL: make gen_trie generate a complet... Closed
Related
is related to SERVER-78439 Investigate IDL generated parser perf... Closed
is related to SERVER-81876 Improve IDL code generation for comma... Closed
Assigned Teams:
Service Arch
Sprint: Service Arch 2024-01-22, Service Arch 2024-02-05, Service Arch 2024-02-19
Participants:

 Description   

Currently IDL uses tries, rendered into C++ as custom find functions generated for each hardcoded trie instance table, encoding the table's contents into the code as literals.

We should be able to do something much more constexpr(?) C++ container-like.

maybe a third_party solution.

Sample API for a minimal wrapper to be called from Python code:

class SomeIdlGeneratedStruct {
    ...
    enum class FieldId {
        aField,
        someOtherField,
        yetAnotherField,
    };
 
    static constexpr inline FieldNameMap<FieldId> _fieldNames{
        {"aField", FieldIds::aField},
        {"someOtherField", FieldIds::someOtherField},
        {"yetAnotherField", FieldIds::yetAnotherField},
    };
    ...
    /// pseudocode for the parser...
    Status parseProtected(BSONObj input) {
        for (auto&& elem : input) {
            auto iter = _fieldNames.find(elem.getFieldNameStringData());
            if (iter == _fieldNames.end()) continue;
            auto&& [name, id] = *iter;
            switch (id) {
                case FieldId::aField: ...
                case FieldId::someOtherField: ...
                case FieldId::yetAnotherField: ...
            }
        }
    }
};

From that point on, _fieldNames is accessed by Python to accept StringData name and return the FieldId of the field having that StringData as a name.

We can let the definition of exactly what FieldNameMap<T> is be an implementation detail in the idl_parser.h header. It's probably a thin wrapper around a FieldNameMap<size_t>, which is a thin wrapper around some kind of (TBD) PerfectlyHashedMap<StringData, size_t> or Trie<size_t>. It's pretty much a minimal associative unordered container's API, specifically made for this kind of search pattern (and const initialization) as an internal implementation choice.

We already have FieldIds defined inside the parse functions (see attached trie.txt file) but they're not in an enum yet. We could just do that change, assigning each struct field an enumeration in the closed enum set.



 Comments   
Comment by Billy Donahue [ 04/Dec/23 ]

(Recording for future reference)

I just wanted some class to use in the python code, but hadn't yet written the python side.

But now the Python side has been done and the results are interesting.
https://github.com/10gen/mongo/pull/17367

On ARM64 virtual workstation
BM_FIND_ONE_BSON/10:
Tot CPU
449 444      orig inline hardcoding
503 503      same hardcoded generator, but using function calls
527 527      entry-level trie implementation
567 567      StringMap<int>
619 619      std::vector<StringData> w/std::find (for reference)

Restructuring the code so that the hardoded generator is a function call and changing little else, already costs 503 - 444 = 59ns.

A rather simple dynamic implementation is not too much worse than hardcoding, adding only another 527 - 503 = 24ns.

Future work might be:

  • produce a constexpr trie with templates as generators. This would allow Python to generate the find functions with simple declarations, confining complexity to C++.
  • use a better trie than the entry-level one I wrote as an expositional reference
    (such as the CRC hash or double-array trie mentioned in SERVER-81876).
  • use string length as the first fanout discriminator in the lookup.
Comment by Amirsaman Memaripour [ 28/Nov/23 ]

Linking the hash-based PoC here, per triage discussion: SERVER-81876

Comment by Jason Chan [ 28/Nov/23 ]

billy.donahue@mongodb.com and george.wangensteen@mongodb.com also mentioned that continuing to rely on Python generation will impact compilation times.

Comment by Billy Donahue [ 28/Nov/23 ]

The generation in Python tends to make the generated code hard to work with. I see some possible inefficiencies with the generated parser but it's hard to experiment as-is to improve that situation.

Generated at Thu Feb 08 06:52:51 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.