Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major - P3
    • Resolution: Fixed
    • Affects Version/s: 0.42
    • Fix Version/s: 0.701.4
    • Component/s: None
    • Labels:
      None
    • Environment:
      debian on a xeon E5405 (8 core) with 16G of memory and SSD disk
    • # Replies:
      4
    • Last comment by Customer:
      true

      Description

      The Perl driver is very, very slow compared to the PHP one...

      Tests with perl 5.10.1 and perl 5.12.3 show that PHP is between 2.8 and more than 3.2 times faster for basic insertions (with or without threads enabled).

      In the PHP case, mongod reaches 100% CPU and PHP stays around 57%. 10M insertions complete in ~100 seconds.

      In the Perl case, mongod stays around 76% while Perl is at 100% CPU! 10M insertions complete in ~280 seconds (threaded perl) or ~320 seconds (original, debian threaded perl).

      I'm attaching the 2 sample scripts and a small patch that avoids calling sprintf() 12 times for each insertion. In the test case, 10% of execution time was spent in sprintf(). With the patch we gain 10% of total time: not much, but better than nothing...
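The attached patch is not reproduced on this page. As an illustration only, here is a minimal sketch of how per-byte sprintf() calls can be avoided, assuming the 12 calls format a 12-byte id as hex, one byte per call. That assumption, and the name oid_to_hex, are hypothetical and not taken from the driver.

```c
/* Hypothetical helper, not the driver's actual code: writes a 12-byte id
 * as 24 lowercase hex characters using a lookup table, so no sprintf()
 * call is needed per byte. */
static void oid_to_hex(const unsigned char id[12], char out[25])
{
    static const char digits[] = "0123456789abcdef";
    int i;

    for (i = 0; i < 12; i++) {
        out[2 * i]     = digits[id[i] >> 4];   /* high nibble */
        out[2 * i + 1] = digits[id[i] & 0x0f]; /* low nibble */
    }
    out[24] = '\0';
}
```

A caller passes a 25-byte buffer and gets a NUL-terminated string back; the table lookup replaces format-string parsing with two array reads per byte.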

      1. insert.php
        0.4 kB
        Maxime Soulé
      2. insert.pl
        0.6 kB
        Maxime Soulé
      3. without-sprintf.patch
        0.7 kB
        Maxime Soulé

        Activity

        Kristina Chodorow (Inactive)
        added a comment -

        I can definitely get rid of the sprintf! It seems like Perl should be at least as fast as the PHP driver. I'll run some tests and see if there's anything else that can be trimmed.

        Maxime Soulé
        added a comment -

        I did more tests with valgrind/callgrind and it appears that Perl_hv_common is called more than 5M times for 100K insertions! It is the internal perl function for hash manipulation.

        Investigating a little, I saw that get_sv is overused (1M calls). get_sv uses Perl_hv_common to access the symbol table.

        For example, in perl_mongo_make_id, using get_sv("$", 0) is less efficient than calling getpid() directly (on linux at least).

        I think all other occurrences of get_sv should be reworked to read C global variables directly instead of perl ones. MongoDB::BSON::char in perl_mongo_serialize_key, for example, is accessed very often: 800K times in my case...

        I think that only the modification of the variables should occur in perl, not their access...
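The cost pattern described above can be sketched without the Perl headers. This is a toy model, not the Perl API: toy_get_var stands in for a name-based lookup such as get_sv, and every name, value, and counter here is hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Toy stand-in for a symbol table; NOT the Perl API. */
struct toy_var { const char *name; int value; };

static struct toy_var toy_table[] = {
    { "Toy::other",          0   },
    { "MongoDB::BSON::char", '$' },  /* toy value, for illustration */
};

static unsigned long toy_lookups = 0;  /* counts name-based lookups */

/* Plays the role of a name-based lookup: searches by name on every call. */
static struct toy_var *toy_get_var(const char *name)
{
    size_t i;

    toy_lookups++;
    for (i = 0; i < sizeof(toy_table) / sizeof(toy_table[0]); i++) {
        if (strcmp(toy_table[i].name, name) == 0)
            return &toy_table[i];
    }
    return NULL;
}

/* Slow path: one name lookup per serialized key. */
static int special_char_slow(void)
{
    return toy_get_var("MongoDB::BSON::char")->value;
}

/* Fast path: look the variable up once, then reuse the pointer. */
static struct toy_var *cached_char = NULL;

static int special_char_fast(void)
{
    if (cached_char == NULL)
        cached_char = toy_get_var("MongoDB::BSON::char");
    return cached_char->value;
}
```

Calling the slow path N times performs N lookups, while the fast path performs one. The catch is that the cached pointer is only valid for as long as the table entry it points at stays in place.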

        Note that strlen("xxx") can be replaced by (sizeof("xxx")-1), which is computed at compile time. strlen() is called 5M times too.
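A minimal sketch of the sizeof trick follows. The macro name LITERAL_LEN and the function is_id_key are hypothetical, chosen only for this example. The trick works for string literals and char arrays only; applied to a char pointer, sizeof yields the pointer size instead.

```c
#include <stddef.h>
#include <string.h>

/* Length of a string literal, evaluated at compile time.
 * Valid for literals and char arrays only, never for char pointers. */
#define LITERAL_LEN(s) (sizeof(s) - 1)

/* Hypothetical example: checks whether a key of known length equals the
 * literal "_id" without calling strlen() on the literal. */
static int is_id_key(const char *key, size_t key_len)
{
    return key_len == LITERAL_LEN("_id")
        && memcmp(key, "_id", LITERAL_LEN("_id")) == 0;
}
```

Many compilers already fold strlen() of a literal into a constant when optimizing, so the measurable gain depends on the compiler and build flags.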

        Thank you very much for your work!

        D. Ilmari Mannsåker
        added a comment - edited

        I've implemented these suggestions on https://github.com/ilmari/mongo-perl-driver/tree/speedup

        The changes take the insert.pl script from 263 seconds (38017 inserts/second) to 146 seconds (68615 inserts/second).

        Stashing pointers to the variables in C globals breaks using 'local' on them, though, since that temporarily creates a new SV in the symbol table. A compromise could be to refresh them on entry to API methods that use them, but that adds maintenance overhead.
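The stale-pointer problem and the refresh-on-entry compromise can be sketched with a toy model. This is not the Perl API: slot stands in for a symbol-table entry, toy_local for the effect of 'local', and all names are hypothetical.

```c
#include <stddef.h>

/* Toy model, NOT the Perl API: a symbol-table slot points at the
 * current value object for a variable. */
struct toy_sv { int value; };

static struct toy_sv original = { 1 };
static struct toy_sv *slot = &original;  /* what a name lookup would return */
static struct toy_sv *cached = NULL;     /* pointer stashed in a C global */

/* Models `local $var = v`: installs a fresh value object in the slot
 * and returns the previous one so it can be restored later. */
static struct toy_sv *toy_local(struct toy_sv *fresh, int v)
{
    struct toy_sv *saved = slot;

    fresh->value = v;
    slot = fresh;
    return saved;
}

/* Models leaving the scope of the `local`. */
static void toy_restore(struct toy_sv *saved)
{
    slot = saved;
}

/* Reads through the stashed pointer, which may be stale. */
static int read_cached(void)
{
    return cached->value;
}

/* The compromise: refresh the stashed pointer on entry to an API method,
 * paying one lookup per call instead of one per key. */
static int api_method(void)
{
    cached = slot;
    return read_cached();
}
```

If cached is set once and toy_local then swaps the slot, read_cached() still returns the old value, while api_method() picks up the localized one. Every entry point that reads such a variable has to perform the refresh, which is the maintenance overhead mentioned above.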

        D. Ilmari Mannsåker
        added a comment -

        I've fixed the 'local' issue, which reduces the performance somewhat, but it's still significantly faster than originally. See https://github.com/mongodb/mongo-perl-driver/pull/66 for details.


          People

          • Votes:
            8
            Watchers:
            6

            Dates

            • Created:
              Updated:
              Resolved:
              Days since reply:
              49 weeks, 1 day ago
              Date of 1st Reply: