Allow enabling/disabling of BSON C-extension at run-time

    • Type: New Feature
    • Resolution: Won't Fix
    • Priority: Major - P3
    • Affects Version/s: 2.2.1
    • Component/s: None
    • Environment:
      - Ubuntu/precise
      - mongodb-10gen 2.0.6
      - Python 2.7.3
      - Python 3.2.3

      The feature implementation is designed according to PODLAMA, a
      variation of POLA (BSD).
      PODLAMA stands for
      Principle Of Least Disturbance And Maximum Astonishment.
      I.e., as long as the user does not know what is going on,
      they will be least disturbed. Once they find out, they will be
      maximally astonished (hence it is not abbreviated as POLDAMA).

      Sorry, I could not resist. But now dead serious as promised.

      Feature Description

      The `enable_c` feature allows switching encoding/decoding functions
      independently between C extension and pure Python implementations at
      run-time.

      It eliminates the need to install the pymongo module without the C
      extension.

      The feature implementation is available at
      https://github.com/wolfmanx/mongo-python-driver/commit/072b2cac7a108ec4c7c55c4b0a120c00cb48afac

      Deficiency

      Currently, the `bson` C extension can only be disabled system-wide at
      install time. Working around this requires additional administrative
      measures such as `virtualenv`, which are not readily available to all
      users.

      Neither the C extension nor the Python implementation provides a
      `default`/`object_pairs_hook` mechanism like that of the `json`
      module. This means that the application is burdened with providing
      `clean`, encodable data at all times. In the worst case, the entire
      encoding process must be duplicated by the application, and the
      entire performance gain of the C extension is lost.
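
      For comparison, this is roughly what the two hooks look like in the
      standard `json` module. It is a minimal sketch using only the standard
      library; the helper name `encode_extra` is made up for illustration and
      nothing here is part of `bson`.

```python
import json
from collections import OrderedDict
from decimal import Decimal


def encode_extra(obj):
    """Called by json only for objects it cannot encode by itself."""
    if isinstance(obj, Decimal):
        return str(obj)
    raise TypeError("not JSON serializable: %r" % (obj,))


# Encoding: the hook converts the Decimal on the fly, so the application
# does not have to pre-clean the whole document.
encoded = json.dumps({"price": Decimal("9.99")}, default=encode_extra)

# Decoding: object_pairs_hook receives the key/value pairs in document order.
decoded = json.loads('{"b": 1, "a": 2}', object_pairs_hook=OrderedDict)
```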

      It is currently not possible for an application to `monkey-patch` the
      Python implementation without either

      • disabling the C extension system-wide
      • or manually modifying `bson/__init__.py` after installation to gain
        access to the Python implementation of `decode_all`, `_bson_to_dict`
        and `_dict_to_bson`.
      • or copying the python implementation of `decode_all`,
        `_bson_to_dict`, `_dict_to_bson` into the application and installing
        wrapped versions of these functions.

      This situation is less than satisfactory.

      Note: Monkey-patching is considered a valid programming technique in
      Python and should not be made impossible. Also, monkey-patching
      inherently implies that the developer assumes all risks.
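
      As an illustration of the kind of monkey-patching meant here, the
      following sketch wraps an encoder that lives on a module object. The
      module `demo` and its strict `_dict_to_bson` are stand-ins invented for
      this example; they only mimic the shape of the real functions and do
      not produce BSON.

```python
import functools
import types
from decimal import Decimal

# Stand-in module; NOT the real bson package.
demo = types.ModuleType("demo_bson")


def _dict_to_bson(doc):
    # Stand-in encoder: rejects Decimal values, as a strict encoder would.
    for value in doc.values():
        if isinstance(value, Decimal):
            raise TypeError("cannot encode %r" % (value,))
    return repr(sorted(doc.items())).encode("ascii")


demo._dict_to_bson = _dict_to_bson


def patch_encoder(module):
    """Replace module._dict_to_bson with a wrapper that cleans its input."""
    original = module._dict_to_bson

    @functools.wraps(original)
    def wrapper(doc):
        cleaned = dict(
            (key, float(value) if isinstance(value, Decimal) else value)
            for key, value in doc.items()
        )
        return original(cleaned)

    module._dict_to_bson = wrapper


patch_encoder(demo)
encoded = demo._dict_to_bson({"price": Decimal("1.5")})
```

      The wrapper is only practical when the function being wrapped is
      reachable and replaceable from Python, which is the point of the
      request.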

      Benefits

      The `enable_c` feature allows the BSON implementation to be selected
      application-wide at run-time. Hence the need to install pymongo
      without the C extension system-wide is eliminated.

      The Python implementations of the encoding/decoding functions become
      accessible for monkey-patching in a controlled manner, allowing
      developers to implement encoding/decoding algorithms to their hearts'
      content.

      It is possible to select different implementations for encoding and
      decoding, which allows performance to be optimized separately for
      each direction under varying circumstances.

      With the proper locking techniques, it is easy to reap the full
      benefits of lightning-fast C extension encoding/decoding in some areas
      without sacrificing the flexibility of Python in other areas.
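
      One possible reading of "proper locking techniques" is a context
      manager that switches to the Python encoder for a limited block and
      restores the previous setting afterwards. This is only a sketch under
      that assumption: `_state`, `_switch_lock` and `python_encoding` are
      hypothetical names, and `enable_c_encoding` is a stand-in that flips a
      flag instead of touching `bson`.

```python
import threading
from contextlib import contextmanager

_switch_lock = threading.RLock()
_state = {"c_encoding": True}  # stand-in for the module-level selection


def enable_c_encoding(enable=True):
    # Stand-in for the proposed bson.enable_c_encoding().
    _state["c_encoding"] = enable


@contextmanager
def python_encoding():
    """Temporarily select the Python encoder, then restore the old choice."""
    with _switch_lock:
        previous = _state["c_encoding"]
        enable_c_encoding(False)
        try:
            yield
        finally:
            enable_c_encoding(previous)


with python_encoding():
    inside = _state["c_encoding"]
after = _state["c_encoding"]
```

      The lock only serializes callers that go through `python_encoding`;
      threads that encode without taking it would still observe the
      temporary switch.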

      Interface

      • `bson.enable_c(True/False)`
        enable/disable both encoding and decoding using the C extension
      • `bson.enable_c_encoding(True/False)`
        enable/disable encoding using the C extension
      • `bson.enable_c_decoding(True/False)`
        enable/disable decoding using the C extension
      • `bson.is_c_enabled(both=True)`
        Returns True, if both encoding and decoding use the C extension.
      • `bson.is_c_enabled(both=False)`
        Returns True, if at least one of encoding and decoding uses the C
        extension.
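
      The semantics of these five calls can be sketched on a stand-in module.
      This is not the code from the linked commit: `fake_bson` and the four
      `_encode_*`/`_decode_*` functions are placeholders, and the default
      value of the `enable` parameter is an assumption.

```python
import types

# Stand-in for the bson module; all names below are illustrative.
fake_bson = types.ModuleType("fake_bson")


def _encode_py(doc):
    return ("py", doc)


def _encode_c(doc):
    # Pretend this is the C-extension encoder.
    return ("c", doc)


def _decode_py(data):
    return ("py", data)


def _decode_c(data):
    # Pretend this is the C-extension decoder.
    return ("c", data)


def enable_c_encoding(enable=True):
    fake_bson._dict_to_bson = _encode_c if enable else _encode_py


def enable_c_decoding(enable=True):
    fake_bson._bson_to_dict = _decode_c if enable else _decode_py


def enable_c(enable=True):
    enable_c_encoding(enable)
    enable_c_decoding(enable)


def is_c_enabled(both=True):
    flags = (fake_bson._dict_to_bson is _encode_c,
             fake_bson._bson_to_dict is _decode_c)
    return all(flags) if both else any(flags)


# C encoding combined with Python decoding.
enable_c(True)
enable_c_decoding(False)
```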

      Performance

      The `enable_c` feature adds a level of indirection to the
      `decode_all` function so that `from bson import decode_all`
      continues to work correctly. The resulting performance cost is
      considered negligible compared to the benefits.
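
      The indirection can be pictured as a public function whose identity
      never changes and which looks up the current implementation on every
      call. The sketch below assumes that design; `_decode_all_impl` and the
      two placeholder decoders are invented names, not taken from the commit.

```python
def _decode_all_py(data):
    return ["decoded by Python"]


def _decode_all_c(data):
    # Placeholder for the C-extension decoder.
    return ["decoded by C"]


_decode_all_impl = _decode_all_c


def decode_all(data):
    """Public entry point; forwards to whichever implementation is active."""
    return _decode_all_impl(data)


def enable_c_decoding(enable=True):
    global _decode_all_impl
    _decode_all_impl = _decode_all_c if enable else _decode_all_py


# Simulates `from bson import decode_all` executed before the switch.
imported_early = decode_all
enable_c_decoding(False)
result = imported_early(b"")
```

      The extra cost is one global lookup and one function call per
      `decode_all` invocation.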

      Regression Test

      A unit test is included, which checks and exercises all
      encoding/decoding combinations.

      The correct behavior has been successfully tested under Ubuntu/precise

      • mongodb-10gen 2.0.6
      • Python 2.7.3
      • Python 3.2.3

      with

      find -name '*.so' | xargs -r rm; python setup.py test
      find -name '*.so' | xargs -r rm; python setup.py --without-c-ext test
      find -name '*.so' | xargs -r rm; python3 setup.py test
      find -name '*.so' | xargs -r rm; python3 setup.py --without-c-ext test

      Risks

      As long as the `enable_c` feature is not used, the `bson` module behaves
      exactly as if the feature was not present. It therefore does not
      present any risk to applications that are not aware of the feature.

      Since the `enable_c` feature modifies module-level function objects at
      run-time, any such objects that have been previously imported by other
      modules are not affected in those modules.

      They will not cease working, but they will continue to use the
      functions as initially imported.

      The implementation assumes that only objects whose names do not begin
      with an underscore should be imported with the `from bson import
      <symbol>` idiom. Specifically, something like

      from bson import _dict_to_bson
      

      is considered bad coding practice, and any bug reports involving such
      imports in connection with the `enable_c` feature should be rejected as
      invalid. This also reflects the consensus of the Python community.

      The pymongo/bson driver does not contain any such imports.

      Note: It is perfectly safe to access the affected functions through
      module access, e.g.: `bson._dict_to_bson`.
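
      The difference between a `from` import and module access can be
      reproduced with a stand-in module. The module `mod` and the two
      placeholder functions are made up for the demonstration; the binding
      behavior shown is standard Python.

```python
import types

# Stand-in module; NOT the real bson package.
mod = types.ModuleType("demo_bson")


def _impl_c(doc):
    return "c"


def _impl_py(doc):
    return "py"


mod._dict_to_bson = _impl_c

# Equivalent of `from demo_bson import _dict_to_bson`: the name is bound
# to the function object that exists at import time.
_dict_to_bson = mod._dict_to_bson

# Run-time switch: only the attribute on the module is rebound.
mod._dict_to_bson = _impl_py

stale = _dict_to_bson({})      # still the function imported earlier
fresh = mod._dict_to_bson({})  # attribute lookup sees the switch
```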

      Last Resort

      In order to allow such code to still work, a configuration feature has
      been implemented that allows the default implementation to be chosen
      before the module is actually loaded. In the following example, the
      imported function `_dict_to_bson` references the Python
      implementation.

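      import bson.config
      # Choose the pure Python implementations as the defaults.
      bson.config.use_c_encoding_as_default = False
      bson.config.use_c_decoding_as_default = False
      import bson
      # reload is a builtin on Python 2; on Python 3.2 it must first be
      # imported with `from imp import reload`.
      reload(bson)
      from bson import _dict_to_bson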
      

      The reload is necessary to make sure that the changed configuration
      settings have the desired effect.

      As a very last resort, the config.py file itself can be modified.

      It is recommended not to advertise these configuration settings at
      all, or to do so only with a very strong recommendation against their
      use.

            Assignee:
            Unassigned
            Reporter:
            Wolfgang Scherer
            Votes:
            1
            Watchers:
            3
