-
Type: New Feature
-
Resolution: Won't Fix
-
Priority: Major - P3
-
None
-
Affects Version/s: 2.2.1
-
Component/s: None
-
Labels:
-
Environment:
- Ubuntu/precise
- mongodb-10gen 2.0.6
- Python 2.7.3
- Python 3.2.3
The feature implementation is designed according to PODLAMA, a
variation of POLA (BSD).
PODLAMA stands for
Principle Of Least Disturbance And Maximum Astonishment.
I.e., as long as the user does not know what is going on,
they will be least disturbed. Once they find out, they will be
maximally astonished (hence it is not abbreviated as POLDAMA).
Sorry, I could not resist. But now dead serious as promised.
Feature Description
The `enable_c` feature allows switching encoding/decoding functions
independently between C extension and pure Python implementations at
run-time.
It eliminates the need to install the pymongo module without the C
extension.
The feature implementation is available at
https://github.com/wolfmanx/mongo-python-driver/commit/072b2cac7a108ec4c7c55c4b0a120c00cb48afac
Deficiency
Currently, the `bson` C extension can only be disabled system-wide at
install time. This requires additional administrative measures such as
`virtualenv`, which are not readily available to all users.
Neither the C extension nor the Python implementation provides a
`default`/`object_pairs_hook` like the `json` module. This means that
the application is burdened to provide `clean` encodable data at all
times. In the worst case, the entire encoding process must be
duplicated by the application. In such a case, the entire performance
gain of the C extension is lost.
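For comparison, this is roughly what the `json` hooks look like from the application side. It is only a sketch using the standard library; the helper name `encode_extra` is made up for illustration and nothing here is part of `bson`.

```python
import json
from collections import OrderedDict
from decimal import Decimal


def encode_extra(obj):
    """Called by json only for objects it cannot encode itself."""
    if isinstance(obj, Decimal):
        return str(obj)
    if isinstance(obj, set):
        return sorted(obj)
    raise TypeError("cannot encode %r" % (obj,))


doc = {"price": Decimal("9.99"), "tags": {"b", "a"}}

# The encoder does the bulk of the work; the hook handles the odd values.
encoded = json.dumps(doc, default=encode_extra, sort_keys=True)

# On the decoding side, object_pairs_hook controls the resulting mapping type.
decoded = json.loads(encoded, object_pairs_hook=OrderedDict)
```

The application only supplies the conversion for the unusual types; it does not have to pre-clean the whole document.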
It is currently not possible for an application to `monkey-patch` the
Python implementation without either
- disabling the C extension system-wide
- or manually modifying `bson/__init__.py` after installation to gain
access to the Python implementation of `decode_all`, `_bson_to_dict`
and `_dict_to_bson`
- or copying the Python implementation of `decode_all`,
`_bson_to_dict` and `_dict_to_bson` into the application and installing
wrapped versions of these functions.
This situation is less than satisfactory.
Note: Monkey-patching is considered a valid programming technique in
Python and should not be made impossible. Also, monkey-patching
inherently implies that the developer assumes all risks.
Benefits
The `enable_c` feature allows the BSON implementation to be selected
application-wide at run-time. Hence the need to install pymongo
without the C extension system-wide is eliminated.
The Python implementations of the encoding/decoding functions become
accessible for monkey-patching in a controlled manner, allowing
developers to implement encoding/decoding algorithms to their hearts'
content.
It is possible to select different implementations for encoding and
decoding, which allows performance to be tuned for varying
circumstances.
With the proper locking techniques, it is easy to reap the full
benefits of lightning-fast C extension encoding/decoding in some areas
without sacrificing the flexibility of Python in other areas.
Interface
- `bson.enable_c(True/False)`
enable/disable both encoding and decoding using the C extension
- `bson.enable_c_encoding(True/False)`
enable/disable encoding using the C extension
- `bson.enable_c_decoding(True/False)`
enable/disable decoding using the C extension
- `bson.is_c_enabled(both=True)`
Returns True, if both encoding and decoding use the C extension.
- `bson.is_c_enabled(both=False)`
Returns True, if at least one of encoding and decoding uses the C
extension.
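The semantics of the five calls above can be modelled as follows. This is a self-contained stand-in, not the code from the linked commit: the class `SwitchableCodec` and the tagging lambdas are invented for illustration, and only the method names mirror the proposed interface.

```python
class SwitchableCodec(object):
    """Stand-in for the proposed module-level switches (not the real bson)."""

    def __init__(self, c_encode, py_encode, c_decode, py_decode):
        self._impl = {
            "encode": {True: c_encode, False: py_encode},
            "decode": {True: c_decode, False: py_decode},
        }
        self._use_c = {"encode": True, "decode": True}

    def enable_c_encoding(self, on=True):
        self._use_c["encode"] = bool(on)

    def enable_c_decoding(self, on=True):
        self._use_c["decode"] = bool(on)

    def enable_c(self, on=True):
        self.enable_c_encoding(on)
        self.enable_c_decoding(on)

    def is_c_enabled(self, both=True):
        flags = list(self._use_c.values())
        return all(flags) if both else any(flags)

    def encode(self, doc):
        return self._impl["encode"][self._use_c["encode"]](doc)

    def decode(self, data):
        return self._impl["decode"][self._use_c["decode"]](data)


# Fake implementations that merely tag their result with their origin.
codec = SwitchableCodec(
    c_encode=lambda doc: ("c", doc),
    py_encode=lambda doc: ("py", doc),
    c_decode=lambda data: ("c", data),
    py_decode=lambda data: ("py", data),
)

codec.enable_c_encoding(False)          # Python encoder, C decoder
mixed_both = codec.is_c_enabled(both=True)
mixed_any = codec.is_c_enabled(both=False)
```

With encoding switched to Python and decoding left on C, `is_c_enabled(both=True)` is false while `is_c_enabled(both=False)` is still true.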
Performance
The `enable_c` feature adds an additional level of indirection to the
`decode_all` function, in order to allow `from bson import decode_all`
to work correctly. However, the decreased performance is considered
negligible compared to the benefits.
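The idea behind the indirection can be sketched like this. The module `fake_bson` and the attribute `_decode_all_impl` are hypothetical stand-ins; the actual commit may organize this differently.

```python
import types

# Hypothetical stand-in module; the real bson module is not used here.
fake_bson = types.ModuleType("fake_bson")


def _c_decode_all(data):
    return ("c", data)


def _py_decode_all(data):
    return ("py", data)


fake_bson._decode_all_impl = _c_decode_all


def decode_all(data):
    """Stable public wrapper: looks up the current implementation per call."""
    return fake_bson._decode_all_impl(data)


fake_bson.decode_all = decode_all

# Equivalent of `from fake_bson import decode_all` in application code.
imported = fake_bson.decode_all

before = imported(b"x")
fake_bson._decode_all_impl = _py_decode_all   # switch implementations
after = imported(b"x")
```

Because the public name never gets rebound, a reference obtained before the switch still follows it. The cost is one extra function call per invocation.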
Regression Test
A unit test is included, which checks and exercises all
encoding/decoding combinations.
The correct behavior has been successfully tested under Ubuntu/precise
- mongodb-10gen 2.0.6
- Python 2.7.3
- Python 3.2.3
with
find -name '*.so' | xargs -r rm; python setup.py test
find -name '*.so' | xargs -r rm; python setup.py --without-c-ext test
find -name '*.so' | xargs -r rm; python3 setup.py test
find -name '*.so' | xargs -r rm; python3 setup.py --without-c-ext test
Risks
As long as the `enable_c` feature is not used, the `bson` module behaves
exactly as if the feature were not present. It therefore does not
present any risk to applications that are not aware of the feature.
Since the `enable_c` feature modifies module-level function objects at
run-time, any such objects that have been previously imported by other
modules are not affected in those modules.
They will not cease working, but they will continue to use the
functions as initially imported.
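This is ordinary Python name-binding behavior and can be reproduced without `bson` at all. The module name `fake_bson_demo` below is invented for the demonstration.

```python
import sys
import types

# Register a throwaway module so that a real import statement can be used.
fake = types.ModuleType("fake_bson_demo")
fake._dict_to_bson = lambda doc: "c"
sys.modules["fake_bson_demo"] = fake

# What another module would have done before the switch.
from fake_bson_demo import _dict_to_bson

# The switch rebinds the module attribute to a different function object.
fake._dict_to_bson = lambda doc: "py"

via_module = fake._dict_to_bson({})   # sees the new function
via_import = _dict_to_bson({})        # still bound to the old function

del sys.modules["fake_bson_demo"]
```

The name imported earlier keeps working, but it keeps calling the function that was current at import time.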
The implementation assumes that only objects whose names do not begin
with an underscore should be imported with the `from bson import
<symbol>` idiom. Specifically, something like
from bson import _dict_to_bson
is considered bad coding practice and any bug reports involving such
imports in connection with the `enable_c` feature should be rejected as
invalid. This also reflects the consensus of the Python community.
The pymongo/bson driver does not contain any such imports.
Note: It is perfectly safe to access the affected functions through
module access, e.g.: `bson._dict_to_bson`.
Last Resort
In order to allow such code to still work, a configuration feature has
been implemented that allows the default implementation to be chosen
before the module is actually loaded. In the following example the
imported function `_dict_to_bson` will reference the Python
implementation.
import bson.config
bson.config.use_c_encoding_as_default = False
bson.config.use_c_decoding_as_default = False
import bson
reload(bson)
from bson import _dict_to_bson
The reload is necessary to make sure that the changed configuration
settings have the desired effect.
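Note that `reload` is a builtin only in Python 2. For code that has to run on both interpreters listed under Environment, a compatibility shim along these lines could be used. It is a sketch only and is demonstrated on the standard-library module `colorsys` rather than on `bson`.

```python
try:
    reload                      # Python 2: builtin
except NameError:
    try:
        from importlib import reload    # Python 3.4 and later
    except ImportError:
        from imp import reload          # Python 3.2 / 3.3

import colorsys

# reload() re-executes the module code and returns the same module object.
reloaded = reload(colorsys)
```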
As a very last resort, the config.py file itself can be modified.
It is recommended either not to advertise these configuration settings
at all, or to do so only with a very strong recommendation against
their use.