-
Type:
Bug
-
Resolution: Done
-
Priority:
Major - P3
-
Affects Version/s: None
-
Component/s: None
-
None
-
Environment:Linux 64bit, mongo 1.6.3 latest stable, pymongo fresh from git
-
None
-
None
-
None
-
None
-
None
-
None
-
None
I found this out quite by accident, but it's possible to cause segfaults in the _cbson module because it claims to accept mappings, but it actually only accepts "dicts". Unfortunately, it'l accept anything that inherits from dicts too; for instance, it'l accept this:
class SegfaultingDict(dict):
def _iter_(self):
return iter(['fake', 'keys', 'dont', 'exist'])
This type of object will pass the PyDict_Check() in cbsonmodule.c:748(write_dict), since it _is indeed a dict, but when you create the iter, you get a bunch of keys that don't actually exist, so when you call `PyObject* value = PyDict_GetItem(dict, key);` later in the function, this returns NULL and you end up dereferencing a NULL pointer later down the line and go boom.
With my particular mapping, using PyObject_GetItem() rather than PyDict_GetItem() worked (I think because PyObject_GetItem will obey overrides of _getitem_ on the object; not sure about this), but this might have speed implications you don't want to deal with. So I guess there are probably 3 choices to avoid the segfault:
1. change PyDict_Check() to PyDict_CheckExact() (added in python 2.4), which will fail on subclasses, and then do whatever you want to the user; either fail with an exception, or perhaps in the C wrapper module you can catch a particular exception and fallback to the python version, which handles any mapping
2. change _dict_to_bson in the bson python module to use a wrapper around _cbson._dict_to_bson that coerces mappings to dicts explicitly, so the C module only ever gets true dicts that won't misbehave
3. allow a similarly sophisticated definition of 'mapping' as in python in the cbson C module; ie, use Pyobject_GetItem, and then check the return value for NULL to avoid any shenanigans where a mapping returns keys that raise KeyError (or otherwise return some kind of NULL) when __getitem_ed.
– Jason
(P.S. Hey mike)