Python Driver / PYTHON-355

UnicodeEncodeError on pickle.loads

    • Type: Bug
    • Resolution: Done
    • Priority: Major - P3
    • Fix Version/s: 2.2.1
    • Affects Version/s: 2.2
    • Component/s: None
    • Environment:
      Windows 7, MongoDB 2.0.3, Python 2.7, pymongo 2.2, Django 1.3, Piston 0.2.3, mango (https://github.com/vpulim/mango) 0.1, 127.0.0.1:8000
    • Backwards Compatibility: Fully Compatible

      I'm using mango.session, which stores Django sessions in MongoDB.
      Storing sessions works fine,
      but when I try to restore a session it comes back empty.
      While debugging, I found that pickle.loads(pickled) on the session_data raises UnicodeEncodeError:

      Traceback (most recent call last):
        File "C:\Python27\lib\site-packages\pymongo-2.2-py2.7-win32.egg\bson\objectid.py", line 223, in __setstate__
          self.__id = value.encode('latin-1')
      

      UnicodeEncodeError: 'latin-1' codec can't encode characters in position 3-4: ordinal not in range(256)

      Then I tried to eliminate the error by adding UnicodeEncodeError to the except clause (alongside the existing UnicodeDecodeError and AttributeError).
      That works like a charm.
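
The workaround can be sketched without pymongo. The class below is a hypothetical stand-in, not the driver's ObjectId; only the try/except shape follows the line shown in the traceback and the except clause described in this report. The attribute name `_id` and the sample inputs are assumptions made for illustration.

```python
class FakeObjectId(object):
    """Hypothetical stand-in for bson.objectid.ObjectId (illustration only)."""

    def __init__(self):
        self._id = None

    def __setstate__(self, value):
        # Same shape as the line in the traceback: try to turn the pickled
        # state into a byte string via latin-1.
        try:
            self._id = value.encode('latin-1')
        except (UnicodeDecodeError, UnicodeEncodeError, AttributeError):
            # UnicodeEncodeError is the addition described in this report:
            # if the value cannot be expressed in latin-1, keep it unchanged.
            self._id = value


oid = FakeObjectId()
oid.__setstate__(u'O\u0447\u0458')  # characters above U+00FF, not latin-1
print(repr(oid._id))
```

Without UnicodeEncodeError in the tuple, the last call would propagate the exception instead of falling back to the raw value.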

      Then I tried to reproduce this behavior in IDLE and had no luck: with the same data I get from the db, I only see UnicodeDecodeError.
      The pickled string looks like:

      '\x80\x02}q\x01(U\x12_auth_user_backendq\x02U\x12mango.auth.Backendq\x03U\r_auth_user_idq\x04cbson.objectid\nObjectId\nq\x05)\x81q\x06U\x0cO\xad\xb5\xf7\xbcu\x9e\x10p\x00\x00\x00q\x07bu.'
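
The payload can be inspected without unpickling it, and so without pymongo installed. This is a minimal sketch written for Python 3, using the standard `pickletools` module; the helper name `summarize` is hypothetical. It lists the class the pickle refers to and recovers the 12-byte state that ends up in `__setstate__`.

```python
import binascii
import pickletools

# Payload copied from the report, written as a bytes literal.
pickled = (b'\x80\x02}q\x01(U\x12_auth_user_backendq\x02U\x12mango.auth.Backendq\x03'
           b'U\r_auth_user_idq\x04cbson.objectid\nObjectId\nq\x05)\x81q\x06'
           b'U\x0cO\xad\xb5\xf7\xbcu\x9e\x10p\x00\x00\x00q\x07bu.')


def summarize(data):
    """Collect GLOBAL and SHORT_BINSTRING arguments without loading the pickle."""
    global_names, short_strings = [], []
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name == 'GLOBAL':
            global_names.append(arg)
        elif opcode.name == 'SHORT_BINSTRING':
            short_strings.append(arg)
    return global_names, short_strings


global_names, short_strings = summarize(pickled)
# On Python 3, pickletools returns these strings decoded as latin-1,
# so encoding with latin-1 gives back the original bytes.
state = short_strings[-1].encode('latin-1')
print(global_names)
print(binascii.hexlify(state))
```

The last short string is the raw ObjectId state; its hex form can be compared with the ObjectId printed by the IDLE session.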
      

      Here is the truncated code from django.contrib.sessions.backends.base.py SessionBase.load:

      from django.utils.encoding import force_unicode
      import base64
      import pickle  # Django imports cPickle; both behave the same way
      encoded_data = base64.decodestring(force_unicode('MWYxZWIxYWI3YTg3M2YyNTlmZmQxOWI3NTIzMDRmZjlkYTU5ZDBjYjqAAn1xAShVEl9hdXRoX3Vz\n\
      ZXJfYmFja2VuZHECVRJtYW5nby5hdXRoLkJhY2tlbmRxA1UNX2F1dGhfdXNlcl9pZHEEY2Jzb24u\n\
      b2JqZWN0aWQKT2JqZWN0SWQKcQUpgXEGVQxPrbX3vHWeEHAAAABxB2J1Lg==')) #this string came from mongodb sessions collection as session_data
      hash, pickled = encoded_data.split(':', 1)  # the salt check passes behind the scenes
      pickle.loads(pickled)
      

      In IDLE this code works fine (the UnicodeDecodeError is suppressed) and the resulting dict is as expected:

      {'_auth_user_id': ObjectId('4fadb5f7bc759e1070000000'), '_auth_user_backend': 'mango.auth.Backend'}
      

      But when this runs in the Django app, I see that __setstate__ is passed some binary value (see the screenshot of the Eclipse debug variables), which throws UnicodeEncodeError when it is converted to latin-1. With UnicodeEncodeError added to the except clause, self.__id = value in the except block works correctly.
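
One way to arrive at exactly this message, offered as a guess rather than a diagnosis: in Python 2, calling .encode() on a byte string first decodes it implicitly with the process default encoding. If that default were cp1251 instead of ASCII (an assumption; nothing in this report confirms it), the implicit decode would succeed and the latin-1 encode would then fail. The sketch below spells out both steps so that it also runs on Python 3; the helper name `encode_via` is hypothetical.

```python
# The 12-byte ObjectId state from the pickled string in this report.
state = b'O\xad\xb5\xf7\xbcu\x9e\x10p\x00\x00\x00'


def encode_via(default_encoding, raw):
    """Mimic Python 2's str.encode('latin-1'): implicit decode, then encode."""
    return raw.decode(default_encoding).encode('latin-1')


for enc in ('ascii', 'cp1251'):
    try:
        encode_via(enc, state)
    except UnicodeError as exc:
        # exc.start and exc.end delimit the offending positions (end exclusive).
        print(enc, type(exc).__name__, exc.start, exc.end)
```

With an ASCII default the failure happens in the decode step (UnicodeDecodeError), which would be consistent with what IDLE shows; with the assumed cp1251 default it moves to the encode step.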

      I'm a newbie to Python and MongoDB, so maybe this is my mistake. Thanks.


      See also: https://github.com/mongodb/mongo-python-driver/commit/7474f5cde80fb0bd0e4a1bf4cbb88b4902131810#commitcomment-1354855

            Assignee: Bernie Hackett (bernie@mongodb.com)
            Reporter: Yura Ivanov (ivanovsuper)
            Votes: 0
            Watchers: 2
