Python Driver / PYTHON-192

document behaviour wrt Unicode



    • Type: Task
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.0
    • Component/s: None
    • Labels:
    • Environment:
      Python (platform-independent)


      In July 2009, Michael Dirolf wrote the following about how the Python driver handles strings and Unicode strings:

      "This issue is a bit tricky because of Python's (at least Python < 3.0) string vs unicode vs binary data handling. The reason the driver converts everything to unicode is because the database stores utf-8. So when you save a regular string it is assumed to be utf-8 data and saved as such. When you save a unicode instance it is encoded to utf-8 and saved. On the decoding end we then just convert everything back to unicode because that is sort of the lowest common denominator for the different datatypes that could have been saved originally."

      As far as I can tell, this is not explicitly documented. The PyMongo v1.9+ documentation does not mention this behaviour, or am I overlooking something?
      Proposal: document this clearly in the tutorial. The examples clearly show that you enter normal Python text strings but get Unicode strings back from the database,
      but a sentence or two emphasizing this would be appropriate.
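
To make the round trip described in the quote concrete, here is a minimal sketch of the idea. It is not PyMongo's code: the helper names `encode_for_storage` and `decode_from_storage` are hypothetical, and the sketch is written for Python 3, where `bytes` stands in for a Python 2 `str` and `str` stands in for a Python 2 `unicode` instance.

```python
# Illustration only: these helpers are hypothetical and not part of PyMongo.
# Python 3 `bytes` stands in for a Python 2 `str`;
# Python 3 `str` stands in for a Python 2 `unicode` instance.


def encode_for_storage(value):
    """Return the UTF-8 bytes that would be saved for a string value."""
    if isinstance(value, bytes):
        # A regular byte string is assumed to already be UTF-8
        # and is saved as such.
        return value
    if isinstance(value, str):
        # A Unicode string is encoded to UTF-8 before saving.
        return value.encode("utf-8")
    raise TypeError("expected bytes or str, got %r" % type(value))


def decode_from_storage(raw):
    """Everything read back is converted to a Unicode string."""
    return raw.decode("utf-8")


text_input = "caf\u00e9"                   # Unicode string
byte_input = text_input.encode("utf-8")    # same text as UTF-8 bytes

stored_from_text = encode_for_storage(text_input)
stored_from_bytes = encode_for_storage(byte_input)

# Both inputs end up as the same stored bytes, so the original type
# cannot be recovered and the value always comes back as Unicode.
result_from_text = decode_from_storage(stored_from_text)
result_from_bytes = decode_from_storage(stored_from_bytes)
```

Under these assumptions, a byte string and a Unicode string with the same content are indistinguishable once stored, which is why the value read back is always Unicode regardless of what was saved.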



