- Type: Bug
- Resolution: Unresolved
- Priority: Major - P3
- None
- Affects Version/s: None
- Component/s: Internal Code
- Environment: Ubuntu 14.04 / AWS EC2
- Query Execution
- ALL
- Platforms 15 (06/03/16)
- (copied to CRM)
With a unique index on a field that holds Korean content,
the driver raises an error when a document with duplicate content is inserted.
See below:
cswcsy@niklane-Samsung-Ubuntu:~/crawlers/CrawlerPlatform/utils$ python mongo_test.py
<pymongo.results.InsertOneResult object at 0x7fba9819e500>
cswcsy@niklane-Samsung-Ubuntu:~/crawlers/CrawlerPlatform/utils$ python mongo_test.py
Traceback (most recent call last):
File "mongo_test.py", line 24, in <module>
result = col.insert_one(script)
File "/usr/local/lib/python2.7/dist-packages/pymongo/collection.py", line 625, in insert_one
bypass_doc_val=bypass_document_validation),
File "/usr/local/lib/python2.7/dist-packages/pymongo/collection.py", line 530, in _insert
check_keys, manipulate, write_concern, op_id, bypass_doc_val)
File "/usr/local/lib/python2.7/dist-packages/pymongo/collection.py", line 512, in _insert_one
check_keys=check_keys)
File "/usr/local/lib/python2.7/dist-packages/pymongo/pool.py", line 218, in command
self._raise_connection_failure(error)
File "/usr/local/lib/python2.7/dist-packages/pymongo/pool.py", line 346, in _raise_connection_failure
raise error
bson.errors.InvalidBSON: 'utf8' codec can't decode byte 0xeb in position 230: invalid continuation byte
cswcsy@niklane-Samsung-Ubuntu:~/crawlers/CrawlerPlatform/utils$
As shown above, this reproduces 100% of the time when I insert the same document twice
into the collection with the unique Korean field.
It does not reproduce when I use different Korean content.
Here is my test code:
# -*- coding: utf-8 -*-
from pprint import pprint
from pymongo import ReplaceOne
from pymongo import InsertOne
import pymongo
from pymongo import MongoClient
from utils.mongomanager import MongoManager
from pymongo.errors import BulkWriteError

mongo = MongoClient('localhost', 27017)
db = mongo['bigdata']
col = db['test']
script = {
    'brand_name': u'\ub77c\uc628',
    'category0': u'\uc0dd\ud65c/\uac74\uac15',
    'category1': u'\uacf5\uad6c',
    'category2': u'\ubaa9\uacf5\uacf5\uad6c',
    'category3': u'\ub300\ud328',
    'entity': [],
    'price': 9300,
    'title': u'\uad6c \uad6d\uc0b0 \ub300\ud328 \uc190\ub300\ud328 \ubaa9\uacf5\uacf5\uad6c \ubbf8\ub2c8\ub300\ud328 \ubaa8\uc11c\ub9ac\ub300\ud328 \ub300\ud328\ub0a0 \ubaa9\uacf5\uad6c \uc804\ub3d9\ub300\ud328 \ubaa9\uc218\uacf5\uad6c \ubaa9\uacf5\uc608 \ud648\ub300\ud328 DIY\uacf5\uad6c \ud3c9\uba74 \ub2e4\ub4ec\uae30',
}
result = col.insert_one(script)
pprint(result)
If you need more information or have a solution for this issue, please reply.
Thanks a lot.
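
The linked issues SERVER-55442 and PYTHON-1090 suggest a likely mechanism: the duplicate key error message embeds the user-supplied value, cut at a byte boundary, so a multi-byte Korean character can be split and strict UTF-8 decoding of the reply fails. The sketch below reproduces only that decoding behaviour with the standard library; it does not talk to MongoDB. The 4-byte cut and the trailing `" }` text are assumptions chosen for illustration, not the server's actual limit or message format.

```python
# -*- coding: utf-8 -*-
# Illustration only: the cut position (4 bytes) and the trailing '" }' are
# assumed values, not the server's real truncation limit or message layout.

title = u'\ub300\ud328\ub0a0'      # three Hangul syllables, 3 bytes each in UTF-8
encoded = title.encode('utf-8')

# Keep the first character whole, keep only the lead byte of the second,
# then append ASCII text the way an error message would continue.
message = encoded[:4] + b'" }'


def decode_strict(data):
    """Return the decoded text, or the UnicodeDecodeError that strict decoding raises."""
    try:
        return data.decode('utf-8')
    except UnicodeDecodeError as exc:
        return exc


strict_result = decode_strict(message)

# The 'replace' handler substitutes U+FFFD for the broken character instead of raising.
lossy_result = message.decode('utf-8', 'replace')

print(repr(strict_result))
print(repr(lossy_result))
```

Strict decoding fails with an "invalid continuation byte" error, the same kind reported in the traceback, while the `'replace'` handler returns a readable message with U+FFFD in place of the split character. This is the behaviour PYTHON-1090 proposes for decoding write responses.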
- causes
  - RUBY-2560 EncodingError raised when server returns invalid UTF-8 in error messages derived from user input (Backlog)
  - DRIVERS-1936 Drivers should have option to disable UTF-8 validation for BSON strings (Backlog)
- is duplicated by
  - CDRIVER-2453 Invalid bson returned in bulk operation reply in some cases (Closed)
  - SERVER-55442 Server returns invalid utf-8 in duplicate key error message after truncating user input (Closed)
- is related to
  - COMPASS-8491 I keep getting and error when i try to use MongoDB compass (Debugging With Submitter)
  - NODE-3627 Getting "Invalid UTF-8 string in BSON document" instead on unique constraint error on bulkWrite.replaceOne (Closed)
  - SERVER-26050 Unique key violation for index with a non-simple collation has unclear error message (Backlog)
  - DRIVERS-2008 Default to lossy/replacement behavior when decoding UTF-8 in writeErrors (Backlog)
  - PYTHON-1090 Use 'replace' error handler when decoding write responses (Closed)
- related to
  - PYTHON-1682 Unicode errors from server are improperly encoded in exceptions (Closed)
  - RUST-886 Use Lossy UTF8 Decoding when decoding writeErrors returned from the server (Closed)
  - RUST-648 Decoding a a document with lossy utf8 conversion #226 (Closed)