[SERVER-9099] Multiple failures when creating a collection using international character sets Created: 23/Mar/13  Updated: 25/Mar/13  Resolved: 25/Mar/13

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: 2.2.3
Fix Version/s: None

Type: Bug
Priority: Major - P3
Reporter: Brandon Black
Assignee: Unassigned
Resolution: Duplicate
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
duplicates SERVER-4412 Some Command don't support Chinese Closed
Related
Operating System: OS X
Steps To Reproduce:

See description.

Participants:

 Description   

Creating a collection with Japanese characters in the name causes a few different odd failures. Below are my results from MongoDB 2.2.3, using the mongo shell to illustrate the issue.

In the first attempt, mongo will raise an invalid namespace exception:

(mongod-2.2.3) test2> db.createCollection('すでに存在してい')
{
  "errmsg": "exception: invalid ns: test2.すでに存在してい.$_id_",
  "code": 10094,
  "ok": 0
}

On the second attempt, mongo shows an error stating that the collection already exists even though it wasn't created successfully in the previous attempt:

(mongod-2.2.3) test2> db.createCollection('すでに存在してい')
{
  "errmsg": "collection already exists",
  "ok": 0
}
(mongod-2.2.3) test2> show collections
system.indexes

However, Japanese characters preceded by letters or numbers work fine:

(mongod-2.2.3) test> db.createCollection('THIS_WORKS_すでに存在してい')
{
  "ok": 1
}
(mongod-2.2.3) test> db.createCollection('999_SO_DOES_THISすでに存在してい')
{
  "ok": 1
}
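
The pattern in the sessions above (a name that starts with a Japanese character fails, while the same characters after a leading letter or digit succeed) can be written as a small check. This is only a sketch of the observed symptom on 2.2.3, not an explanation of the server-side cause. `startsWithAscii` is a hypothetical helper name, not part of MongoDB or the shell, and the snippet assumes a plain JavaScript runtime such as Node.js:

```javascript
// Hypothetical helper (not a MongoDB API): returns true when the first
// character of the name is ASCII, i.e. its code unit is below 0x80.
function startsWithAscii(name) {
  return name.length > 0 && name.charCodeAt(0) < 0x80;
}

// Collection names copied from the shell sessions above.
var names = [
  'すでに存在してい',
  'THIS_WORKS_すでに存在してい',
  '999_SO_DOES_THISすでに存在してい'
];

// Pair each name with whether it begins with an ASCII character.
var summary = names.map(function (name) {
  return { name: name, asciiPrefix: startsWithAscii(name) };
});

summary.forEach(function (row) {
  console.log(row.asciiPrefix + '\t' + row.name);
});
```

In the sessions above, the one name for which this check is false is the one that failed on 2.2.3, and the two for which it is true were created successfully.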

I've validated that the same behavior exists on 2.4.0 as well. In addition to the shell examples from above, this behavior is the same when interacting with MongoDB through the Ruby driver and the Python driver.

Even if this isn't something we support for a valid reason, it seems like we're not handling it correctly given the error messages being shown.



 Comments   
Comment by Tad Marshall [ 23/Mar/13 ]

Thanks, Brandon. I marked SERVER-4412 as fixed in 2.3.2 as well.

Comment by Brandon Black [ 23/Mar/13 ]

You're both right. I thought I had run these tests in 2.4.0 and seen the same behavior, but I must have been mistaken. I've just tried this in 2.4.0 and I can confirm it's been fixed.

Comment by Tad Marshall [ 23/Mar/13 ]

Randolph is exactly right that this is SERVER-4412. But this was actually fixed for version 2.3.2 by commit e7069c1c980a0465ef64dd108a78346a037dd81e.

I can't reproduce the problem in the current master branch:

MongoDB shell version: 2.5.0-pre-
connecting to: test
Server has startup warnings:
Sat Mar 23 15:50:14.272 [initandlisten]
Sat Mar 23 15:50:14.273 [initandlisten] ** NOTE: This is a development version (2.5.0-pre-) of MongoDB.
Sat Mar 23 15:50:14.274 [initandlisten] **       Not recommended for production.
Sat Mar 23 15:50:14.274 [initandlisten]
> db.createCollection('すでに存在してい')
{ "ok" : 1 }
> db.createCollection('っっじいいじいい')
{ "ok" : 1 }
> db.createCollection('ㄴㅇㅁㅇㅉㅎㅎㄴ마ㅏㅓㅏ')
{ "ok" : 1 }
> db.createCollection('ррфвыовфывооорв')
{ "ok" : 1 }
> db.getCollectionNames()
[
        "system.indexes",
        "ррфвыовфывооорв",
        "すでに存在してい",
        "っっじいいじいい",
        "ㄴㅇㅁㅇㅉㅎㅎㄴ마ㅏㅓㅏ"
]
>

Can you double-check 2.4.0?

Comment by Brandon Black [ 23/Mar/13 ]

My initial tests were done using a copy and paste from translated text in my browser, but here are the same tests and results using direct keyboard input in Mac OS X. I'm also dropping the entire database between test sets. We seem to struggle with international character sets in general.

Kotoeri (Japanese):

(mongod-2.2.3) test> db.createCollection('っっじいいじいい')
{
  "errmsg": "exception: invalid ns: test.っっじいいじいい.$_id_",
  "code": 10094,
  "ok": 0
}
(mongod-2.2.3) test> db.createCollection('っっじいいじいい')
{
  "errmsg": "collection already exists",
  "ok": 0
}
(mongod-2.2.3) test> db.createCollection('000っっじいいじいい')
{ "ok" : 1 }
(mongod-2.2.3) test> db.createCollection('AAAっじいいじいい')
{ "ok" : 1 }

Hangul (Korean):

(mongod-2.2.3) test> db.createCollection('ㄴㅇㅁㅇㅉㅎㅎㄴ마ㅏㅓㅏ')
{
   "errmsg" : "exception: invalid ns: test.ㄴㅇㅁㅇㅉㅎㅎㄴ마ㅏㅓㅏ.$_id_",
   "code" : 10094,
   "ok" : 0
}
(mongod-2.2.3) test> db.createCollection('ㄴㅇㅁㅇㅉㅎㅎㄴ마ㅏㅓㅏ')
{ 
  "errmsg" : "collection already exists",
  "ok" : 0
}
(mongod-2.2.3) test> db.createCollection('AAAAっっじいいじいい')
{ "ok" : 1 }
(mongod-2.2.3) test> db.createCollection('9999っじいいじいい')
{ "ok" : 1 }

Russian:

(mongod-2.2.3) test> db.createCollection('ррфвыовфывооорв')
{
   "errmsg" : "exception: invalid ns: test.ррфвыовфывооорв.$_id_",
   "code" : 10094,
   "ok" : 0
}
(mongod-2.2.3) test> db.createCollection('ррфвыовфывооорв')
{ 
   "errmsg" : "collection already exists",
   "ok" : 0 
}
(mongod-2.2.3) test> db.createCollection('999ррфвыовфывооорв')
{ "ok" : 1 }
(mongod-2.2.3) test> db.createCollection('AAAррфвыовфывооорв')
{ "ok" : 1 }

Comment by Randolph Tan [ 23/Mar/13 ]

Most operating systems have keyboard settings for configuring input methods for Asian characters. But if you don't want to bother with that, you can simply copy and paste, since it works. http://translate.google.com is perfect for this: just input an English word and copy and paste the translated text.

Comment by Tad Marshall [ 23/Mar/13 ]

Which operating systems did you test this on? The ticket says "ALL".

How did you generate the Japanese collection names (e.g. copied and pasted from <name of> application, typed them in using hex Unicode codes, used a Chinese keyboard Input Method Editor, or something else)?
