[SERVER-1555] Add UTF-8 Validation Option on mongodump and mongorestore Created: 03/Aug/10 Updated: 29/May/12 Resolved: 17/Nov/10 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | 1.4.4 |
| Fix Version/s: | None |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | J. Gray | Assignee: | Kristina Chodorow (Inactive) |
| Resolution: | Incomplete | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Environment: |
Ubuntu @ EC2 |
||
| Participants: |
| Description |
|
I recently made a database dump that I was apparently able to reimport, but which gave me an "Invalid UTF-8 Data" error when I tried to query it after restoration. Either the dump or the restoration failed, but I don't know which, because I can't tell whether the dump produced a file of valid data. Similarly, I couldn't tell whether the restoration had succeeded when I restored the data. |
| Comments |
| Comment by Kristina Chodorow (Inactive) [ 17/Nov/10 ] |
|
Please comment if this is still a problem for you. |
| Comment by Eliot Horowitz (Inactive) [ 29/Sep/10 ] |
|
Any more info on this? |
| Comment by Eliot Horowitz (Inactive) [ 03/Aug/10 ] |
|
There is a bsondump utility if you compile from master. It will be included in 1.6.0. |
| Comment by J. Gray [ 03/Aug/10 ] |
|
The shell is where I'm seeing this error; I've been writing data via PHP and then reading it at the shell so I can get an idea of what the query should look like before putting it onto the web with PHP. My understanding is that the PHP driver I'm using doesn't allow the insertion of bad data. What do you think would cause this error if not dump/restore? I asked at 10gen office hours last week whether there was any way to view/edit a dump file and got the impression there isn't (actually, they did say something about a Ruby-based viewer that doesn't allow editing), so I guess I'm looking for suggestions on how to verify that my dump file is good before deleting my db and/or how to verify that a dump file will restore properly. Please let me know if anything comes to mind. |
| Comment by Eliot Horowitz (Inactive) [ 03/Aug/10 ] |
|
The odds of it being dump/restore are incredibly small. |
| Comment by J. Gray [ 03/Aug/10 ] |
|
I don't have the original database anymore, so I cannot confirm that it didn't have the same issue, but I was only loading data in with the PHP driver and never saw the error message, which leads me to speculate that I didn't have this issue prior to the dump. Given that I had the dbpath and the dump on the same 10GB EBS volume, it seems most likely that I wrote a dump file that ended unexpectedly or that the restoration of the dump file ended unexpectedly, and I'm hoping for a means to determine whether it's the dump file or the restoration that leads to the "Invalid UTF-8" error. |
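One way to test the "dump file ended unexpectedly" theory is to walk the dump and check that every document is complete. This is a minimal sketch, not part of mongodump or mongorestore. It assumes the dump is a plain sequence of BSON documents, each starting with a little-endian int32 total length and ending with a null byte, as the BSON specification describes. The function name `check_bson_framing` is hypothetical. It only detects truncation or broken framing; it does not validate UTF-8 inside string fields.

```python
import struct
import sys


def check_bson_framing(path):
    """Return (document_count, problem); problem is None if framing looks intact.

    Hypothetical helper: checks length prefixes and terminators only,
    not the contents of any field.
    """
    count = 0
    with open(path, "rb") as f:
        while True:
            header = f.read(4)
            if not header:
                return count, None  # clean end of file
            if len(header) < 4:
                return count, "file ends inside a length prefix"
            # The total length includes the 4-byte prefix itself.
            (length,) = struct.unpack("<i", header)
            if length < 5:
                return count, "implausible document length %d" % length
            body = f.read(length - 4)
            if len(body) < length - 4:
                return count, "file ends inside document %d" % (count + 1)
            if body[-1] != 0:
                return count, "document %d lacks its terminating null byte" % (count + 1)
            count += 1


if __name__ == "__main__" and len(sys.argv) > 1:
    docs, problem = check_bson_framing(sys.argv[1])
    print(docs, "documents;", problem or "framing looks intact")
```

If the script reports intact framing and the document count matches the source collection, truncation of the dump file is unlikely, which would point toward the data itself or the restoration step.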
| Comment by Eliot Horowitz (Inactive) [ 03/Aug/10 ] |
|
Are you sure the original database doesn't have the same issue? We don't modify BSON on the way in or out, so I don't think this is a dump/restore issue. What driver are you using? |