[SERVER-18248] < 100 chars are still too large to index if weird chars or messed up encoding…? Created: 29/Apr/15 Updated: 26/May/15 Resolved: 26/May/15 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | MapReduce |
| Affects Version/s: | 3.0.2 |
| Fix Version/s: | None |
| Type: | Question | Priority: | Major - P3 |
| Reporter: | Viktor Hedefalk | Assignee: | Sam Kleinman (Inactive) |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Participants: |
| Description |
|
Hi, I'm running a map/reduce to collect a large set of log entries of user input. After moving from 2.4 or so to 3.0.2, a lot of problems with fields being too long to index appeared. We then cut the large ones down to prefixes, and that seemed to work. When running this map/reduce, however, I get: 2015-04-29T09:40:46.014+0200 E QUERY Error: map reduce failed:{ But these fields are less than 100 chars: '}}).count() We have a lot of these cases with weird encodings; I think this one is the beginning of Swedish "2 ägg", which means "2 eggs". My guess is that mongo does some internal tree encoding that makes these unusual characters take up a lot of space, so the overhead makes < 100 chars more than 1024 bytes. What can I do? I could probably live with losing these log entries, but I don't know how to identify them all. |
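The character-count versus byte-count question raised above can be checked outside the database. The following is a minimal sketch, not the server's actual key-size calculation: the helper names (`utf8_size`, `oversized`, `mangle`) are hypothetical, the 1024 figure is taken from the description, and a real index entry carries some overhead beyond the string bytes, so the comparison is only approximate. It assumes the values are stored as UTF-8 and simulates the "weird encoding" as UTF-8 bytes that were read back as Latin-1.

```python
# Hypothetical helpers for illustration; not part of MongoDB or any driver.
INDEX_KEY_LIMIT_BYTES = 1024  # figure quoted in the description above


def utf8_size(text: str) -> int:
    """Return the number of bytes `text` occupies when encoded as UTF-8."""
    return len(text.encode("utf-8"))


def oversized(values, limit=INDEX_KEY_LIMIT_BYTES):
    """Return the values whose UTF-8 encoding is at least `limit` bytes."""
    return [v for v in values if utf8_size(v) >= limit]


def mangle(text: str) -> str:
    """Simulate one round of mis-decoding: UTF-8 bytes read as Latin-1."""
    return text.encode("utf-8").decode("latin-1")


sample = "2 ägg"
print(len(sample), utf8_size(sample))    # 5 characters, 6 bytes

broken = mangle(sample)
print(len(broken), utf8_size(broken))    # 6 characters, 8 bytes

# Worst case for UTF-8: every code point needs 4 bytes.
worst = "\U0001F95A" * 99
print(len(worst), utf8_size(worst))      # 99 characters, 396 bytes
```

Under these assumptions, a value shorter than 100 code points cannot exceed 400 bytes in UTF-8, and a mis-decoded value grows in character count as well as byte count. Byte length alone therefore does not explain a 1024-byte failure for such fields, although a helper like `oversized` could still be used to find values that really are too long.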
| Comments |
| Comment by Sam Kleinman (Inactive) [ 26/May/15 ] |
|
I'm glad that you've been able to resolve this, and sorry for the confusion. I'm going to go ahead and close this ticket. Feel free to reopen if you run into this again or open a new ticket as needed. Cheers, |
| Comment by Viktor Hedefalk [ 21/May/15 ] |
|
Hi @Ramon, I was able to get around it by wiping my mongo installation. It seems some temp crap was left behind even though I had removed the failing data. "kostbevakningen.tmp.mr.logentrys_0_inc.$_temp_0 1057" sounds like something other than the "real" data. |
| Comment by Ramon Fernandez Marina [ 21/May/15 ] |
|
Hi hedefalk, we haven't heard back from you for some time. If this is still an issue for you, can you please answer Sam's question above about a reproducer? Thanks, |
| Comment by Sam Kleinman (Inactive) [ 07/May/15 ] |
|
Hello, Thanks for reporting this issue. Can you provide sample data and/or a small script that we could use to reproduce the issue? This will help us understand the problem much more clearly. Regards, |