[SERVER-1056] utf8 error causes Segmentation fault while deleting a Map/Reduce temporary collection Created: 27/Apr/10  Updated: 12/Jul/16  Resolved: 28/Apr/10

Status: Closed
Project: Core Server
Component/s: Stability
Affects Version/s: 1.4.1
Fix Version/s: 1.5.1

Type: Bug Priority: Critical - P2
Reporter: Nicolas Fouché Assignee: Eliot Horowitz (Inactive)
Resolution: Done Votes: 3
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

fedora 9 x86_64


Attachments: File mr_function.rb     Text File shard1-001.txt     Text File shard1-002.txt    
Participants:

 Description   

Replica-pair setup: production-shard1-002 and production-shard1-002.silentale.net
Before the first segmentation fault, shard1-002 is the master.

I launch a Map/Reduce command with the Ruby driver. The command fails, I receive this exception: #<Mongo::OperationFailure: map-reduce failed: assertion: invalid utf8>.
Then I launch collection.count, I get a proper result.
And finally, I launch the same Map/Reduce command, but with less data (thanks to a query), and mongod crashes. I know that this last Map/Reduce command behaves correctly when launched on a freshly started server.

Tue Apr 27 14:05:25 CMD: drop random_db_name.tmp.mr.mapreduce_1272377125_2
Tue Apr 27 14:05:25 CMD: drop random_db_name.tmp.mr.mapreduce_1272377125_2_inc
Tue Apr 27 14:05:25 Got signal: 11 (Segmentation fault).
Tue Apr 27 14:05:25 Backtrace:
0x6a8309 0x33042322a0 0x7400f9 0x7a02c3 0x79dbbb 0x71388b 0x713b5a 0x714c4f 0x714bdc 0x714ac5 0x54d691 0x54f433 0x649bdc 0x67e441 0x67f347 0x55662f 0x55a6cc 0x5f310a 0x5f8137 0x6a9894

The sad part is that I did not notice that the master crashed (auto-reconnect), and I crashed the second pair right after

I attached the logs of the two servers, and the Map/Reduce code in a Ruby file.

I can reproduce the problem on my laptop (MongoDB 1.4.1 64bits OSX) with the same collection, and a single mongod instance. But it does not raise a segmentation fault, mongod just gently kills itself.

Tue Apr 27 18:19:09 connection accepted from 127.0.0.1:60353 #1
Tue Apr 27 18:19:09 connection accepted from 127.0.0.1:60354 #2
Tue Apr 27 18:19:09 CMD: drop random_db_name.tmp.mr.mapreduce_1272385149_1
Tue Apr 27 18:19:09 CMD: drop random_db_name.tmp.mr.mapreduce_1272385149_1_inc
decode failed. probably invalid utf-8 string ["c?cilegigi63"@orange.fr]
why: TypeError: malformed UTF-8 character sequence at offset 2
Tue Apr 27 18:19:10 mr failed, removing collection
Tue Apr 27 18:19:10 CMD: drop random_db_name.tmp.mr.mapreduce_1272385149_1
Tue Apr 27 18:19:10 CMD: drop random_db_name.tmp.mr.mapreduce_1272385149_1_inc
Tue Apr 27 18:19:10 query random_db_name.$cmd ntoreturn:1 command reslen:89 641ms
Tue Apr 27 18:19:15 CMD: drop random_db_name.tmp.mr.mapreduce_1272385155_2
Tue Apr 27 18:19:15 CMD: drop random_db_name.tmp.mr.mapreduce_1272385155_2_inc
Assertion failure: !oldfp->dormantNext, at jsinterp.c:1620
Tue Apr 27 18:19:15 Got signal: 6 (Abort trap).
Tue Apr 27 18:19:15 Backtrace:
0x10021983c 0x7fff800693fa 0x1f6 0x10021a1b5 0x10021f21d 0x10021f56d 0x10022066f 0x100000b74
0 mongod 0x000000000021983c _ZN5mongo10abruptQuitEi + 332
1 libSystem.B.dylib 0x00000000800693fa _sigtramp + 26
2 ??? 0x00000000000001f6 0x0 + 502
3 mongod 0x000000000021a1b5 _ZN5mongo6listenEi + 597
4 mongod 0x000000000021f21d _ZN5mongo14_initAndListenEiPKc + 1693
5 mongod 0x000000000021f56d _ZN5mongo13initAndListenEiPKc + 29
6 mongod 0x000000000022066f main + 3583
7 mongod 0x0000000000000b74 start + 52
Tue Apr 27 18:19:15 dbexit:
Tue Apr 27 18:19:15 shutdown: going to close listening sockets...
Tue Apr 27 18:19:15 going to close listening socket: 7
Tue Apr 27 18:19:15 Listener on port 28017 aborted.
Tue Apr 27 18:19:15 going to close listening socket: 8
Tue Apr 27 18:19:15 shutdown: going to flush oplog...
Tue Apr 27 18:19:15 shutdown: going to close sockets...
Tue Apr 27 18:19:15 shutdown: waiting for fs preallocator...
Tue Apr 27 18:19:15 MessagingPort recv() errno:9 Bad file descriptor 127.0.0.1:60353
Tue Apr 27 18:19:15 shutdown: closing all files...
Tue Apr 27 18:19:15 end connection 127.0.0.1:60353
Tue Apr 27 18:19:15 closeAllFiles() finished
Tue Apr 27 18:19:15 shutdown: removing fs lock...
Tue Apr 27 18:19:15 dbexit: really exiting now

So I suppose that this problems appears as soon as a previous map/reduce command fails.



 Comments   
Comment by Nicolas Fouché [ 29/Apr/10 ]

Thanks, with the nightly build, I can rescue the errors in Javascript, and ignore them until I fix the documents in our database. Here are the logs I get:

Thu Apr 29 10:26:14 connection accepted from 127.0.0.1:62770 #9
Thu Apr 29 10:26:14 connection accepted from 127.0.0.1:62771 #10
Thu Apr 29 10:26:14 CMD: drop veronica_production.tmp.mr.mapreduce_1272529574_8
Thu Apr 29 10:26:14 CMD: drop veronica_production.tmp.mr.mapreduce_1272529574_8_inc
decode failed. probably invalid utf-8 string ["c?cilegigi63"@orange.fr]
why: TypeError: malformed UTF-8 character sequence at offset 2
=== Error catched by the JS map function: Error, for document: 4ad044b00f8bc5399300002c ===
decode failed. probably invalid utf-8 string ["c?cilegigi"@orange.fr]
why: TypeError: malformed UTF-8 character sequence at offset 2
=== Error catched by the JS map function: Error, for document: 4ad044b00f8bc5399300002c ===
Thu Apr 29 10:26:16 building new index on

{ 0: 1 }

for veronica_production.tmp.mr.mapreduce_1272529574_8_inc
Thu Apr 29 10:26:16 Buildindex veronica_production.tmp.mr.mapreduce_1272529574_8_inc idxNo:0 { ns: "veronica_production.tmp.mr.mapreduce_1272529574_8_inc", key:

{ 0: 1 }

, name: "0_1" }
Thu Apr 29 10:26:17 done for 7010 records 0.045secs

I still have a question: what about the segmentation fault problem ? Now that we can rescue this particular utf-8 problem in Javascript, what about other problems that could occur ? Would mongod still crash ?

Comment by auto [ 28/Apr/10 ]

Author:

{'login': 'erh', 'name': 'Eliot Horowitz', 'email': 'eliot@10gen.com'}

Message: handle utf8 errors better in js, try to convert to js error SERVER-1063 SERVER-1056
http://github.com/mongodb/mongo/commit/071f4015b4902bf17c0be268e38c0a26d4a3d29a

Comment by Nicolas Fouché [ 28/Apr/10 ]

If a throw a Javascript exception in my map method, logs show:

Wed Apr 28 12:15:22 JS Error: uncaught exception: blah

And I can launch any M/R I want after that. So the problem seems to occur only when a exception is launched in the C code. In this case, it's in engine_spidermonkey.cpp, and I cannot catch it in Javascript (with a try/catch block).

throw UserException( 9006 , "invalid utf8" );

Comment by Nicolas Fouché [ 27/Apr/10 ]

By the way, if you know a solution to the first Map/Reduce fail, "invalid utf-8 string", I take it Because I'm stuck until I can make this M/R command pass.

Comment by Nicolas Fouché [ 27/Apr/10 ]

Forgot to update the subject of the ticket, it's not quite exact. I let you change it to something like: "Segmentation fault if a former Map/Reduce query fails".

Generated at Thu Feb 08 02:55:56 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.