[SERVER-4329] uncaught exception in mapreduce causes mongod to terminate Created: 19/Nov/11 Updated: 11/Jul/16 Resolved: 21/Nov/11 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | MapReduce |
| Affects Version/s: | None |
| Fix Version/s: | 2.1.0 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Antoine Girbal | Assignee: | Antoine Girbal |
| Resolution: | Done | Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: | |
| Operating System: | ALL |
| Participants: | |
| Description |
|
Root cause:

Tue Nov 15 08:00:21 [conn202500] ERROR: Uncaught std::exception: could not initialize cursor across all shards because : socket exception @ prod1/aboutmemng-m01.db.aol.com:27017,aboutmemng-d01.db.aol.com:27017,ec2-184-73-54-43.compute-1.amazonaws.com:27017, terminating

Looks like map/reduce isn't handling certain socket errors correctly. |
| Comments |
| Comment by Antoine Girbal [ 21/Nov/11 ] |
|
This commit should also prevent db termination: commit 617e9ff8ec1ecd134c7e6e42c85983ff8873a30d ("Catch DBException separate from std::exception"). |
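For readers unfamiliar with the pattern named in that commit message, here is a minimal sketch of catching a database-specific exception separately from `std::exception`. It is not the code from the commit: `DbError` is a hypothetical stand-in for the server's `DBException`, and `describeOutcome` and the error code used below are made up for illustration.

```cpp
#include <exception>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the server's DBException: carries an error code
// in addition to the message.
class DbError : public std::runtime_error {
public:
    DbError(int code, const std::string& msg)
        : std::runtime_error(msg), code_(code) {}
    int code() const { return code_; }
private:
    int code_;
};

// Runs `fn` and describes the outcome. The more specific handler must come
// first; otherwise the std::exception handler would also match DbError.
template <typename Fn>
std::string describeOutcome(Fn fn) {
    try {
        fn();
        return "ok";
    } catch (const DbError& e) {
        // Expected operational failure (e.g. a socket error): report it.
        return "db error " + std::to_string(e.code()) + ": " + e.what();
    } catch (const std::exception& e) {
        // Anything else derived from std::exception.
        return std::string("unexpected error: ") + e.what();
    }
}
```

With this ordering, a call such as `describeOutcome([] { throw DbError(42, "socket exception"); })` takes the first handler, while a `std::logic_error` falls through to the second.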
| Comment by Antoine Girbal [ 21/Nov/11 ] |
|
Actually, this is fixed by commit d869bd9bb787707eefd650c6b59ecfdd2686d9d4 ("try/catch around all command calls"). |
| Comment by Antoine Girbal [ 21/Nov/11 ] |
|
Actually, it is easy to reproduce on the 2.0 line. I could not trigger the termination from HEAD, so I have to figure out which change fixed it. |
| Comment by Antoine Girbal [ 19/Nov/11 ] |
|
I cannot reproduce this issue easily. Commit 4d8ee4cc7c4d32ace1b1cab403dd429d9467a677 ("parallel cursor recover gracefully from replica set and other errors"). |
| Comment by Antoine Girbal [ 19/Nov/11 ] |
|
A try/catch can easily be added at that spot, but it would be better to have a general solution so that a missing try/catch cannot terminate mongod.
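
One way to picture such a general solution is a single guard through which every command body is invoked. The sketch below is an illustration only, not mongod's actual code: `CommandResult` and `runGuarded` are hypothetical names, and a real server would also log the failure and map it to a proper error code.

```cpp
#include <exception>
#include <functional>
#include <stdexcept>
#include <string>

// Hypothetical result type: `ok` plus an error message when the call failed.
struct CommandResult {
    bool ok;
    std::string errmsg;
};

// Hypothetical wrapper: every command body goes through this one function, so
// a handler that forgets its own try/catch still cannot let an exception
// escape the calling thread and reach std::terminate.
CommandResult runGuarded(const std::function<void()>& body) {
    try {
        body();
        return {true, ""};
    } catch (const std::exception& e) {
        // Covers socket errors and any other std::exception subclass.
        return {false, e.what()};
    } catch (...) {
        // Anything not derived from std::exception.
        return {false, "unknown exception"};
    }
}
```

A caller would then write something like `CommandResult r = runGuarded([] { /* run the command */ });` and return `r.errmsg` to the client when `r.ok` is false, instead of letting the exception propagate.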