[SERVER-13772] Detached map-reduce jobs Created: 28/Apr/14 Updated: 06/Dec/22 Resolved: 01/Feb/16 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | MapReduce |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | New Feature | Priority: | Major - P3 |
| Reporter: | Alexander Komyagin | Assignee: | Backlog - Storage Execution Team |
| Resolution: | Won't Fix | Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: | |
| Assigned Teams: | Storage Execution |
| Participants: | |
| Description |
|
For very large, long-running map-reduce jobs (say, ones running for days), it might be useful to be able to detach the operation from the shell/driver and to have some other means of indicating that the job has completed or failed. |
| Comments |
| Comment by Geert Bosch [ 01/Feb/16 ] |
|
This request is not specific to map-reduce jobs. Other operations, such as aggregations, multi-updates, or even regular queries, may also take a long time and have the same issue. The drawback of allowing detached operations is that unattended jobs become more likely to run forever. This would be a particular issue with commands that use server-side JavaScript. Furthermore, commands should be expected to have a higher chance of failure the longer they run: an election in a replica set or some administration commands will also terminate any running commands. It may be better to run commands with very long run times in steps, with each step saving intermediate results. |
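The stepwise approach suggested in the comment above can be illustrated with a minimal Python sketch. It is not MongoDB code and not anything proposed in the ticket: `run_in_steps`, `count_words`, and the JSON checkpoint file are hypothetical names chosen for illustration, and the sketch assumes that partial results are numeric counts that can be merged by addition. Each step processes one chunk and persists the running result, so an interrupted job can resume from the last saved step instead of starting over.

```python
import json
import os


def run_in_steps(items, step_size, checkpoint_path, process_chunk):
    """Process `items` chunk by chunk, saving intermediate results after each step.

    Hypothetical helper: assumes process_chunk returns a dict of numeric
    counts that can be merged into the running result by addition.
    """
    # Resume from an earlier checkpoint if one exists, otherwise start fresh.
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            state = json.load(f)
    else:
        state = {"next_index": 0, "result": {}}

    while state["next_index"] < len(items):
        start = state["next_index"]
        chunk = items[start:start + step_size]

        # Merge this step's partial result into the running result.
        for key, value in process_chunk(chunk).items():
            state["result"][key] = state["result"].get(key, 0) + value
        state["next_index"] = start + len(chunk)

        # Write to a temporary file and rename it, so an interruption
        # never leaves a half-written checkpoint behind.
        tmp_path = checkpoint_path + ".tmp"
        with open(tmp_path, "w") as f:
            json.dump(state, f)
        os.replace(tmp_path, checkpoint_path)

    return state["result"]


def count_words(chunk):
    """Example step function: count occurrences of each word in one chunk."""
    counts = {}
    for word in chunk:
        counts[word] = counts.get(word, 0) + 1
    return counts
```

Calling `run_in_steps(words, 1000, "job.checkpoint.json", count_words)` a second time after a failure would pick up at the saved `next_index`. The checkpoint file also doubles as a crude status indicator, since it records how far the job has progressed.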
| Comment by Charlie Page [ 28/Apr/14 ] |
|
Can we expand this to anything that uses $out, or to some other way of returning what are essentially async results? |
| Comment by Alexander Komyagin [ 28/Apr/14 ] |
|
I imagine that if a client goes down, the MR job will continue executing. However, there will be no way to track the status of the job.