[SERVER-13772] Detached map-reduce jobs Created: 28/Apr/14  Updated: 06/Dec/22  Resolved: 01/Feb/16

Status: Closed
Project: Core Server
Component/s: MapReduce
Affects Version/s: None
Fix Version/s: None

Type: New Feature Priority: Major - P3
Reporter: Alexander Komyagin Assignee: Backlog - Storage Execution Team
Resolution: Won't Fix Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
Assigned Teams:
Storage Execution
Participants:

 Description   

For very large, long-running map-reduce jobs (say, running for days), it would be useful to be able to detach the operation from the shell or driver and to have some other way of learning whether the job has completed or failed.



 Comments   
Comment by Geert Bosch [ 01/Feb/16 ]

This request is not specific to map-reduce jobs. Other operations, such as aggregations, multi-updates, or even regular queries, may also take a long time and have the same issue. The drawback of allowing detached operations is that unattended jobs become more likely to run forever. This would be a particular issue with commands that use server-side JavaScript. Furthermore, commands should be expected to have a higher chance of failure the longer they run: an election in a replica set or some administration commands will also terminate any running commands.

It may be better to run very long commands in steps, with each step saving intermediate results.
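
A minimal sketch of the step-wise idea, written as standalone Python rather than against the server: the function name run_in_steps, the JSON checkpoint file, and the word-count workload are all assumptions made for illustration and are not part of MongoDB or of this ticket. It processes the input in batches and saves the intermediate result after each batch, so an interrupted run can resume from the last completed step.

```python
import json
import os
import tempfile
from collections import Counter


def run_in_steps(records, checkpoint_path, batch_size=2):
    """Count words across `records` in batches, checkpointing after each batch.

    Hypothetical helper for illustration only; not a MongoDB API.
    """
    # Resume from the last saved checkpoint, if one exists.
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            state = json.load(f)
    else:
        state = {"next_index": 0, "counts": {}}

    counts = Counter(state["counts"])
    start = state["next_index"]

    for begin in range(start, len(records), batch_size):
        batch = records[begin:begin + batch_size]
        for text in batch:
            # Map each record to words and fold them into the running totals.
            counts.update(text.split())

        # Save intermediate results; write to a temp file and rename it so
        # a crash mid-write does not leave a truncated checkpoint behind.
        state = {"next_index": begin + len(batch), "counts": dict(counts)}
        tmp_path = checkpoint_path + ".tmp"
        with open(tmp_path, "w") as f:
            json.dump(state, f)
        os.replace(tmp_path, checkpoint_path)

    return dict(counts)
```

If the process stops after some batches, calling run_in_steps again with the same checkpoint path skips the records already counted and continues from the saved index. The sketch assumes the input order is stable between runs; a real job would also need a status record that a client can poll.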

Comment by Charlie Page [ 28/Apr/14 ]

Can we expand this to anything that uses $out, or to some other way of returning what are essentially async results?

Comment by Alexander Komyagin [ 28/Apr/14 ]

I imagine that if a client goes down, the MR job will continue executing. However, there will be no way to track the status of the job.
