Handle Tasks that may Fail
Question
How do I design steps to handle potential task failures at runtime?
Solution
Metaflow has two decorators that address this.
1 Using @retry and @catch
Place Metaflow's @retry decorator above a step definition to rerun the step when it fails. The decorator takes an argument, times, which accepts an integer in [0, 4]. It is intended to handle transient failures and is particularly useful when running tasks in the cloud, where machine failures are more common.
You can also apply it from the command line with python flow.py run --with retry. By default, this retries any failing step that has no @retry decorator of its own up to three times.
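To make "retry up to times additional attempts" concrete, here is a plain-Python sketch of the idea. It is a conceptual model only, not Metaflow's implementation: run_with_retries and make_flaky_task are hypothetical names introduced for illustration, and the sketch ignores details such as waiting between attempts.

```python
def run_with_retries(task, times=3):
    """Call task(); on failure, retry up to `times` more times.

    Returns (result, attempts). Hypothetical helper, not a Metaflow API.
    """
    attempts = 0
    while True:
        attempts += 1
        try:
            return task(), attempts
        except Exception:
            if attempts > times:
                raise  # retries exhausted: let the failure propagate


def make_flaky_task(failures_before_success):
    """Build a task that raises a transient error a fixed number of times."""
    state = {"calls": 0}

    def task():
        state["calls"] += 1
        if state["calls"] <= failures_before_success:
            raise ConnectionError("transient failure")
        return "ok"

    return task


# Two transient failures, then success on the third attempt.
result, attempts = run_with_retries(make_flaky_task(2), times=3)
print(result, attempts)  # prints: ok 3
```

A failure that is not transient, such as a division by zero, fails again on every attempt, which is the case @catch is meant for.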
Similarly, the @catch decorator catches exceptions raised in the task. Unlike @retry, @catch is intended for cases where you want the flow to continue after an exception. It takes an optional argument, var, which names a flow artifact in which the exception is saved so that you can access it later.
When using @catch, design the steps that follow the decorated step to tolerate an exception in that step.
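The pattern of recording an exception and letting later steps check for it can be sketched in plain Python. This is a conceptual model, not Metaflow's implementation: run_with_catch and the record keys are hypothetical, and the sketch assumes the saved value is None when no exception occurred.

```python
def run_with_catch(task, *args):
    """Run task; return a record holding either the result or the exception.

    Hypothetical helper, not a Metaflow API.
    """
    record = {"res": None, "fail": None}
    try:
        record["res"] = task(*args)
    except Exception as exc:
        record["fail"] = exc  # save the exception instead of stopping
    return record


def divide(divisor):
    return 10 / divisor


# One record per divisor; the zero divisor raises ZeroDivisionError.
records = [run_with_catch(divide, d) for d in [0, 1, 2]]

# The downstream code tolerates failures by keeping only successful records.
results = [r["res"] for r in records if r["fail"] is None]
print(results)  # prints: [10.0, 5.0]
```

The downstream filter is the part that "tolerates" the failure: it never reads a result from a record whose exception slot is set.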
2 Run Flow
This flow shows how to:
- Create a foreach branch in start that creates three divide tasks.
- Use @retry to rerun divide when the step code raises an exception.
- Save the exception using @catch.
- In the join task, use the saved exception to store results only if the parent divide task succeeded.
from metaflow import FlowSpec, step, retry, catch

class CatchRetryFlow(FlowSpec):

    @step
    def start(self):
        # Fan out into one divide task per divisor; 0 will fail.
        self.divisors = [0, 1, 2]
        self.next(self.divide, foreach='divisors')

    # Retry once; if the task still fails, save the exception
    # in the divide_fail artifact and let the flow continue.
    @catch(var='divide_fail')
    @retry(times=1)
    @step
    def divide(self):
        self.res = 10 / self.input
        self.next(self.join)

    @step
    def join(self, inputs):
        # Keep results only from divide tasks that did not fail.
        self.results = [i.res
                        for i in inputs
                        if not i.divide_fail]
        print('results', self.results)
        self.next(self.end)

    @step
    def end(self):
        print('done!')

if __name__ == '__main__':
    CatchRetryFlow()
python handle_failed_task.py run
Further Reading
- Debugging flows with resume
- Dealing with failures in Metaflow
- More examples in the Effective Data Science Infrastructure book