Track Artifacts with CometML
Question
How can I track artifacts of my flows with Comet ML?
Solution
You can track flow artifacts using any Comet ML calls you already use because you can use any Python code in Metaflow steps. In addition, the Comet ML team developed an integration with Metaflow to make tracking artifacts produced in flow runs even more convenient.
The remainder of this page will walk through the following topics:
- What is Comet ML?
- How do you write a flow using the Comet integration?
- How do you run a flow that tracks experiments with Comet?
1 What is Comet ML?
Comet ML is a platform to track, compare, explain, and optimize machine learning experiments and models.
The comet_ml Python library lets you read and write data about the configuration and results of your data science experiments.
After you sign up, you can use your Comet API key to create an Experiment in Python code or through Comet's APIs.

    import comet_ml
    import os

    experiment = comet_ml.Experiment(
        # read env var set like `export COMET_API_KEY=<>`
        api_key=os.getenv('COMET_API_KEY'),
        # read env var set like `export COMET_PROJECT_NAME=<>`
        project_name=os.getenv('COMET_PROJECT_NAME')
    )

Experiments are the core data structure Comet uses to organize information. You can read more about Experiments in the Comet documentation.
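The Experiment snippet above passes the result of os.getenv straight through, so an unset variable is handed over as None. A minimal sketch of a stricter lookup, assuming you would rather fail early with a clear message; comet_settings_from_env and create_experiment are hypothetical helper names, not part of comet_ml:

```python
import os


def comet_settings_from_env(env=None):
    """Return the Comet settings used on this page, or raise if one is unset."""
    env = os.environ if env is None else env
    # Keyword argument name -> environment variable name.
    names = {"api_key": "COMET_API_KEY", "project_name": "COMET_PROJECT_NAME"}
    missing = [var for var in names.values() if not env.get(var)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {key: env[var] for key, var in names.items()}


def create_experiment(env=None):
    """Create a Comet Experiment from the validated settings (hypothetical helper)."""
    # Imported lazily so the settings check can run without comet_ml installed.
    import comet_ml

    return comet_ml.Experiment(**comet_settings_from_env(env))
```

Calling create_experiment() would then behave like the snippet above when both variables are set, and raise a RuntimeError naming the missing variable otherwise.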
The rest of this page shows how to use Comet's Metaflow integration to automate the creation and reporting of data to Comet Experiment objects.
2 Write a Flow using the Comet Integration
The script shows how to:
- Log in to Comet before running the script.
  - The init() call in the main section of this script establishes a connection to Comet. It tries to read the value of the COMET_API_KEY environment variable if you have it set. You can read more about configuring Comet in a Python environment in the Comet documentation.
- Create a set of Comet Experiment objects to track both the individual tasks and the state of the flow as a whole.
- Log parameters and metrics with Comet from the flow runtime.
  - Observe the train_model step and notice that self.comet_experiment is accessible automatically because of the @comet_flow decorator.

    from comet_ml import init
    from comet_ml.integration.metaflow import comet_flow
    from metaflow import FlowSpec, step


    @comet_flow(project_name="comet-metaflow")
    class CometFlow(FlowSpec):

        @step
        def start(self):
            import plotly.express as px
            from sklearn.model_selection import train_test_split

            # Load the tips dataset and use total_bill as the single feature.
            self.input_df = px.data.tips()
            self.X = self.input_df.total_bill.values[:, None]
            self.X_train, self.X_test, \
                self.Y_train, self.Y_test = train_test_split(
                    self.X, self.input_df.tip, random_state=42
                )
            self.next(self.train_model)

        @step
        def train_model(self):
            from sklearn import linear_model

            self.model = linear_model.LinearRegression()
            self.model.fit(self.X_train, self.Y_train)
            self.score = self.model.score(self.X_test, self.Y_test)

            # self.comet_experiment is made available by the @comet_flow decorator.
            self.comet_experiment.log_parameter("model", self.model)
            self.comet_experiment.log_metric("score", self.score)
            self.next(self.end)

        @step
        def end(self):
            pass


    if __name__ == "__main__":
        # Connect to Comet; reads COMET_API_KEY if it is set.
        init()
        CometFlow()

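The train_model step logs the estimator object itself as the model parameter. A sketch of one alternative, assuming you would rather record each hyperparameter separately: log_model_summary is a hypothetical helper, not part of comet_ml or Metaflow. It relies only on the Experiment methods log_parameters, log_parameter, and log_metric, and on the get_params() method that scikit-learn estimators provide.

```python
def log_model_summary(experiment, model, score):
    """Log a fitted estimator's hyperparameters and its test score.

    `experiment` is any object with Comet's log_parameters, log_parameter,
    and log_metric methods, such as self.comet_experiment inside a step.
    """
    # scikit-learn estimators return their hyperparameters as a dict.
    params = model.get_params()
    experiment.log_parameters(params)
    # Record the estimator class name instead of the object itself.
    experiment.log_parameter("model_type", type(model).__name__)
    experiment.log_metric("score", score)
    return params
```

Inside train_model this could replace the two logging calls with log_model_summary(self.comet_experiment, self.model, self.score).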
3 Run the Flow
Now that you have configured Comet to track Experiments for this flow, you can run it from the command line in the normal Metaflow way.

    python track_with_comet_integration.py run
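Once a run has finished, the score that was sent to Comet is also stored as a Metaflow artifact, so the two can be compared. A minimal sketch using the Metaflow Client API, assuming the flow keeps the name CometFlow; latest_score is a hypothetical helper, and its flow_cls argument exists only so the lookup can be stubbed without a Metaflow installation:

```python
def latest_score(flow_name="CometFlow", flow_cls=None):
    """Return the score artifact of the latest successful run, or None."""
    if flow_cls is None:
        # Imported lazily; by default this is Metaflow's client-side Flow class.
        from metaflow import Flow as flow_cls
    run = flow_cls(flow_name).latest_successful_run
    # latest_successful_run is None when no run has completed successfully.
    return None if run is None else run.data.score
```

Calling print(latest_score()) from a Python session after the run above should show the same value that train_model logged as the score metric. The Metaflow client raises an error if no flow with the given name exists yet.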