Project utilities API
Start with Project Assets basics before diving into the API reference.
ProjectFlow
Base class for all project flows. Inherit from ProjectFlow instead of FlowSpec:
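from metaflow import step
from obproject import ProjectFlow

class MyFlow(ProjectFlow):
    @step
    def start(self):
        self.data = [1, 2, 3]  # placeholder artifact so "data" exists as self.data
        self.prj.register_data("dataset", "data")
        self.next(self.end)

    @step
    def end(self):
        pass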
Configuration
ProjectFlow reads configuration from two files:
obproject.toml - Project identity and settings:
project = "fraud-detection"
[dev-assets]
branch = "main" # Read assets from main branch during local dev
[dependencies]
include_pyproject_toml = true # Auto-apply pyproject.toml deps (default: true)
| Section | Key | Default | Description |
|---|---|---|---|
| [dev-assets] | branch | - | Branch to read assets from during local development |
| [dependencies] | include_pyproject_toml | true | Auto-apply @pypi_base from pyproject.toml |
pyproject.toml - Python dependencies applied via @pypi_base:
[project]
dependencies = [
"scikit-learn>=1.3.0",
"pandas>=2.0.0",
]
prj Property
self.prj returns a ProjectContext with access to all project utilities. It is initialized lazily on first access.
Attributes:
- prj.project - Project name from config
- prj.branch - Current write branch (from Metaflow @project)
- prj.read_branch - Branch for reading assets (may differ during local dev)
- prj.write_branch - Branch for writing assets
- prj.asset - Low-level Asset client
- prj.evals - Evaluation logger
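Lazy initialization means the context is built the first time self.prj is read and reused afterwards. The sketch below illustrates that pattern with functools.cached_property. FlowStub and ProjectContextStub are stand-ins invented for this example; this is not how obproject is necessarily implemented.

```python
from functools import cached_property


class ProjectContextStub:
    """Stand-in for ProjectContext; counts how often it is constructed."""

    created = 0

    def __init__(self):
        ProjectContextStub.created += 1


class FlowStub:
    """Stand-in for a flow class exposing a lazily built prj attribute."""

    @cached_property
    def prj(self):
        # Runs only on first access; the result is cached on the instance.
        return ProjectContextStub()


flow = FlowStub()
before = ProjectContextStub.created  # nothing has been built yet
first = flow.prj                     # first access builds the context
second = flow.prj                    # later accesses reuse the cached object
```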
Asset Registration
prj.register_data()
prj.register_data(name, artifact, annotations=None, tags=None, description=None)
Register a Metaflow artifact as a data asset.
| Parameter | Type | Description |
|---|---|---|
| name | str | Asset name (e.g., "user_transactions") |
| artifact | str | Artifact name (must exist as self.<artifact>) |
| annotations | dict | Metadata key-value pairs (values converted to strings) |
| tags | dict | Tags for categorization |
| description | str | Human-readable description |
self.features = compute_features(data)
self.prj.register_data("fraud_features", "features",
                       annotations={"n_samples": len(self.features)})
prj.register_external_data()
prj.register_external_data(name, blobs, kind, annotations=None, tags=None, description=None)
Register external data (S3, databases, etc.) as a data asset.
| Parameter | Type | Description |
|---|---|---|
| name | str | Asset name |
| blobs | list | URIs/references (e.g., ["s3://bucket/file.csv"]) |
| kind | str | Data type (e.g., "s3", "database") |
| annotations | dict | Metadata |
| tags | dict | Tags |
| description | str | Description |
self.prj.register_external_data("raw_logs",
                                blobs=["s3://data-lake/logs/2025-01-01/"],
                                kind="s3",
                                annotations={"size_gb": 450})
prj.register_model()
prj.register_model(name, artifact, annotations=None, tags=None, description=None)
Register a Metaflow artifact as a model asset.
| Parameter | Type | Description |
|---|---|---|
| name | str | Asset name (e.g., "fraud_classifier") |
| artifact | str | Artifact name containing the model |
| annotations | dict | Model metadata (accuracy, hyperparameters, etc.) |
| tags | dict | Tags (framework, algorithm, etc.) |
| description | str | Description |
self.model = RandomForestClassifier().fit(X, y)
self.prj.register_model("fraud_classifier", "model",
                        annotations={"accuracy": 0.95, "algorithm": "RandomForest"})
prj.register_external_model()
prj.register_external_model(name, blobs, kind, annotations=None, tags=None, description=None)
Register an external model (HuggingFace, checkpoints, etc.) as a model asset.
| Parameter | Type | Description |
|---|---|---|
| name | str | Asset name |
| blobs | list | URIs/references |
| kind | str | Model type (e.g., "checkpoint", "huggingface") |
| annotations | dict | Metadata |
| tags | dict | Tags |
| description | str | Description |
self.prj.register_external_model("base_llm",
                                 blobs=["meta-llama/Llama-3.1-8B-Instruct"],
                                 kind="huggingface",
                                 annotations={"context_length": 8192})
Asset Consumption
prj.get_data()
prj.get_data(name, instance="latest")
Retrieve artifact data from a data asset registered with register_data().
| Parameter | Type | Description |
|---|---|---|
| name | str | Asset name |
| instance | str | Version: "latest", "latest-N", or "vN" |
Returns: The artifact data
features = self.prj.get_data("fraud_features")
previous = self.prj.get_data("fraud_features", instance="latest-1")
Only works for artifact-based assets. For external data, use prj.asset.consume_data_asset().
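The three instance forms can be told apart with a small parser. The sketch below is a hypothetical helper, not part of obproject, and it assumes that "latest-N" means N versions before the latest one and that "vN" names a specific version number; how obproject resolves these internally is not described here.

```python
import re


def parse_instance(spec: str) -> tuple[str, int]:
    """Classify an instance specifier (hypothetical helper).

    Returns ("latest", N) for "latest" and "latest-N" (N versions back),
    or ("version", N) for "vN".
    """
    if spec == "latest":
        return ("latest", 0)
    match = re.fullmatch(r"latest-(\d+)", spec)
    if match:
        return ("latest", int(match.group(1)))
    match = re.fullmatch(r"v(\d+)", spec)
    if match:
        return ("version", int(match.group(1)))
    raise ValueError(f"unrecognized instance specifier: {spec!r}")
```

Validating the specifier up front like this lets a script fail early on a typo such as "lastest-1" instead of sending it to the asset service.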
prj.get_model()
prj.get_model(name, instance="latest")
Retrieve artifact data from a model asset registered with register_model().
| Parameter | Type | Description |
|---|---|---|
| name | str | Asset name |
| instance | str | Version: "latest", "latest-N", or "vN" |
Returns: The model artifact data
model = self.prj.get_model("fraud_classifier")
previous_model = self.prj.get_model("fraud_classifier", instance="latest-1")
Only works for artifact-based models. For external models (checkpoints, HuggingFace, etc.), use prj.asset.consume_model_asset() and load from the returned blobs.
prj.asset.consume_data_asset()
prj.asset.consume_data_asset(name, instance="latest")
Low-level method returning the full asset reference.
Returns: Asset reference dict:
{
    "id": "v123",
    "created_by": {"entity_id": "FlowName/run_id/step/task"},
    "data_properties": {
        "data_kind": "artifact",
        "annotations": {"key": "value"},
        "blobs": []
    }
}
prj.asset.consume_model_asset()
prj.asset.consume_model_asset(name, instance="latest")
Low-level method for consuming model assets.
Returns: Asset reference dict with model_properties instead of data_properties.
ref = self.prj.asset.consume_model_asset("fraud_classifier")
accuracy = float(ref["model_properties"]["annotations"]["accuracy"])
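Because annotation values are converted to strings at registration time, callers need to cast them back when reading a reference. The following sketch wraps that lookup in a hypothetical get_annotation helper that works for both data and model references. The sample dict mirrors the documented reference shape with illustrative values only.

```python
def get_annotation(ref: dict, key: str, cast=str, default=None):
    """Read one annotation from an asset reference dict (hypothetical helper)."""
    # Data assets carry "data_properties"; model assets carry "model_properties".
    props = ref.get("data_properties") or ref.get("model_properties") or {}
    value = props.get("annotations", {}).get(key)
    return default if value is None else cast(value)


# Sample reference shaped like the documented dict; values are illustrative.
model_ref = {
    "id": "v123",
    "created_by": {"entity_id": "FlowName/run_id/step/task"},
    "model_properties": {"annotations": {"accuracy": "0.95"}, "blobs": []},
}

accuracy = get_annotation(model_ref, "accuracy", cast=float)
missing = get_annotation(model_ref, "f1", cast=float, default=0.0)
```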
prj.asset.list_data_assets()
prj.asset.list_data_assets(tags=None)
List data assets in current project/branch.
| Parameter | Type | Description |
|---|---|---|
| tags | dict | Filter by tags (client-side filtering) |
Returns: {"data": [...]}
prj.asset.list_model_assets()
prj.asset.list_model_assets(tags=None)
List model assets in current project/branch.
Returns: {"models": [...]}
Tag filtering is client-side only. All assets are fetched, then filtered locally.
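Client-side filtering amounts to fetching the full list and keeping the entries whose tags match. The sketch below shows that idea under one assumption that is not confirmed by this reference: each listed asset is a dict with a "tags" mapping. The filter_by_tags helper and the sample entries are hypothetical.

```python
from typing import Optional


def filter_by_tags(assets: list, tags: Optional[dict]) -> list:
    """Keep assets whose tags contain every requested key/value pair."""
    if not tags:
        return list(assets)
    return [
        asset
        for asset in assets
        if all(asset.get("tags", {}).get(k) == v for k, v in tags.items())
    ]


# Illustrative response in the documented {"models": [...]} envelope.
fetched = {
    "models": [
        {"name": "fraud_classifier", "tags": {"framework": "sklearn"}},
        {"name": "base_llm", "tags": {"framework": "transformers"}},
    ]
}
sklearn_models = filter_by_tags(fetched["models"], {"framework": "sklearn"})
```

Since every asset is fetched before filtering, a tag filter narrows the result but does not reduce the size of the underlying request.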
Standalone Asset Usage
Use Asset directly outside flow context (deployments, notebooks, scripts):
from obproject.assets import Asset
asset = Asset(
    project="fraud-detection",
    branch="main",
    read_only=True,  # Required outside flow context
)
ref = asset.consume_model_asset("fraud_classifier")
| Parameter | Type | Description |
|---|---|---|
| project | str | Project name |
| branch | str | Branch name |
| read_only | bool | Set True outside flows (skips entity tracking) |
When read_only=True:
- Registration methods are no-ops
- Consume methods use GET (no lineage tracking) instead of PUT
Event Publishing
prj.publish_event()
prj.publish_event(name, payload=None)
Publish an event to trigger other flows.
| Parameter | Type | Description |
|---|---|---|
| name | str | Event name (e.g., "retrain_model") |
| payload | dict | JSON-serializable payload |
Events are namespaced as prj.{project}.{branch}.{name}.
self.prj.publish_event("model_trained", payload={"accuracy": 0.95})
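The namespacing rule can be written out as a one-line formatter, which is handy when inspecting event names in logs. The full_event_name function below is a hypothetical helper. It passes the branch through unchanged, because this reference does not say whether obproject normalizes the branch before building the name.

```python
def full_event_name(project: str, branch: str, name: str) -> str:
    """Build the namespaced event name prj.{project}.{branch}.{name}."""
    return f"prj.{project}.{branch}.{name}"


event = full_event_name("fraud-detection", "main", "model_trained")
```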
prj.safe_publish_event()
prj.safe_publish_event(name, payload=None)
Same as publish_event() but failures don't raise exceptions.
@project_trigger
Subscribe a flow to project events:
from metaflow import step
from obproject import ProjectFlow, project_trigger

@project_trigger(event="model_trained")
class EvaluationFlow(ProjectFlow):
    @step
    def start(self):
        # Triggered when the "model_trained" event is published
        self.next(self.end)

    @step
    def end(self):
        pass
The decorator resolves the full event name from project config.
Evaluation Logging
prj.evals.log()
prj.evals.log(message)
Log structured evaluation data with project/branch/run metadata.
| Parameter | Type | Description |
|---|---|---|
| message | dict or str | Evaluation data |
self.prj.evals.log({
    "model": "fraud_classifier",
    "accuracy": 0.95,
    "test_samples": 1000
})
The output is prefixed with a magic string that the monitoring system uses to identify and ingest evaluation records.
outerbounds flowproject
The outerbounds flowproject subcommands manage deployed project resources — workflow templates, assets, apps, and metadata. These are the same primitives that obproject-deploy creates during CI/CD.
These commands require a configured Metaflow profile with access to the Outerbounds API. They read credentials from your ~/.metaflowconfig directory.
Common options
All outerbounds flowproject subcommands accept:
| Option | Default | Description |
|---|---|---|
| -d, --config-dir | ~/.metaflowconfig | Path to Metaflow configuration directory |
| -p, --profile | $METAFLOW_PROFILE | Named Metaflow profile to use |
Identifying a project branch
Several commands require --id in the format project/branch:
outerbounds flowproject list-templates --id my_project/main
outerbounds flowproject teardown-branch --id my_project/feature-v2
Branch names are normalized to match how obproject-deploy stores them: - and / characters are replaced with _, and the result is lowercased. So --id my_project/feature-v2 resolves to branch feature_v2.
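The normalization rule is simple enough to reproduce when you need to predict the stored branch name in a script. This is a minimal sketch of the rule as described above, not the CLI's actual code. The split_id helper is hypothetical and assumes the identifier is split at the first "/", so any further slashes belong to the branch.

```python
def normalize_branch(branch: str) -> str:
    """Replace '-' and '/' with '_' and lowercase the result."""
    return branch.replace("-", "_").replace("/", "_").lower()


def split_id(identifier: str) -> tuple[str, str]:
    """Split a project/branch identifier and normalize the branch part."""
    # Assumption: the first "/" separates project from branch.
    project, sep, branch = identifier.partition("/")
    if not sep or not project or not branch:
        raise ValueError(f"expected project/branch, got {identifier!r}")
    return project, normalize_branch(branch)


project, branch = split_id("my_project/feature-v2")
```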
get-metadata
Fetch the latest flowproject metadata for a project/branch.
outerbounds flowproject get-metadata --id <project/branch>
Returns the JSON metadata document that obproject-deploy registered, including workflow definitions, asset references, and app configurations.
# View metadata for production branch
outerbounds flowproject get-metadata --id fraud_detection/main
# Pretty-print with jq
outerbounds flowproject get-metadata --id fraud_detection/main | jq .
set-metadata
Register or update flowproject metadata for a project/branch.
outerbounds flowproject set-metadata '<json_string>'
| Argument | Description |
|---|---|
| json_str | JSON string containing the flowproject metadata payload |
outerbounds flowproject set-metadata '{"project": "fraud_detection", "branch": "main", "workflows": [...]}'
This is a low-level command used by deployment tooling. Prefer obproject-deploy for standard deployments.
list-templates
List Argo workflow templates deployed for a project/branch.
outerbounds flowproject list-templates --id <project/branch> [-o json]
| Option | Description |
|---|---|
| --id | Required. project/branch identifier |
| -o, --output | Output format: json or human-readable (default) |
Templates are discovered by querying Argo directly and matching on metaflow/project_name and metaflow/branch_name annotations.
# Human-readable output
outerbounds flowproject list-templates --id fraud_detection/main
# Machine-readable
outerbounds flowproject list-templates --id fraud_detection/main -o json
# → {"templates": ["frauddetection.prod.trainflow", "frauddetection.prod.scoreflow"]}
delete-metadata
Delete all flowproject metadata for a project/branch.
outerbounds flowproject delete-metadata --id <project/branch> [--yes] [-o json]
| Option | Description |
|---|---|
| --id | Required. project/branch identifier |
| --yes | Skip confirmation prompt |
| -o, --output | Output format: json or human-readable (default) |
outerbounds flowproject delete-metadata --id fraud_detection/feature-v2 --yes
This removes the metadata record only. It does not delete workflow templates, assets, or apps. Use teardown-branch to remove all resources.
teardown-branch
Delete all deployed resources for a project/branch in a single operation.
outerbounds flowproject teardown-branch --id <project/branch> [--dry-run] [--yes] [-o json]
| Option | Description |
|---|---|
| --id | Required. project/branch identifier |
| --dry-run | Discover and list resources without deleting anything |
| --yes | Skip confirmation prompt |
| -o, --output | Output format: json or human-readable (default) |
Teardown discovers and deletes these resource types in order:
- Workflow templates — Argo templates matching the project/branch annotations. Deleting a template cascades to its associated CronWorkflows and Sensors.
- Data assets — As listed in the flowproject metadata.
- Model assets — As listed in the flowproject metadata.
- Apps — Capsules tagged with the project and branch.
- Flowproject metadata — The metadata record itself.
# Preview what would be deleted
outerbounds flowproject teardown-branch --id fraud_detection/feature-v2 --dry-run
# Execute teardown
outerbounds flowproject teardown-branch --id fraud_detection/feature-v2 --yes
# JSON output for scripting
outerbounds flowproject teardown-branch --id fraud_detection/feature-v2 --yes -o json
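When teardown is driven from a script, it can help to build the argument list in one place so that the preview and the real run differ only by flags. The sketch below assembles the command from the options documented above. The teardown_argv helper is hypothetical and only constructs the arguments; it does not run the CLI.

```python
import shlex


def teardown_argv(identifier: str, *, dry_run: bool = False,
                  yes: bool = False, json_output: bool = False) -> list:
    """Build the argv for `outerbounds flowproject teardown-branch`."""
    argv = ["outerbounds", "flowproject", "teardown-branch", "--id", identifier]
    if dry_run:
        argv.append("--dry-run")  # list resources without deleting anything
    if yes:
        argv.append("--yes")      # skip the confirmation prompt
    if json_output:
        argv += ["-o", "json"]    # machine-readable output
    return argv


preview = teardown_argv("fraud_detection/feature-v2", dry_run=True)
execute = teardown_argv("fraud_detection/feature-v2", yes=True, json_output=True)
preview_command = shlex.join(preview)  # shell-quoted string for logging
```

Either list can then be passed to subprocess.run(argv, check=True), running the preview first and the real teardown only after its output has been reviewed.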
See Also
- Asset branch resolution - How read/write branches are determined across deployment contexts
- Project Assets basics - Introduction to assets
- Project Structure - Project file organization
- CI/CD integration - Setting up obproject-deploy with GitHub Actions, GitLab, and more
- Project lifecycle - Understanding what deploy creates and how to tear it down