# TrueFoundry Workflows

Builds and deploys data processing and ML training pipelines using TrueFoundry Workflows (built on Flyte). Use when creating DAGs, orchestrating multi-step tasks, scheduling ETL pipelines, or running ML training workflows.
Install the skill:

```shell
npx skill4agent add truefoundry/tfy-deploy-skills truefoundry-workflows
```

Routing note: for ambiguous user intents, use the shared clarification templates in `references/intent-clarification.md`.
## When to Use

- Define pipelines with the `@task` and `@workflow` decorators; deploy with `tfy deploy workflow` or `tfy apply`.
- Related skills: `deploy`, `applications`, `jobs`, `llm-deploy`.

## Prerequisites

- `TFY_BASE_URL` and `TFY_API_KEY` set (for example via `.env`).
- `TFY_WORKSPACE_FQN` for the target workspace.
- `pip install "truefoundry[workflow]"` and `tfy login --host "$TFY_BASE_URL"`.
- See `references/prerequisites.md` for details.

Check the CLI with `tfy --version`:

| CLI Output | Status | Action |
|---|---|---|
| Version matches latest | Current | Use as-is |
| Older version | Outdated | Upgrade: install a pinned version (e.g. `pip install 'truefoundry==0.5.0'`) |
| Command not found | Not installed | Install: `pip install "truefoundry[workflow]"` |
| No pip/Python available | CLI unavailable | Fallback: use the REST API at `TFY_BASE_URL` |
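As a pre-flight step, a small shell sketch (the variable name and message are illustrative, not part of the `tfy` CLI) can report whether the CLI is on the PATH before attempting a deploy:

```shell
# Illustrative pre-flight check: is the tfy CLI installed?
if command -v tfy >/dev/null 2>&1; then
  status="installed"
else
  status="missing"
fi
echo "tfy CLI: $status"
```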
## Quick Start

Define each step with `@task` and compose them into a DAG with `@workflow`:

```python
from truefoundry.workflow import (
    PythonTaskConfig,
    TaskPythonBuild,
    task,
    workflow,
)
from truefoundry.deploy import Resources

# Define task configuration
cpu_task_config = PythonTaskConfig(
    image=TaskPythonBuild(
        python_version="3.9",
        pip_packages=["truefoundry[workflow]"],
    ),
    resources=Resources(cpu_request=0.5, memory_request=500),
)

@task(task_config=cpu_task_config)
def fetch_data(source: str) -> dict:
    """Fetch and return raw data."""
    return {"records": 1000, "source": source}

@task(task_config=cpu_task_config)
def process_data(raw_data: dict) -> dict:
    """Clean and transform the data."""
    return {
        "records": raw_data["records"],
        "source": raw_data["source"],
        "status": "processed",
    }

@task(task_config=cpu_task_config)
def train_model(data: dict) -> str:
    """Train a model on processed data."""
    return f"model_trained_on_{data['records']}_records"

@workflow
def ml_pipeline(source: str = "default") -> str:
    """End-to-end ML pipeline."""
    raw = fetch_data(source=source)
    processed = process_data(raw_data=raw)
    return train_model(data=processed)
```
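Because `@task` bodies are plain Python functions, the pipeline logic can be sanity-checked locally before deploying. This sketch mirrors the task bodies above as undecorated functions; it exercises only the Python logic, not TrueFoundry itself:

```python
# Plain-Python mirror of the task bodies above, for a quick local check.
def fetch_data(source: str) -> dict:
    return {"records": 1000, "source": source}

def process_data(raw_data: dict) -> dict:
    return {
        "records": raw_data["records"],
        "source": raw_data["source"],
        "status": "processed",
    }

def train_model(data: dict) -> str:
    return f"model_trained_on_{data['records']}_records"

result = train_model(process_data(fetch_data("default")))
print(result)  # model_trained_on_1000_records
```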
## Task Configuration

Configure per-task images and resources with `PythonTaskConfig`:

```python
from truefoundry.workflow import PythonTaskConfig, TaskPythonBuild
from truefoundry.deploy import GPUDevice, Resources  # GPUDevice import path assumed

# CPU task
cpu_task_config = PythonTaskConfig(
    image=TaskPythonBuild(
        python_version="3.9",
        pip_packages=[
            "truefoundry[workflow]",
            "pandas==2.1.0",
            "numpy",
        ],
        # Or use a requirements file:
        # requirements_path="requirements.txt",
    ),
    resources=Resources(
        cpu_request=0.5,
        memory_request=500,
    ),
)

# GPU task (for training or inference steps)
gpu_task_config = PythonTaskConfig(
    image=TaskPythonBuild(
        python_version="3.9",
        pip_packages=[
            "truefoundry[workflow]",
            "torch",
            "transformers",
        ],
    ),
    resources=Resources(
        cpu_request=2.0,
        cpu_limit=4.0,
        memory_request=8192,
        memory_limit=16384,
        devices=[
            GPUDevice(name="T4", count=1),
        ],
    ),
)
```

Always include `truefoundry[workflow]` in `pip_packages` (or in the file passed as `requirements_path`). `memory_request` and `memory_limit` are in MB.

## Container Tasks

Run a prebuilt image with `ContainerTask`:

```python
from truefoundry.workflow import ContainerTask
from truefoundry.deploy import Resources

container_task = ContainerTask(
    name="my-container-task",
    image="my-registry/my-image:latest",
    command=["python", "run.py"],
    resources=Resources(
        cpu_request=1.0,
        memory_request=2048,
    ),
)
```

Security: verify container image sources before using them in workflow tasks. Pin image tags to specific versions rather than `:latest`, and pin versions in `pip_packages`, to avoid supply-chain risks from unvetted upstream changes.
## Scheduling Workflows

Attach a cron schedule through `ExecutionConfig`:

```python
from truefoundry.workflow import workflow, ExecutionConfig

@workflow(
    execution_configs=[
        ExecutionConfig(schedule="0 6 * * *"),  # Every day at 6:00 AM UTC
    ]
)
def daily_etl_pipeline() -> str:
    raw = fetch_data(source="production_db")
    processed = process_data(raw_data=raw)
    return processed["status"]
```

| Schedule | Cron Expression | Description |
|---|---|---|
| Every 10 minutes | `*/10 * * * *` | Frequent data sync |
| Every hour | `0 * * * *` | Hourly aggregation |
| Daily at midnight UTC | `0 0 * * *` | Nightly batch jobs |
| Daily at 6 AM UTC | `0 6 * * *` | Morning data refresh |
| Every Monday at 9 AM UTC | `0 9 * * 1` | Weekly reports |
| First of month at midnight | `0 0 1 * *` | Monthly processing |
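Schedules run in UTC, so it helps to compute the UTC hour for a desired local run time before writing the expression. A standard-library sketch (the UTC-4 offset is an illustrative example; substitute your own timezone):

```python
from datetime import datetime, timedelta, timezone

# Example: a pipeline that should run daily at 6 AM in a UTC-4 timezone (e.g. EDT).
local_tz = timezone(timedelta(hours=-4))  # illustrative fixed offset
local_run = datetime(2024, 6, 1, 6, 0, tzinfo=local_tz)

utc_hour = local_run.astimezone(timezone.utc).hour
print(f"0 {utc_hour} * * *")  # -> 0 10 * * *
```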
Cron fields are `minute hour day-of-month month day-of-week`.

## Deploying a Workflow

Deploy a file containing `@task`/`@workflow` definitions with `tfy deploy workflow`:

```shell
tfy deploy workflow \
  --name my-ml-pipeline \
  --file workflow.py \
  --workspace_fqn "$TFY_WORKSPACE_FQN"
```
Or apply a manifest with `tfy apply`:

```yaml
# workflow-manifest.yaml
name: my-ml-pipeline
type: workflow
workflow_file: workflow.py
workspace_fqn: "YOUR_WORKSPACE_FQN"
```

Preview the change, then apply:

```shell
tfy apply -f workflow-manifest.yaml --dry-run --show-diff
tfy apply -f workflow-manifest.yaml
```
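Before running `tfy apply`, a quick pre-flight check for the required manifest fields can save a failed round-trip. A minimal sketch, where a plain dict stands in for the parsed YAML (field names match the manifest above):

```python
# Pre-flight: verify required workflow-manifest fields before `tfy apply`.
# A plain dict stands in here for the parsed YAML manifest.
REQUIRED_FIELDS = ("name", "type", "workflow_file", "workspace_fqn")

manifest = {
    "name": "my-ml-pipeline",
    "type": "workflow",
    "workflow_file": "workflow.py",
    "workspace_fqn": "YOUR_WORKSPACE_FQN",
}

missing = [field for field in REQUIRED_FIELDS if not manifest.get(field)]
if missing:
    raise SystemExit(f"manifest missing fields: {missing}")
print("manifest ok")
```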
Or deploy from Python:

```python
from truefoundry.workflow import WorkflowDeployment

deployment = WorkflowDeployment(
    name="my-ml-pipeline",
    workflow_file="workflow.py",
    workspace_fqn="your-workspace-fqn",
)
deployment.deploy()
```

Notes:

- `TFY_WORKSPACE_FQN` identifies the target workspace; do not guess it.
- The deployed environment must include `truefoundry[workflow]` for `@workflow`/`@task` code to run.
- List deployed workflows via the `applications` skill, e.g. `tfy_applications_list(filters={"application_type": "workflow"})`; one-off runs belong to the `jobs` skill.

## Run States

| State | Meaning |
|---|---|
| QUEUED | Run is waiting to be scheduled |
| RUNNING | Tasks are actively executing |
| SUCCEEDED | All tasks completed successfully |
| FAILED | One or more tasks failed |
| TIMED_OUT | Run exceeded its timeout |
| ABORTED | Run was manually cancelled |
## Command and Skill Reference

- Deploy: `tfy deploy workflow`, `tfy apply`
- Inspect: `applications` (filter by `application_type: "workflow"`), `status`, `logs`
- Related skills: `workspaces`, `jobs`, `secrets`

## Troubleshooting

### `tfy: command not found`
Install the TrueFoundry CLI:

```shell
pip install 'truefoundry==0.5.0'
tfy login --host "$TFY_BASE_URL"
```

### Manifest validation failed
Check:

- YAML syntax is valid
- Required fields are present: `name`, `type`, `workflow_file`, `workspace_fqn`
- The workflow file path is correct and accessible

### Missing `truefoundry[workflow]`

The `truefoundry[workflow]` package is required for defining tasks and workflows:
```shell
pip install "truefoundry[workflow]"
```

### `TFY_WORKSPACE_FQN` is not set

`TFY_WORKSPACE_FQN` is required. Get it from:
- TrueFoundry dashboard -> Workspaces
- Or use the `workspaces` skill to list available workspaces
Do not auto-pick a workspace.

### Flyte data plane is not installed on this cluster

The control plane ships with TrueFoundry, but the data plane must be installed on each cluster separately. Contact your platform admin to set up the Flyte data plane components on the target cluster.

### Error: Workflow function contains non-task code
The `@workflow` function must only contain task calls and control flow. Move all computation into `@task`-decorated functions.

Bad:

```python
@workflow
def my_wf():
    data = pd.read_csv("file.csv")  # NOT allowed in a workflow function
    return process(data)
```

Good:

```python
@workflow
def my_wf():
    data = load_data()  # Call a @task instead
    return process(data)
```

### Task container cannot reach the Flyte backend

Each task's `PythonTaskConfig` must include `"truefoundry[workflow]"` in `pip_packages`; without it, the task container cannot communicate with the Flyte backend. Fix: add `"truefoundry[workflow]"` to the `pip_packages` list in every `PythonTaskConfig`.
Increase memory_limit or cpu_limit in the task's Resources config.
Check the task logs in the TrueFoundry dashboard for details.Cron schedules use UTC timezone. Verify your cron expression accounts for
UTC offset from your local timezone.
Use https://crontab.guru to validate your cron expression.401 Unauthorized — Check TFY_API_KEY is valid
404 Not Found — Check TFY_BASE_URL and API endpoint path
422 Validation Error — Check manifest fields match expected schema