deploying-airflow

# Deploying Airflow

This skill covers deploying Airflow DAGs and projects to production, whether using Astro (Astronomer's managed platform) or open-source Airflow on Docker Compose or Kubernetes.

Choosing a path: Astro is a good fit for managed operations and faster CI/CD. For open-source, use Docker Compose for development and the Helm chart on Kubernetes for production.
## Astro (Astronomer)

Astro provides CLI commands and GitHub integration for deploying Airflow projects.
### Deploy Commands

| Command | What It Does |
|---|---|
| `astro deploy` | Full project deploy — builds Docker image and deploys DAGs |
| `astro deploy --dags` | DAG-only deploy — pushes only DAG files (fast, no image build) |
| `astro deploy --image` | Image-only deploy — pushes only the Docker image (for multi-repo CI/CD) |
| `astro deploy --dbt` | dbt project deploy — deploys a dbt project to run alongside Airflow |
### Full Project Deploy

Builds a Docker image from your Astro project and deploys everything (DAGs, plugins, requirements, packages):

```bash
astro deploy
```

Use this when you've changed `requirements.txt`, `Dockerfile`, `packages.txt`, plugins, or any non-DAG file.

### DAG-Only Deploy

Pushes only files in the `dags/` directory without rebuilding the Docker image:

```bash
astro deploy --dags
```

This is significantly faster than a full deploy since it skips the image build. Use this when you've only changed DAG files and haven't modified dependencies or configuration.
### Image-Only Deploy

Pushes only the Docker image without updating DAGs:

```bash
astro deploy --image
```

This is useful in multi-repo setups where DAGs are deployed separately from the image, or in CI/CD pipelines that manage image and DAG deploys independently.
### dbt Project Deploy

Deploys a dbt project to run with Cosmos on an Astro deployment:

```bash
astro deploy --dbt
```

### GitHub Integration

Astro supports branch-to-deployment mapping for automated deploys:

- Map branches to specific deployments (e.g., `main` -> production, `develop` -> staging)
- Pushes to mapped branches trigger automatic deploys
- Supports DAG-only deploys on merge for faster iteration

Configure this in the Astro UI under Deployment Settings > CI/CD.
### CI/CD Patterns

Common CI/CD strategies on Astro:

- DAG-only on feature branches: Use `astro deploy --dags` for fast iteration during development
- Full deploy on main: Use `astro deploy` on merge to main for production releases
- Separate image and DAG pipelines: Use `--image` and `--dags` in separate CI jobs for independent release cycles
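As a rough sketch, the full-deploy-on-main pattern could look like this in a GitHub Actions workflow. The workflow layout, the secret name `ASTRO_API_TOKEN`, and the deployment ID placeholder are assumptions; check Astronomer's CI/CD documentation for the exact setup for your workspace:

```yaml
# .github/workflows/deploy.yml -- hypothetical example, not from the source
name: Deploy to Astro
on:
  push:
    branches: [main]
jobs:
  deploy:
    runs-on: ubuntu-latest
    env:
      ASTRO_API_TOKEN: ${{ secrets.ASTRO_API_TOKEN }}  # assumed secret name
    steps:
      - uses: actions/checkout@v4
      - name: Install Astro CLI
        run: curl -sSL https://install.astronomer.io | sudo bash -s
      - name: Full project deploy
        run: astro deploy <your-deployment-id>
```

A feature-branch variant would swap the final step for `astro deploy --dags` against a staging deployment.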
### Deploy Queue

When multiple deploys are triggered in quick succession, Astro processes them sequentially in a deploy queue. Each deploy completes before the next one starts.

### Reference
## Open-Source: Docker Compose

Deploy Airflow using the official Docker Compose setup. This is recommended for learning and exploration — for production, use Kubernetes with the Helm chart (see below).
### Prerequisites

- Docker and Docker Compose v2.14.0+
- The official `apache/airflow` Docker image
### Quick Start

Download the official Airflow 3 Docker Compose file:

```bash
curl -LfO 'https://airflow.apache.org/docs/apache-airflow/stable/docker-compose.yaml'
```

This sets up the full Airflow 3 architecture:

| Service | Purpose |
|---|---|
| `airflow-apiserver` | REST API and UI (port 8080) |
| `airflow-scheduler` | Schedules DAG runs |
| `airflow-dag-processor` | Parses and processes DAG files |
| `airflow-worker` | Executes tasks (CeleryExecutor) |
| `airflow-triggerer` | Handles deferrable/async tasks |
| `postgres` | Metadata database |
| `redis` | Celery message broker |
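With the file downloaded, the official quickstart sequence is roughly as follows (a sketch based on the Airflow docker-compose docs; on Linux, setting `AIRFLOW_UID` avoids permission problems on the mounted folders):

```bash
mkdir -p ./dags ./logs ./plugins ./config
echo "AIRFLOW_UID=$(id -u)" > .env   # Linux only: match container user to host user
docker compose up airflow-init       # initialize the metadata DB and create the default user
docker compose up -d                 # start all services
```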
### Minimal Setup

For a simpler setup with LocalExecutor (no Celery/Redis), create a `docker-compose.yaml`:

```yaml
x-airflow-common: &airflow-common
  image: apache/airflow:3  # Use the latest Airflow 3.x release
  environment: &airflow-common-env
    AIRFLOW__CORE__EXECUTOR: LocalExecutor
    AIRFLOW__DATABASE__SQL_ALCHEMY_CONN: postgresql+psycopg2://airflow:airflow@postgres/airflow
    AIRFLOW__CORE__LOAD_EXAMPLES: 'false'
    AIRFLOW__CORE__DAGS_FOLDER: /opt/airflow/dags
  volumes:
    - ./dags:/opt/airflow/dags
    - ./logs:/opt/airflow/logs
    - ./plugins:/opt/airflow/plugins
  depends_on:
    postgres:
      condition: service_healthy

services:
  postgres:
    image: postgres:16
    environment:
      POSTGRES_USER: airflow
      POSTGRES_PASSWORD: airflow
      POSTGRES_DB: airflow
    volumes:
      - postgres-db-volume:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD", "pg_isready", "-U", "airflow"]
      interval: 10s
      retries: 5
      start_period: 5s

  airflow-init:
    <<: *airflow-common
    entrypoint: /bin/bash
    command:
      - -c
      - |
        airflow db migrate
        airflow users create \
          --username admin \
          --firstname Admin \
          --lastname User \
          --role Admin \
          --email admin@example.com \
          --password admin
    depends_on:
      postgres:
        condition: service_healthy

  airflow-apiserver:
    <<: *airflow-common
    command: airflow api-server
    ports:
      - "8080:8080"
    healthcheck:
      test: ["CMD", "curl", "--fail", "http://localhost:8080/health"]
      interval: 30s
      timeout: 10s
      retries: 5
      start_period: 30s

  airflow-scheduler:
    <<: *airflow-common
    command: airflow scheduler

  airflow-dag-processor:
    <<: *airflow-common
    command: airflow dag-processor

  airflow-triggerer:
    <<: *airflow-common
    command: airflow triggerer

volumes:
  postgres-db-volume:
```

Airflow 3 architecture note: The webserver has been replaced by the API server (`airflow api-server`), and the DAG processor now runs as a standalone process separate from the scheduler.
### Common Operations

```bash
# Start all services
docker compose up -d

# Stop all services
docker compose down

# View scheduler logs
docker compose logs -f airflow-scheduler

# Restart after a requirements change
docker compose down && docker compose up -d --build

# Run a one-off Airflow CLI command
docker compose exec airflow-apiserver airflow dags list
```
### Installing Python Packages

Add packages to `requirements.txt`, then rebuild:

```bash
docker compose down
docker compose up -d --build
```

Or use a custom Dockerfile:

```dockerfile
# Pin to a specific version (e.g., 3.1.7) for reproducibility
FROM apache/airflow:3
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
```

Update `docker-compose.yaml` to build from the Dockerfile:

```yaml
x-airflow-common: &airflow-common
  build:
    context: .
    dockerfile: Dockerfile
  # ... rest of config
```

### Environment Variables
Configure Airflow settings via environment variables in `docker-compose.yaml`:

```yaml
environment:
  # Core settings
  AIRFLOW__CORE__EXECUTOR: LocalExecutor
  AIRFLOW__CORE__PARALLELISM: 32
  AIRFLOW__CORE__MAX_ACTIVE_TASKS_PER_DAG: 16
  # Email
  AIRFLOW__EMAIL__EMAIL_BACKEND: airflow.utils.email.send_email_smtp
  AIRFLOW__SMTP__SMTP_HOST: smtp.example.com
  # Connections (as URI)
  AIRFLOW_CONN_MY_DB: postgresql://user:pass@host:5432/db
```

## Open-Source: Kubernetes (Helm Chart)
Deploy Airflow on Kubernetes using the official Apache Airflow Helm chart.

### Prerequisites

- A Kubernetes cluster
- `kubectl` configured
- `helm` installed
### Installation

Add the Airflow Helm repo:

```bash
helm repo add apache-airflow https://airflow.apache.org
helm repo update
```

Install with default values:

```bash
helm install airflow apache-airflow/airflow \
  --namespace airflow \
  --create-namespace
```

Install with custom values:

```bash
helm install airflow apache-airflow/airflow \
  --namespace airflow \
  --create-namespace \
  -f values.yaml
```
### Key values.yaml Configuration

```yaml
# Executor type
executor: KubernetesExecutor  # or CeleryExecutor, LocalExecutor

# Airflow image (pin to your desired version)
defaultAirflowRepository: apache/airflow
defaultAirflowTag: "3"  # Or pin: "3.1.7"

# Git-sync for DAGs (recommended for production)
dags:
  gitSync:
    enabled: true
    repo: https://github.com/your-org/your-dags.git
    branch: main
    subPath: dags
    wait: 60  # seconds between syncs

# API server (replaces webserver in Airflow 3)
apiServer:
  resources:
    requests:
      cpu: "250m"
      memory: "512Mi"
    limits:
      cpu: "500m"
      memory: "1Gi"
  replicas: 1

# Scheduler
scheduler:
  resources:
    requests:
      cpu: "500m"
      memory: "1Gi"
    limits:
      cpu: "1000m"
      memory: "2Gi"

# Standalone DAG processor
dagProcessor:
  enabled: true
  resources:
    requests:
      cpu: "250m"
      memory: "512Mi"
    limits:
      cpu: "500m"
      memory: "1Gi"

# Triggerer (for deferrable tasks)
triggerer:
  resources:
    requests:
      cpu: "250m"
      memory: "512Mi"
    limits:
      cpu: "500m"
      memory: "1Gi"

# Worker resources (CeleryExecutor only)
workers:
  resources:
    requests:
      cpu: "500m"
      memory: "1Gi"
    limits:
      cpu: "2000m"
      memory: "4Gi"
  replicas: 2

# Log persistence
logs:
  persistence:
    enabled: true
    size: 10Gi

# PostgreSQL (built-in)
postgresql:
  enabled: true

# Or use an external database instead:
# postgresql:
#   enabled: false
# data:
#   metadataConnection:
#     user: airflow
#     pass: airflow
#     host: your-rds-host.amazonaws.com
#     port: 5432
#     db: airflow
```
### Upgrading

Upgrade with new values:

```bash
helm upgrade airflow apache-airflow/airflow \
  --namespace airflow \
  -f values.yaml
```

Upgrade to a new Airflow version:

```bash
helm upgrade airflow apache-airflow/airflow \
  --namespace airflow \
  --set defaultAirflowTag="<version>"
```
undefinedDAG Deployment Strategies on Kubernetes
Kubernetes上的DAG部署策略
- Git-sync (recommended): DAGs are synced from a Git repository automatically
- Persistent Volume: Mount a shared PV containing DAGs
- Baked into image: Include DAGs in a custom Docker image
- **Git-sync(推荐)**DAG自动从Git仓库同步
- 持久化卷:挂载包含DAG的共享PV
- 内置到镜像:将DAG包含在自定义Docker镜像中
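For a private DAG repository, git-sync can authenticate with an SSH key stored in a Kubernetes Secret. A sketch of the `values.yaml` fragment follows; the secret name `airflow-git-ssh` is an assumption, and the chart expects the key under the `gitSshKey` entry:

```yaml
dags:
  gitSync:
    enabled: true
    repo: git@github.com:your-org/your-dags.git  # SSH URL for private repos
    branch: main
    subPath: dags
    # Secret whose 'gitSshKey' entry holds the private key. Assumed name;
    # create it with, e.g.:
    #   kubectl create secret generic airflow-git-ssh \
    #     --from-file=gitSshKey=./id_ed25519 -n airflow
    sshKeySecret: airflow-git-ssh
```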
### Useful Commands

```bash
# Check pod status
kubectl get pods -n airflow

# View scheduler logs
kubectl logs -f deployment/airflow-scheduler -n airflow

# Port-forward the API server
kubectl port-forward svc/airflow-apiserver 8080:8080 -n airflow

# Run a one-off CLI command
kubectl exec -it deployment/airflow-scheduler -n airflow -- airflow dags list
```
---

## Related Skills

- setting-up-astro-project: For initializing a new Astro project
- managing-astro-local-env: For local development with `astro dev`
- authoring-dags: For writing DAGs before deployment
- testing-dags: For testing DAGs before deployment