modal-knowledge
Modal Knowledge Skill
Comprehensive Modal.com platform knowledge covering all features, pricing, and best practices. Activate this skill when users need detailed information about Modal's serverless cloud platform.
Activation Triggers
Activate this skill when users ask about:
- Modal.com platform features and capabilities
- GPU-accelerated Python functions
- Serverless container configuration
- Modal pricing and billing
- Modal CLI commands
- Web endpoints and APIs on Modal
- Scheduled/cron jobs on Modal
- Modal volumes, secrets, and storage
- Parallel processing with Modal
- Modal deployment and CI/CD
Platform Overview
Modal is a serverless cloud platform for running Python code, optimized for AI/ML workloads with:
- Zero Configuration: Everything defined in Python code
- Fast GPU Startup: ~1 second container spin-up
- Automatic Scaling: Scale to zero, scale to thousands
- Per-Second Billing: Only pay for active compute
- Multi-Cloud: AWS, GCP, Oracle Cloud Infrastructure
Core Components Reference
Apps and Functions
```python
import modal

app = modal.App("app-name")

@app.function()
def basic_function(arg: str) -> str:
    return f"Result: {arg}"

@app.local_entrypoint()
def main():
    result = basic_function.remote("test")
    print(result)
```

Function Decorator Parameters
| Parameter | Type | Description |
|---|---|---|
| `image` | Image | Container image configuration |
| `gpu` | str/list | GPU type(s): `"T4"`, `"A100"`, `["H100", "A100"]` |
| `cpu` | float | CPU cores (0.125 to 64) |
| `memory` | int | Memory in MB (128 to 262144) |
| `timeout` | int | Max execution seconds |
| `retries` | int | Retry attempts on failure |
| `secrets` | list | Secrets to inject |
| `volumes` | dict | Volume mount points |
| `schedule` | Cron/Period | Scheduled execution |
| `concurrency_limit` | int | Max concurrent executions |
| `container_idle_timeout` | int | Seconds to keep warm |
| `include_source` | bool | Auto-sync source code |
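To see how decorator parameters like these attach configuration to a function, here is a toy, stdlib-only sketch. It is illustrative only, not Modal's implementation; the `function` name and `_config` attribute are hypothetical.

```python
import functools

def function(**config):
    """Toy decorator that records keyword configuration,
    in the spirit of @app.function(gpu=..., timeout=...).
    Illustration only, not Modal's code."""
    def wrapper(fn):
        @functools.wraps(fn)
        def inner(*args, **kwargs):
            return fn(*args, **kwargs)
        inner._config = config  # hypothetical attribute, for illustration
        return inner
    return wrapper

@function(gpu="A100", timeout=600, retries=3)
def train(step: int) -> int:
    return step + 1

print(train._config["gpu"])  # A100
print(train(1))              # 2
```

In Modal itself the decorator returns a remote-callable handle rather than a plain wrapper, but the configuration-via-keyword-arguments shape is the same.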
GPU Reference
Available GPUs
| GPU | Memory | Use Case | ~Cost/hr |
|---|---|---|---|
| T4 | 16 GB | Small inference | $0.59 |
| L4 | 24 GB | Medium inference | $0.80 |
| A10G | 24 GB | Inference/fine-tuning | $1.10 |
| L40S | 48 GB | Heavy inference | $1.50 |
| A100-40GB | 40 GB | Training | $2.00 |
| A100-80GB | 80 GB | Large models | $3.00 |
| H100 | 80 GB | Cutting-edge | $5.00 |
| H200 | 141 GB | Largest models | $5.00 |
| B200 | 180+ GB | Latest gen | $6.25 |
GPU Configuration
```python
# Single GPU
@app.function(gpu="A100")

# Specific memory variant
@app.function(gpu="A100-80GB")

# Multi-GPU
@app.function(gpu="H100:4")

# Fallbacks (tries in order)
@app.function(gpu=["H100", "A100", "any"])

# "any" = L4, A10G, or T4
@app.function(gpu="any")
```

---

Image Building
Base Images
```python
# Debian slim (recommended)
modal.Image.debian_slim(python_version="3.11")

# From Dockerfile
modal.Image.from_dockerfile("./Dockerfile")

# From Docker registry
modal.Image.from_registry("nvidia/cuda:12.1.0-base-ubuntu22.04")
```

Package Installation
```python
# pip (standard)
image.pip_install("torch", "transformers")

# uv (faster: 10-100x)
image.uv_pip_install("torch", "transformers")

# System packages
image.apt_install("ffmpeg", "libsm6")

# Shell commands
image.run_commands("apt-get update", "make install")
```

Adding Files
```python
# Single file
image.add_local_file("./config.json", "/app/config.json")

# Directory
image.add_local_dir("./models", "/app/models")

# Python source
image.add_local_python_source("my_module")

# Environment variables
image.env({"VAR": "value"})
```

Build-Time Function
```python
def download_model():
    from huggingface_hub import snapshot_download
    snapshot_download("model-name")

image.run_function(download_model, secrets=[...])
```

Storage
Volumes
```python
# Create/reference volume
vol = modal.Volume.from_name("my-vol", create_if_missing=True)

# Mount in function
@app.function(volumes={"/data": vol})
def func():
    # Read/write to /data
    vol.commit()  # Persist changes
```

Secrets
```python
# From dashboard (recommended)
modal.Secret.from_name("secret-name")

# From dictionary
modal.Secret.from_dict({"KEY": "value"})

# From local env
modal.Secret.from_local_environ(["KEY1", "KEY2"])

# From .env file
modal.Secret.from_dotenv()
```
Usage
```python
@app.function(secrets=[modal.Secret.from_name("api-keys")])
def func():
    import os
    key = os.environ["API_KEY"]
```

Dict and Queue
```python
# Distributed dict
d = modal.Dict.from_name("cache", create_if_missing=True)
d["key"] = "value"
d.put("key", "value", ttl=3600)

# Distributed queue
q = modal.Queue.from_name("jobs", create_if_missing=True)
q.put("task")
item = q.get()
```
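The `ttl` semantics of `d.put` can be pictured with a small stdlib-only sketch. The `TTLDict` class here is hypothetical and only illustrates the expiry behavior; it is not how `modal.Dict` is implemented.

```python
import time

class TTLDict:
    """Toy dict where entries written with a ttl expire after that
    many seconds. Illustration of modal.Dict's ttl behavior only."""
    def __init__(self):
        self._data = {}

    def put(self, key, value, ttl=None):
        # Record an absolute expiry time, or None for "never expires"
        expires = time.monotonic() + ttl if ttl else None
        self._data[key] = (value, expires)

    def get(self, key, default=None):
        value, expires = self._data.get(key, (default, None))
        if expires is not None and time.monotonic() > expires:
            del self._data[key]  # lazily evict the stale entry
            return default
        return value

d = TTLDict()
d.put("key", "value", ttl=3600)
print(d.get("key"))  # value
```

The real `modal.Dict` is a distributed store shared across containers, so expiry is enforced server-side rather than on read.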
---

Web Endpoints
FastAPI Endpoint (Simple)
```python
@app.function()
@modal.fastapi_endpoint()
def hello(name: str = "World"):
    return {"message": f"Hello, {name}!"}
```

ASGI App (Full FastAPI)
```python
from fastapi import FastAPI

web_app = FastAPI()

@web_app.post("/predict")
def predict(text: str):
    return {"result": process(text)}

@app.function()
@modal.asgi_app()
def fastapi_app():
    return web_app
```

WSGI App (Flask)
```python
from flask import Flask

flask_app = Flask(__name__)

@app.function()
@modal.wsgi_app()
def flask_endpoint():
    return flask_app
```

Custom Web Server
```python
import subprocess

@app.function()
@modal.web_server(port=8000)
def custom_server():
    subprocess.run(["python", "-m", "http.server", "8000"])
```

Custom Domains
```python
@modal.asgi_app(custom_domains=["api.example.com"])
```

Scheduling
Cron
```python
# Daily at 8 AM UTC
@app.function(schedule=modal.Cron("0 8 * * *"))

# With timezone
@app.function(schedule=modal.Cron("0 6 * * *", timezone="America/New_York"))
```
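For intuition, a cron expression like `0 8 * * *` fires once per day at the given hour. A minimal stdlib sketch of computing the next UTC run for such a daily schedule (the `next_daily_run` helper is hypothetical, not part of Modal):

```python
from datetime import datetime, timedelta, timezone

def next_daily_run(hour: int, now: datetime) -> datetime:
    """Next occurrence of a daily `0 H * * *` cron schedule, in UTC."""
    candidate = now.replace(hour=hour, minute=0, second=0, microsecond=0)
    if candidate <= now:
        # Today's slot already passed; fire tomorrow
        candidate += timedelta(days=1)
    return candidate

now = datetime(2025, 1, 15, 9, 30, tzinfo=timezone.utc)
print(next_daily_run(8, now))  # 2025-01-16 08:00:00+00:00
```

Modal evaluates the full five-field cron grammar server-side; this sketch covers only the daily special case.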
undefinedPeriod
```python
@app.function(schedule=modal.Period(hours=5))
@app.function(schedule=modal.Period(days=1))
```

Note: Scheduled functions only run with `modal deploy`, not `modal run`.

Parallel Processing
Map
```python
# Parallel execution (up to 1000 concurrent)
results = list(func.map(items))

# Unordered (faster)
results = list(func.map(items, order_outputs=False))
```
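The ordered-versus-unordered distinction can be mimicked locally with stdlib futures. This is an analogy only: `func.map` fans work out to Modal containers, while the sketch below uses local threads.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def square(x: int) -> int:
    return x * x

items = [3, 1, 2]
with ThreadPoolExecutor() as pool:
    # Ordered, like list(func.map(items)): outputs line up with inputs
    ordered = list(pool.map(square, items))

    # Unordered, like order_outputs=False: results arrive as they finish
    futures = [pool.submit(square, i) for i in items]
    finished = [f.result() for f in as_completed(futures)]

print(ordered)           # [9, 1, 4]
print(sorted(finished))  # [1, 4, 9]
```

With `order_outputs=False` you trade positional correspondence for lower latency, since a slow input no longer blocks the results behind it.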
Starmap
```python
# Spread args across calls
pairs = [(1, 2), (3, 4)]
results = list(add.starmap(pairs))
```

Spawn
```python
# Async job (returns immediately)
call = func.spawn(data)
result = call.get()  # Get result later

# Spawn many
calls = [func.spawn(item) for item in items]
results = [call.get() for call in calls]
```

---

Container Lifecycle (Classes)
```python
@app.cls(gpu="A100", container_idle_timeout=300)
class Server:
    @modal.enter()
    def load(self):
        self.model = load_model()

    @modal.method()
    def predict(self, text):
        return self.model(text)

    @modal.exit()
    def cleanup(self):
        del self.model
```

Concurrency
```python
@modal.concurrent(max_inputs=100, target_inputs=80)
@modal.method()
def batched(self, item):
    pass
```

CLI Commands
Development
```bash
modal run app.py               # Run function
modal serve app.py             # Hot-reload dev server
modal shell app.py             # Interactive shell
modal shell app.py --gpu A100  # Shell with GPU
```

Deployment
```bash
modal deploy app.py      # Deploy
modal app list           # List apps
modal app logs app-name  # View logs
modal app stop app-name  # Stop app
```

Resources
```bash
# Volumes
modal volume create name
modal volume list
modal volume put name local remote
modal volume get name remote local

# Secrets
modal secret create name KEY=value
modal secret list

# Environments
modal environment create staging
```

---

Pricing (2025)
Plans
| Plan | Price | Containers | GPU Concurrency |
|---|---|---|---|
| Starter | Free ($30 credits) | 100 | 10 |
| Team | $250/month | 1000 | 50 |
| Enterprise | Custom | Unlimited | Custom |
Compute
- CPU: $0.0000131/core/sec
- Memory: $0.00000222/GiB/sec
- GPUs: See GPU table above
Special Programs
- Startups: Up to $25k credits
- Researchers: Up to $10k credits
Best Practices
- Use `@modal.enter()` for model loading
- Use `uv_pip_install` for faster builds
- Use GPU fallbacks for availability
- Set appropriate timeouts and retries
- Use environments (dev/staging/prod)
- Download models during build, not runtime
- Use `order_outputs=False` when order doesn't matter
- Set `container_idle_timeout` to balance cost/latency
- Monitor costs in Modal dashboard
- Test with `modal run` before `modal deploy`
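On "set appropriate timeouts and retries": retry policies of this kind commonly use exponential backoff between attempts. A small hypothetical helper (not a Modal API) showing the delay schedule such a policy produces:

```python
def retry_delays(initial_delay: float, backoff_coefficient: float,
                 max_retries: int) -> list:
    """Delay in seconds before each retry under exponential backoff.
    Parameter names merely echo common backoff settings; hypothetical helper."""
    return [initial_delay * backoff_coefficient ** i for i in range(max_retries)]

print(retry_delays(1.0, 2.0, 4))  # [1.0, 2.0, 4.0, 8.0]
```

The takeaway: a handful of retries with a coefficient of 2 keeps total added latency bounded while riding out transient failures.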
Common Patterns
LLM Inference
```python
@app.cls(gpu="A100", container_idle_timeout=300)
class LLM:
    @modal.enter()
    def load(self):
        from vllm import LLM
        self.llm = LLM(model="...")

    @modal.method()
    def generate(self, prompt):
        return self.llm.generate([prompt])
```

Batch Processing
```python
@app.function(volumes={"/data": vol})
def process(file):
    # Process file
    vol.commit()

# Parallel
results = list(process.map(files))
```

Scheduled ETL
```python
@app.function(
    schedule=modal.Cron("0 6 * * *"),
    secrets=[modal.Secret.from_name("db")]
)
def daily_etl():
    extract()
    transform()
    load()
```

Quick Reference
| Task | Code |
|---|---|
| Create app | `app = modal.App("name")` |
| Basic function | `@app.function()` |
| With GPU | `@app.function(gpu="A100")` |
| With image | `@app.function(image=image)` |
| Web endpoint | `@modal.fastapi_endpoint()` |
| Scheduled | `@app.function(schedule=modal.Cron("0 8 * * *"))` |
| Mount volume | `@app.function(volumes={"/data": vol})` |
| Use secret | `@app.function(secrets=[modal.Secret.from_name("name")])` |
| Parallel map | `func.map(items)` |
| Async spawn | `func.spawn(data)` |
| Class pattern | `@app.cls()` |