modal-knowledge
Modal Knowledge Skill
Comprehensive Modal.com platform knowledge covering all features, pricing, and best practices. Activate this skill when users need detailed information about Modal's serverless cloud platform.
Activation Triggers
Activate this skill when users ask about:
- Modal.com platform features and capabilities
- GPU-accelerated Python functions
- Serverless container configuration
- Modal pricing and billing
- Modal CLI commands
- Web endpoints and APIs on Modal
- Scheduled/cron jobs on Modal
- Modal volumes, secrets, and storage
- Parallel processing with Modal
- Modal deployment and CI/CD
Platform Overview
Modal is a serverless cloud platform for running Python code, optimized for AI/ML workloads with:
- Zero Configuration: Everything defined in Python code
- Fast GPU Startup: ~1 second container spin-up
- Automatic Scaling: Scale to zero, scale to thousands
- Per-Second Billing: Only pay for active compute
- Multi-Cloud: AWS, GCP, Oracle Cloud Infrastructure
Core Components Reference
Apps and Functions
```python
import modal

app = modal.App("app-name")

@app.function()
def basic_function(arg: str) -> str:
    return f"Result: {arg}"

@app.local_entrypoint()
def main():
    result = basic_function.remote("test")
    print(result)
```

Function Decorator Parameters
| Parameter | Type | Description |
|---|---|---|
| `image` | Image | Container image configuration |
| `gpu` | str/list | GPU type(s): `"T4"`, `"A100"`, `["H100", "A100"]` |
| `cpu` | float | CPU cores (0.125 to 64) |
| `memory` | int | Memory in MB (128 to 262144) |
| `timeout` | int | Max execution seconds |
| `retries` | int | Retry attempts on failure |
| `secrets` | list | Secrets to inject |
| `volumes` | dict | Volume mount points |
| `schedule` | Cron/Period | Scheduled execution |
| `concurrency_limit` | int | Max concurrent executions |
| `container_idle_timeout` | int | Seconds to keep warm |
| `include_source` | bool | Auto-sync source code |
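To see how decorator parameters like these attach configuration to a function, here is a toy, stdlib-only sketch. It is illustrative only, not Modal's implementation; the `function` name and `_config` attribute are hypothetical.

```python
import functools

def function(**config):
    """Toy decorator that records keyword configuration,
    in the spirit of @app.function(gpu=..., timeout=...).
    Illustration only, not Modal's code."""
    def wrapper(fn):
        @functools.wraps(fn)
        def inner(*args, **kwargs):
            return fn(*args, **kwargs)
        inner._config = config  # hypothetical attribute, for illustration
        return inner
    return wrapper

@function(gpu="A100", timeout=600, retries=3)
def train(step: int) -> int:
    return step + 1

print(train._config["gpu"])  # A100
print(train(1))              # 2
```

In Modal itself the decorator returns a remote-callable handle rather than a plain wrapper, but the configuration-via-keyword-arguments shape is the same.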
GPU Reference
Available GPUs
| GPU | Memory | Use Case | ~Cost/hr |
|---|---|---|---|
| T4 | 16 GB | Small inference | $0.59 |
| L4 | 24 GB | Medium inference | $0.80 |
| A10G | 24 GB | Inference/fine-tuning | $1.10 |
| L40S | 48 GB | Heavy inference | $1.50 |
| A100-40GB | 40 GB | Training | $2.00 |
| A100-80GB | 80 GB | Large models | $3.00 |
| H100 | 80 GB | Cutting-edge | $5.00 |
| H200 | 141 GB | Largest models | $5.00 |
| B200 | 180+ GB | Latest gen | $6.25 |
GPU Configuration
```python
# Single GPU
@app.function(gpu="A100")

# Specific memory variant
@app.function(gpu="A100-80GB")

# Multi-GPU
@app.function(gpu="H100:4")

# Fallbacks (tries in order)
@app.function(gpu=["H100", "A100", "any"])

# "any" = L4, A10G, or T4
@app.function(gpu="any")
```

---

Image Building
Base Images
```python
# Debian slim (recommended)
modal.Image.debian_slim(python_version="3.11")

# From Dockerfile
modal.Image.from_dockerfile("./Dockerfile")

# From Docker registry
modal.Image.from_registry("nvidia/cuda:12.1.0-base-ubuntu22.04")
```

Package Installation
```python
# pip (standard)
image.pip_install("torch", "transformers")

# uv (faster: 10-100x)
image.uv_pip_install("torch", "transformers")

# System packages
image.apt_install("ffmpeg", "libsm6")

# Shell commands
image.run_commands("apt-get update", "make install")
```

Adding Files
```python
# Single file
image.add_local_file("./config.json", "/app/config.json")

# Directory
image.add_local_dir("./models", "/app/models")

# Python source
image.add_local_python_source("my_module")

# Environment variables
image.env({"VAR": "value"})
```

Build-Time Function
```python
def download_model():
    from huggingface_hub import snapshot_download
    snapshot_download("model-name")

image.run_function(download_model, secrets=[...])
```

Storage
Volumes
```python
# Create/reference volume
vol = modal.Volume.from_name("my-vol", create_if_missing=True)

# Mount in function
@app.function(volumes={"/data": vol})
def func():
    # Read/write to /data
    vol.commit()  # Persist changes
```

Secrets
```python
# From dashboard (recommended)
modal.Secret.from_name("secret-name")

# From dictionary
modal.Secret.from_dict({"KEY": "value"})

# From local env
modal.Secret.from_local_environ(["KEY1", "KEY2"])

# From .env file
modal.Secret.from_dotenv()
```
Usage
```python
@app.function(secrets=[modal.Secret.from_name("api-keys")])
def func():
    import os
    key = os.environ["API_KEY"]
```

Dict and Queue
```python
# Distributed dict
d = modal.Dict.from_name("cache", create_if_missing=True)
d["key"] = "value"
d.put("key", "value", ttl=3600)

# Distributed queue
q = modal.Queue.from_name("jobs", create_if_missing=True)
q.put("task")
item = q.get()
```
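The `ttl` semantics of `d.put` can be pictured with a small stdlib-only sketch. The `TTLDict` class here is hypothetical and only illustrates the expiry behavior; it is not how `modal.Dict` is implemented.

```python
import time

class TTLDict:
    """Toy dict where entries written with a ttl expire after that
    many seconds. Illustration of modal.Dict's ttl behavior only."""
    def __init__(self):
        self._data = {}

    def put(self, key, value, ttl=None):
        # Record an absolute expiry time, or None for "never expires"
        expires = time.monotonic() + ttl if ttl else None
        self._data[key] = (value, expires)

    def get(self, key, default=None):
        value, expires = self._data.get(key, (default, None))
        if expires is not None and time.monotonic() > expires:
            del self._data[key]  # lazily evict the stale entry
            return default
        return value

d = TTLDict()
d.put("key", "value", ttl=3600)
print(d.get("key"))  # value
```

The real `modal.Dict` is a distributed store shared across containers, so expiry is enforced server-side rather than on read.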
---

Web Endpoints
FastAPI Endpoint (Simple)
```python
@app.function()
@modal.fastapi_endpoint()
def hello(name: str = "World"):
    return {"message": f"Hello, {name}!"}
```

ASGI App (Full FastAPI)
```python
from fastapi import FastAPI

web_app = FastAPI()

@web_app.post("/predict")
def predict(text: str):
    return {"result": process(text)}

@app.function()
@modal.asgi_app()
def fastapi_app():
    return web_app
```

WSGI App (Flask)
```python
from flask import Flask

flask_app = Flask(__name__)

@app.function()
@modal.wsgi_app()
def flask_endpoint():
    return flask_app
```

Custom Web Server
```python
import subprocess

@app.function()
@modal.web_server(port=8000)
def custom_server():
    subprocess.run(["python", "-m", "http.server", "8000"])
```

Custom Domains
```python
@modal.asgi_app(custom_domains=["api.example.com"])
```

Scheduling
Cron
```python
# Daily at 8 AM UTC
@app.function(schedule=modal.Cron("0 8 * * *"))

# With timezone
@app.function(schedule=modal.Cron("0 6 * * *", timezone="America/New_York"))
```
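For intuition, a cron expression like `0 8 * * *` fires once per day at the given hour. A minimal stdlib sketch of computing the next UTC run for such a daily schedule (the `next_daily_run` helper is hypothetical, not part of Modal):

```python
from datetime import datetime, timedelta, timezone

def next_daily_run(hour: int, now: datetime) -> datetime:
    """Next occurrence of a daily `0 H * * *` cron schedule, in UTC."""
    candidate = now.replace(hour=hour, minute=0, second=0, microsecond=0)
    if candidate <= now:
        # Today's slot already passed; fire tomorrow
        candidate += timedelta(days=1)
    return candidate

now = datetime(2025, 1, 15, 9, 30, tzinfo=timezone.utc)
print(next_daily_run(8, now))  # 2025-01-16 08:00:00+00:00
```

Modal evaluates the full five-field cron grammar server-side; this sketch covers only the daily special case.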
undefinedPeriod
```python
@app.function(schedule=modal.Period(hours=5))
@app.function(schedule=modal.Period(days=1))
```

Note: Scheduled functions only run with `modal deploy`, not `modal run`.

Parallel Processing
Map
```python
# Parallel execution (up to 1000 concurrent)
results = list(func.map(items))

# Unordered (faster)
results = list(func.map(items, order_outputs=False))
```
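The ordered-versus-unordered distinction can be mimicked locally with stdlib futures. This is an analogy only: `func.map` fans work out to Modal containers, while the sketch below uses local threads.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def square(x: int) -> int:
    return x * x

items = [3, 1, 2]
with ThreadPoolExecutor() as pool:
    # Ordered, like list(func.map(items)): outputs line up with inputs
    ordered = list(pool.map(square, items))

    # Unordered, like order_outputs=False: results arrive as they finish
    futures = [pool.submit(square, i) for i in items]
    finished = [f.result() for f in as_completed(futures)]

print(ordered)           # [9, 1, 4]
print(sorted(finished))  # [1, 4, 9]
```

With `order_outputs=False` you trade positional correspondence for lower latency, since a slow input no longer blocks the results behind it.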
Starmap
```python
# Spread args across calls
pairs = [(1, 2), (3, 4)]
results = list(add.starmap(pairs))
```

Spawn
```python
# Async job (returns immediately)
call = func.spawn(data)
result = call.get()  # Get result later

# Spawn many
calls = [func.spawn(item) for item in items]
results = [call.get() for call in calls]
```

---

Container Lifecycle (Classes)
```python
@app.cls(gpu="A100", container_idle_timeout=300)
class Server:
    @modal.enter()
    def load(self):
        self.model = load_model()

    @modal.method()
    def predict(self, text):
        return self.model(text)

    @modal.exit()
    def cleanup(self):
        del self.model
```

Concurrency
```python
@modal.concurrent(max_inputs=100, target_inputs=80)
@modal.method()
def batched(self, item):
    pass
```

CLI Commands
Development
```bash
modal run app.py               # Run function
modal serve app.py             # Hot-reload dev server
modal shell app.py             # Interactive shell
modal shell app.py --gpu A100  # Shell with GPU
```

Deployment
```bash
modal deploy app.py      # Deploy
modal app list           # List apps
modal app logs app-name  # View logs
modal app stop app-name  # Stop app
```

Resources
```bash
# Volumes
modal volume create name
modal volume list
modal volume put name local remote
modal volume get name remote local

# Secrets
modal secret create name KEY=value
modal secret list

# Environments
modal environment create staging
```

---

Pricing (2025)
Plans
| Plan | Price | Containers | GPU Concurrency |
|---|---|---|---|
| Starter | Free ($30 credits) | 100 | 10 |
| Team | $250/month | 1000 | 50 |
| Enterprise | Custom | Unlimited | Custom |
Compute
- CPU: $0.0000131/core/sec
- Memory: $0.00000222/GiB/sec
- GPUs: See GPU table above
Special Programs
- Startups: Up to $25k credits
- Researchers: Up to $10k credits
Best Practices
- Use `@modal.enter()` for model loading
- Use `uv_pip_install` for faster builds
- Use GPU fallbacks for availability
- Set appropriate timeouts and retries
- Use environments (dev/staging/prod)
- Download models during build, not runtime
- Use `order_outputs=False` when order doesn't matter
- Set `container_idle_timeout` to balance cost/latency
- Monitor costs in Modal dashboard
- Test with `modal run` before `modal deploy`
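On "set appropriate timeouts and retries": retry policies of this kind commonly use exponential backoff between attempts. A small hypothetical helper (not a Modal API) showing the delay schedule such a policy produces:

```python
def retry_delays(initial_delay: float, backoff_coefficient: float,
                 max_retries: int) -> list:
    """Delay in seconds before each retry under exponential backoff.
    Parameter names merely echo common backoff settings; hypothetical helper."""
    return [initial_delay * backoff_coefficient ** i for i in range(max_retries)]

print(retry_delays(1.0, 2.0, 4))  # [1.0, 2.0, 4.0, 8.0]
```

The takeaway: a handful of retries with a coefficient of 2 keeps total added latency bounded while riding out transient failures.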
Common Patterns
LLM Inference
```python
@app.cls(gpu="A100", container_idle_timeout=300)
class LLM:
    @modal.enter()
    def load(self):
        from vllm import LLM
        self.llm = LLM(model="...")

    @modal.method()
    def generate(self, prompt):
        return self.llm.generate([prompt])
```

Batch Processing
```python
@app.function(volumes={"/data": vol})
def process(file):
    # Process file
    vol.commit()

# Parallel
results = list(process.map(files))
```

Scheduled ETL
```python
@app.function(
    schedule=modal.Cron("0 6 * * *"),
    secrets=[modal.Secret.from_name("db")]
)
def daily_etl():
    extract()
    transform()
    load()
```

Quick Reference
| Task | Code |
|---|---|
| Create app | `app = modal.App("name")` |
| Basic function | `@app.function()` |
| With GPU | `@app.function(gpu="A100")` |
| With image | `@app.function(image=image)` |
| Web endpoint | `@modal.fastapi_endpoint()` |
| Scheduled | `@app.function(schedule=modal.Cron("0 8 * * *"))` |
| Mount volume | `@app.function(volumes={"/data": vol})` |
| Use secret | `@app.function(secrets=[modal.Secret.from_name("name")])` |
| Parallel map | `func.map(items)` |
| Async spawn | `func.spawn(data)` |
| Class pattern | `@app.cls()` |