modal-knowledge


Modal Knowledge Skill

Comprehensive Modal.com platform knowledge covering all features, pricing, and best practices. Activate this skill when users need detailed information about Modal's serverless cloud platform.

Activation Triggers

Activate this skill when users ask about:
  • Modal.com platform features and capabilities
  • GPU-accelerated Python functions
  • Serverless container configuration
  • Modal pricing and billing
  • Modal CLI commands
  • Web endpoints and APIs on Modal
  • Scheduled/cron jobs on Modal
  • Modal volumes, secrets, and storage
  • Parallel processing with Modal
  • Modal deployment and CI/CD

Platform Overview

Modal is a serverless cloud platform for running Python code, optimized for AI/ML workloads with:
  • Zero Configuration: Everything defined in Python code
  • Fast GPU Startup: ~1 second container spin-up
  • Automatic Scaling: Scale to zero, scale to thousands
  • Per-Second Billing: Only pay for active compute
  • Multi-Cloud: AWS, GCP, Oracle Cloud Infrastructure

Core Components Reference

Apps and Functions

python
import modal

app = modal.App("app-name")

@app.function()
def basic_function(arg: str) -> str:
    return f"Result: {arg}"

@app.local_entrypoint()
def main():
    result = basic_function.remote("test")
    print(result)

Function Decorator Parameters

| Parameter | Type | Description |
|---|---|---|
| image | Image | Container image configuration |
| gpu | str/list | GPU type(s): "T4", "A100", ["H100", "A100"] |
| cpu | float | CPU cores (0.125 to 64) |
| memory | int | Memory in MB (128 to 262144) |
| timeout | int | Max execution seconds |
| retries | int | Retry attempts on failure |
| secrets | list | Secrets to inject |
| volumes | dict | Volume mount points |
| schedule | Cron/Period | Scheduled execution |
| concurrency_limit | int | Max concurrent executions |
| container_idle_timeout | int | Seconds to keep warm |
| include_source | bool | Auto-sync source code |

GPU Reference

Available GPUs

| GPU | Memory | Use Case | ~Cost/hr |
|---|---|---|---|
| T4 | 16 GB | Small inference | $0.59 |
| L4 | 24 GB | Medium inference | $0.80 |
| A10G | 24 GB | Inference/fine-tuning | $1.10 |
| L40S | 48 GB | Heavy inference | $1.50 |
| A100-40GB | 40 GB | Training | $2.00 |
| A100-80GB | 80 GB | Large models | $3.00 |
| H100 | 80 GB | Cutting-edge | $5.00 |
| H200 | 141 GB | Largest models | $5.00 |
| B200 | 180+ GB | Latest gen | $6.25 |
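
Because billing is per second, the hourly figures in the table translate into small per-second rates. The sketch below shows that arithmetic. The prices are copied from the approximate table above, and `estimate_gpu_cost` is a hypothetical helper written for illustration, not part of the Modal SDK.

```python
# Approximate prices copied from the GPU table (USD per GPU-hour).
# These are the document's rough figures, not live pricing.
GPU_PRICE_PER_HOUR = {
    "T4": 0.59, "L4": 0.80, "A10G": 1.10, "L40S": 1.50,
    "A100-40GB": 2.00, "A100-80GB": 3.00,
    "H100": 5.00, "H200": 5.00, "B200": 6.25,
}


def estimate_gpu_cost(gpu: str, seconds: float, count: int = 1) -> float:
    """Approximate USD cost of running `count` GPUs of one type for `seconds`."""
    per_second = GPU_PRICE_PER_HOUR[gpu] / 3600  # hourly rate -> per-second rate
    return per_second * seconds * count


if __name__ == "__main__":
    # Four H100s for 90 seconds.
    print(f"${estimate_gpu_cost('H100', seconds=90, count=4):.2f}")
```

The estimate covers GPU time only; CPU and memory are billed separately at the rates listed under Pricing.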

GPU Configuration

python
# Single GPU
@app.function(gpu="A100")

# Specific memory variant
@app.function(gpu="A100-80GB")

# Multi-GPU
@app.function(gpu="H100:4")

# Fallbacks (tries in order)
@app.function(gpu=["H100", "A100", "any"])

# "any" = L4, A10G, or T4
@app.function(gpu="any")
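
The GPU strings above follow a `TYPE` or `TYPE:COUNT` shape. As a way to read them, here is a minimal sketch that splits such a string into its two parts. `parse_gpu_spec` is a hypothetical helper for illustration only; it is not how Modal parses the argument internally.

```python
from typing import Tuple


def parse_gpu_spec(spec: str) -> Tuple[str, int]:
    """Split a spec such as "H100:4" into (gpu_type, count); count defaults to 1."""
    gpu_type, sep, count = spec.partition(":")
    return gpu_type, int(count) if sep else 1


if __name__ == "__main__":
    # A fallback list is simply several specs tried in order.
    for spec in ["H100:4", "A100-80GB", "any"]:
        print(spec, "->", parse_gpu_spec(spec))
```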

Image Building

Base Images

python
# Debian slim (recommended)
modal.Image.debian_slim(python_version="3.11")

# From Dockerfile
modal.Image.from_dockerfile("./Dockerfile")

# From Docker registry
modal.Image.from_registry("nvidia/cuda:12.1.0-base-ubuntu22.04")

Package Installation

python
# pip (standard)
image.pip_install("torch", "transformers")

# uv (faster, 10-100x)
image.uv_pip_install("torch", "transformers")

# System packages
image.apt_install("ffmpeg", "libsm6")

# Shell commands
image.run_commands("apt-get update", "make install")

Adding Files

python
# Single file
image.add_local_file("./config.json", "/app/config.json")

# Directory
image.add_local_dir("./models", "/app/models")

# Python source
image.add_local_python_source("my_module")

# Environment variables
image.env({"VAR": "value"})

Build-Time Function

python
def download_model():
    from huggingface_hub import snapshot_download
    snapshot_download("model-name")

image.run_function(download_model, secrets=[...])

Storage

Volumes

python
# Create/reference volume
vol = modal.Volume.from_name("my-vol", create_if_missing=True)

# Mount in function
@app.function(volumes={"/data": vol})
def func():
    # Read/write to /data
    vol.commit()  # Persist changes

Secrets

python
# From dashboard (recommended)
modal.Secret.from_name("secret-name")

# From dictionary
modal.Secret.from_dict({"KEY": "value"})

# From local env
modal.Secret.from_local_environ(["KEY1", "KEY2"])

# From .env file
modal.Secret.from_dotenv()

# Usage
@app.function(secrets=[modal.Secret.from_name("api-keys")])
def func():
    import os
    key = os.environ["API_KEY"]

Dict and Queue

python
# Distributed dict
d = modal.Dict.from_name("cache", create_if_missing=True)
d["key"] = "value"
d.put("key", "value", ttl=3600)

# Distributed queue
q = modal.Queue.from_name("jobs", create_if_missing=True)
q.put("task")
item = q.get()

Web Endpoints

FastAPI Endpoint (Simple)

python
@app.function()
@modal.fastapi_endpoint()
def hello(name: str = "World"):
    return {"message": f"Hello, {name}!"}

ASGI App (Full FastAPI)

python
from fastapi import FastAPI
web_app = FastAPI()

@web_app.post("/predict")
def predict(text: str):
    return {"result": process(text)}

@app.function()
@modal.asgi_app()
def fastapi_app():
    return web_app

WSGI App (Flask)

python
from flask import Flask
flask_app = Flask(__name__)

@app.function()
@modal.wsgi_app()
def flask_endpoint():
    return flask_app

Custom Web Server

python
import subprocess

@app.function()
@modal.web_server(port=8000)
def custom_server():
    # Launch the server in the background so this function returns instead of blocking.
    subprocess.Popen(["python", "-m", "http.server", "8000"])

Custom Domains

python
@modal.asgi_app(custom_domains=["api.example.com"])

Scheduling

Cron

python
# Daily at 8 AM UTC
@app.function(schedule=modal.Cron("0 8 * * *"))

# With timezone
@app.function(schedule=modal.Cron("0 6 * * *", timezone="America/New_York"))
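
The cron strings above use the standard five-field layout: minute, hour, day of month, month, day of week. A small sketch that labels each field can make an expression easier to read. `describe_cron` is a hypothetical helper for illustration; it only splits the string and does not validate ranges or compute run times.

```python
CRON_FIELDS = ("minute", "hour", "day_of_month", "month", "day_of_week")


def describe_cron(expr: str) -> dict:
    """Label the five whitespace-separated fields of a cron expression."""
    parts = expr.split()
    if len(parts) != len(CRON_FIELDS):
        raise ValueError(f"expected 5 fields, got {len(parts)}: {expr!r}")
    return dict(zip(CRON_FIELDS, parts))


if __name__ == "__main__":
    # "0 8 * * *" reads as minute 0 of hour 8, every day.
    print(describe_cron("0 8 * * *"))
```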

Period

python
@app.function(schedule=modal.Period(hours=5))
@app.function(schedule=modal.Period(days=1))

Note: Scheduled functions only run with modal deploy, not modal run.

Parallel Processing

Map

python
# Parallel execution (up to 1000 concurrent)
results = list(func.map(items))

# Unordered (faster)
results = list(func.map(items, order_outputs=False))
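
The difference between ordered and unordered results can be seen locally with the standard library. The sketch below is an analogy using `concurrent.futures`, not Modal code: `pool.map` returns results in input order, while `as_completed` yields each result as soon as it finishes, which is the trade-off `order_outputs=False` makes. The `work` function and the delays are made up for the demonstration.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def work(delay: float) -> float:
    """Stand-in task: sleep for `delay` seconds, then return it."""
    time.sleep(delay)
    return delay


delays = [0.3, 0.1, 0.2]

with ThreadPoolExecutor(max_workers=3) as pool:
    # Ordered: results line up with the inputs, so the slow first item is waited on.
    ordered = list(pool.map(work, delays))

    # Unordered: results are collected in completion order instead.
    futures = [pool.submit(work, d) for d in delays]
    unordered = [f.result() for f in as_completed(futures)]

print("ordered:  ", ordered)
print("unordered:", unordered)
```

With unordered collection the shortest task is typically seen first, so a consumer can start on finished results without waiting for the slowest input.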

Starmap

python
# Spread args
pairs = [(1, 2), (3, 4)]
results = list(add.starmap(pairs))

Spawn

python
# Async job (returns immediately)
call = func.spawn(data)
result = call.get()  # Get result later

# Spawn many
calls = [func.spawn(item) for item in items]
results = [call.get() for call in calls]

Container Lifecycle (Classes)

python
@app.cls(gpu="A100", container_idle_timeout=300)
class Server:

    @modal.enter()
    def load(self):
        self.model = load_model()

    @modal.method()
    def predict(self, text):
        return self.model(text)

    @modal.exit()
    def cleanup(self):
        del self.model
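
The ordering implied by the lifecycle hooks (setup once, one call per input, cleanup at shutdown) can be sketched in plain Python. This is an analogy only: `FakeServer` and `run_container` are hypothetical names, the "model" is just `str.upper`, and nothing here reflects how Modal schedules containers.

```python
class FakeServer:
    """Plain-Python stand-in for a class with enter, method, and exit hooks."""

    def __init__(self):
        self.events = []

    def enter(self):
        # Expensive setup happens once per container, not once per request.
        self.events.append("enter")
        self.model = str.upper  # placeholder for a loaded model

    def predict(self, text):
        self.events.append("predict")
        return self.model(text)

    def exit(self):
        self.events.append("exit")
        del self.model


def run_container(server, inputs):
    """Run setup once, handle every input, then always run cleanup."""
    server.enter()
    try:
        return [server.predict(x) for x in inputs]
    finally:
        server.exit()


if __name__ == "__main__":
    server = FakeServer()
    print(run_container(server, ["a", "b"]))
    print(server.events)
```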

Concurrency

python
@modal.concurrent(max_inputs=100, target_inputs=80)
@modal.method()
def batched(self, item):
    pass

CLI Commands

Development

bash
modal run app.py              # Run function
modal serve app.py            # Hot-reload dev server
modal shell app.py            # Interactive shell
modal shell app.py --gpu A100 # Shell with GPU

Deployment

bash
modal deploy app.py           # Deploy
modal app list                # List apps
modal app logs app-name       # View logs
modal app stop app-name       # Stop app

Resources

bash
# Volumes
modal volume create name
modal volume list
modal volume put name local remote
modal volume get name remote local

# Secrets
modal secret create name KEY=value
modal secret list

# Environments
modal environment create staging

Pricing (2025)

Plans

| Plan | Price | Containers | GPU Concurrency |
|---|---|---|---|
| Starter | Free ($30 credits) | 100 | 10 |
| Team | $250/month | 1000 | 50 |
| Enterprise | Custom | Unlimited | Custom |

Compute

  • CPU: $0.0000131/core/sec
  • Memory: $0.00000222/GiB/sec
  • GPUs: See GPU table above
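
A short worked example may help with reading the per-second rates. The sketch below multiplies the two listed rates by a function's reserved cores, memory, and runtime. `estimate_compute_cost` is a hypothetical helper, and treating the `memory` setting as MiB (1024 per GiB) is an assumption made for the conversion.

```python
# Rates copied from the list above (USD).
CPU_USD_PER_CORE_SEC = 0.0000131
MEM_USD_PER_GIB_SEC = 0.00000222


def estimate_compute_cost(cores: float, memory_mb: int, seconds: float) -> float:
    """Approximate CPU plus memory cost in USD, excluding any GPU charge."""
    memory_gib = memory_mb / 1024  # assumption: the MB setting is treated as MiB
    per_second = cores * CPU_USD_PER_CORE_SEC + memory_gib * MEM_USD_PER_GIB_SEC
    return per_second * seconds


if __name__ == "__main__":
    # 2 cores and 4096 MB for one hour.
    print(f"${estimate_compute_cost(2, 4096, 3600):.4f}")
```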

Special Programs

  • Startups: Up to $25k credits
  • Researchers: Up to $10k credits

Best Practices

  1. Use @modal.enter() for model loading
  2. Use uv_pip_install for faster builds
  3. Use GPU fallbacks for availability
  4. Set appropriate timeouts and retries
  5. Use environments (dev/staging/prod)
  6. Download models during build, not runtime
  7. Use order_outputs=False when order doesn't matter
  8. Set container_idle_timeout to balance cost/latency
  9. Monitor costs in Modal dashboard
  10. Test with modal run before modal deploy

Common Patterns

LLM Inference

python
@app.cls(gpu="A100", container_idle_timeout=300)
class LLM:
    @modal.enter()
    def load(self):
        from vllm import LLM
        self.llm = LLM(model="...")

    @modal.method()
    def generate(self, prompt):
        return self.llm.generate([prompt])

Batch Processing

python
@app.function(volumes={"/data": vol})
def process(file):
    # Process file
    vol.commit()

# Parallel
results = list(process.map(files))

Scheduled ETL

python
@app.function(
    schedule=modal.Cron("0 6 * * *"),
    secrets=[modal.Secret.from_name("db")]
)
def daily_etl():
    extract()
    transform()
    load()

Quick Reference

| Task | Code |
|---|---|
| Create app | app = modal.App("name") |
| Basic function | @app.function() |
| With GPU | @app.function(gpu="A100") |
| With image | @app.function(image=img) |
| Web endpoint | @modal.asgi_app() |
| Scheduled | schedule=modal.Cron("...") |
| Mount volume | volumes={"/path": vol} |
| Use secret | secrets=[modal.Secret.from_name("x")] |
| Parallel map | func.map(items) |
| Async spawn | func.spawn(arg) |
| Class pattern | @app.cls() with @modal.enter() |