turbo-pipelines
Turbo Pipeline Configuration Reference
YAML configuration reference for Turbo pipelines. This is a lookup reference — for interactive pipeline building, use `/turbo-builder`. For pipeline troubleshooting, use `/turbo-doctor`.

CRITICAL: Always validate YAML with `goldsky turbo validate <file.yaml>` before showing complete pipeline YAML to the user or deploying.
Quick Start
Deploy a minimal pipeline:
```yaml
name: my-first-pipeline
resource_size: s
sources:
  transfers:
    type: dataset
    dataset_name: base.erc20_transfers
    version: 1.2.0
    start_at: latest
transforms: {}
sinks:
  output:
    type: blackhole
    from: transfers
```
Validate first:
```bash
goldsky turbo validate pipeline.yaml
```
Then deploy:
```bash
goldsky turbo apply pipeline.yaml -i
```

Prerequisites
- Goldsky CLI installed - `curl https://goldsky.com | sh`
- Turbo CLI extension installed (SEPARATE binary!) - `curl https://install-turbo.goldsky.com | sh`
  - Note: Run `goldsky turbo list` - if you see "The turbo binary is not installed", install it first
- Logged in - `goldsky login`
- Pipeline YAML file ready
- Secrets created for sinks (if using PostgreSQL, ClickHouse, Kafka, etc.)
Discovering Available Data Sources
For dataset discovery, invoke the `datasets` skill.

Quick reference for common datasets:

| What They Want | Dataset to Use |
|---|---|
| Token transfers (fungible) | `erc20_transfers` |
| NFT transfers | `erc721_transfers` |
| All contract events | `raw_logs` |
| Block data | `blocks` |
| Transaction data | `transactions` |

For full chain prefixes, dataset types, and version discovery, use `/datasets`.
Quick Reference
Installation Commands
| Action | Command |
|---|---|
| Install Goldsky CLI | `curl https://goldsky.com \| sh` |
| Install Turbo extension | `curl https://install-turbo.goldsky.com \| sh` |
| Verify Turbo installed | `goldsky turbo list` |
Pipeline Commands
| Action | Command |
|---|---|
| List datasets | |
| Validate (REQUIRED) | `goldsky turbo validate <file.yaml>` |
| Deploy/Update | `goldsky turbo apply <file.yaml>` |
| Deploy + Inspect | `goldsky turbo apply <file.yaml> -i` |
| List pipelines | `goldsky turbo list` |
| View live data | |
| Inspect node | |
| View logs | |
| Follow logs | |
| List secrets | |
For pause, resume, restart, and delete commands, see `/turbo-lifecycle`.
Configuration Reference
Pipeline Structure
Every Turbo pipeline YAML has this structure:
```yaml
name: my-pipeline             # Required: unique identifier
resource_size: s              # Required: s, m, or l
description: "Optional desc"  # Optional: what the pipeline does
sources:
  source_name:                # Define data inputs
    type: dataset
    # ... source config
transforms:                   # Optional: process data
  transform_name:
    type: sql
    # ... transform config
sinks:
  sink_name:                  # Define data outputs
    type: postgres
    # ... sink config
```
Top-Level Fields
| Field | Required | Description |
|---|---|---|
| `name` | Yes | Unique pipeline identifier (lowercase, hyphens) |
| `resource_size` | Yes | Worker allocation: `s`, `m`, or `l` |
| `description` | No | Human-readable description |
| `job` | No | Set to `true` for one-time batch jobs (see Job Mode) |
| `sources` | Yes | Data input definitions |
| `transforms` | No | Data processing definitions |
| `sinks` | Yes | Data output definitions |
Job Mode
Set `job: true` for one-time batch processing (historical backfills, data exports):

```yaml
name: backfill-usdc-history
resource_size: l
job: true
sources:
  logs:
    type: dataset
    dataset_name: ethereum.raw_logs
    version: 1.0.0
    start_at: earliest
    end_block: 19000000
    filter: >-
      address = '0xa0b86991c6218b36c1d19d4a2e9eb0ce3606eb48'
transforms: {}
sinks:
  output:
    type: s3_sink
    from: logs
    endpoint: https://s3.amazonaws.com
    bucket: my-backfill-bucket
    prefix: usdc/
    secret_name: MY_S3
```

Job mode rules:
- Runs to completion and auto-cleans up ~1 hour after finishing
- Must `goldsky turbo delete` before redeploying — cannot update in-place
- Cannot use `restart` — use delete + apply instead
- Use `end_block` to bound the range (otherwise processes to chain tip and stops)
- Best with `resource_size: l` for faster backfills
For architecture guidance on when to use job vs streaming mode, see `/turbo-architecture`.
Resource Sizes
| Size | Workers | Use Case |
|---|---|---|
| `s` | 1 | Testing, low-volume data |
| `m` | 2 | Production, moderate volume |
| `l` | 4 | High-volume, multi-chain pipelines |
Source Configuration
Dataset Source
```yaml
sources:
  my_source:
    type: dataset
    dataset_name: <chain>.<dataset_type>
    version: <version>
    start_at: latest | earliest   # EVM chains
    # OR
    start_block: <slot_number>    # Solana only
```
Source Fields
| Field | Required | Description |
|---|---|---|
| `type` | Yes | `dataset` |
| `dataset_name` | Yes | Format: `<chain>.<dataset_type>` |
| `version` | Yes | Dataset version (e.g., `1.0.0`) |
| `start_at` | EVM | `latest` or `earliest` |
| `start_block` | Solana | Specific slot number (omit for latest) |
| `end_block` | No | Stop processing at this block (for bounded backfills) |
| `filter` | No | SQL WHERE clause to pre-filter at source level (efficient) |
Source-Level Filtering
Use `filter` to reduce data volume before it reaches transforms. This is significantly more efficient than filtering in SQL transforms because it eliminates data at the ingestion layer:

```yaml
sources:
  usdc_logs:
    type: dataset
    dataset_name: base.raw_logs
    version: 1.0.0
    start_at: earliest
    filter: >-
      address = lower('0x833589fCD6eDb6E08f4c7C32D4f71b54bdA02913')
      AND block_number >= 10000000
```

Best practices:
- Use `filter` for contract addresses and block ranges (coarse pre-filtering)
- Use transform `WHERE` clauses for event types, parameter values, exclusions (fine-grained)
- `filter` uses standard SQL WHERE syntax (same as DataFusion)
- Combine `filter` + `start_at: earliest` with `end_block` for precise bounded backfills
Chains and Dataset Types
For the full list of chains, prefixes, and dataset types, see `/datasets`. Key points:
- EVM chains: `ethereum`, `base`, `polygon` (Polygon — not `matic`), `arbitrum`, `optimism`, `bsc`, `avalanche`
- Non-EVM: `solana` (uses `start_block`, not `start_at`), `bitcoin.raw`, `stellar_mainnet`, `sui`, `near`, `starknet`, `fogo`
- EVM dataset types: `raw_logs`, `transactions` (not `raw_transactions`), `blocks`, `raw_traces`, `erc20_transfers`, `erc721_transfers`, `decoded_logs`
Transform Configuration
Transform Types
| Type | Use Case |
|---|---|
| `sql` | Filtering, projections, SQL functions |
| `script` | Custom TypeScript/WASM logic |
| `handler` | Call external HTTP APIs to enrich data |
| `dynamic_table` | Lookup tables backed by a database |
SQL Transform
Most common transform type:

```yaml
transforms:
  filtered:
    type: sql
    primary_key: id
    sql: |
      SELECT
        id,
        sender,
        recipient,
        amount
      FROM source_name
      WHERE amount > 1000
```

| Field | Required | Description |
|---|---|---|
| `type` | Yes | `sql` |
| `primary_key` | Yes | Column for uniqueness/ordering |
| `sql` | Yes | SQL query (reference sources by name) |
| `from` | No | Override default source (for chaining) |
TypeScript Transform
For complex logic that SQL can't handle (runs in a WASM sandbox):

```yaml
transforms:
  custom:
    type: script
    primary_key: id
    language: typescript
    from: source_name
    schema:
      id: string
      sender: string
      amount: string
      processed_at: string
    script: |
      function invoke(data) {
        if (data.amount < 1000) return null; // Filter out
        return {
          id: data.id,
          sender: data.sender,
          amount: data.amount,
          processed_at: new Date().toISOString()
        };
      }
```

For full TypeScript transform documentation, schema types, and examples, see `/turbo-transforms`.
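Since the script body is plain TypeScript, you can sanity-check the `invoke` function locally before pasting it into the YAML. A minimal sketch (note: the schema declares `amount` as a string, so this version coerces with `Number()` before comparing):

```typescript
// Local sanity check for the invoke() body. Run with ts-node or tsc + node.
type In = { id: string; sender: string; amount: string };

function invoke(data: In) {
  if (Number(data.amount) < 1000) return null; // filter out small rows
  return {
    id: data.id,
    sender: data.sender,
    amount: data.amount,
    processed_at: new Date().toISOString(),
  };
}
```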
Dynamic Table Transform
Updatable lookup tables for runtime filtering (allowlists, blocklists, enrichment):

```yaml
transforms:
  tracked_wallets:
    type: dynamic_table
    backend_type: Postgres   # or: InMemory
    backend_entity_name: tracked_wallets
    secret_name: MY_DB       # required for Postgres
```

Use with `dynamic_table_check()` in SQL transforms:

```sql
WHERE dynamic_table_check('tracked_wallets', sender)
```

For full dynamic table documentation, backend options, and examples, see `/turbo-transforms`.
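Conceptually, `dynamic_table_check` is a membership test against the backing table, which you can update at runtime without redeploying the pipeline. A TypeScript sketch of the semantics (illustrative; the real check queries the Postgres or in-memory backend):

```typescript
// The Set stands in for rows of the backing table.
const trackedWallets = new Set<string>(["0xabc", "0xdef"]);

const dynamicTableCheck = (table: Set<string>, key: string): boolean =>
  table.has(key);
```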
Handler Transform
Call external HTTP APIs to enrich data:

```yaml
transforms:
  enriched:
    type: handler
    primary_key: id
    from: my_source
    url: https://my-api.example.com/enrich
    headers:
      Authorization: Bearer my-token
    batch_size: 100
    timeout_ms: 5000
```

For full handler transform documentation, see `/turbo-transforms`.
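The endpoint receives rows in batches (up to `batch_size`) and returns enriched rows. The exact request/response contract is documented in `/turbo-transforms`; this TypeScript sketch only illustrates the batch-in, batch-out shape, and `MOCK_PRICE_USD` is a made-up stand-in for whatever lookup your endpoint performs:

```typescript
// Hypothetical enrichment logic that would sit behind the handler URL.
type Row = { id: string; sender: string; amount: string };
type Enriched = Row & { amount_usd: number };

const MOCK_PRICE_USD = 1.0; // stand-in for a real price lookup

function enrichBatch(rows: Row[]): Enriched[] {
  return rows.map((row) => ({
    ...row,
    amount_usd: Number(row.amount) * MOCK_PRICE_USD,
  }));
}
```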
Transform Chaining
Chain transforms using `from`:

```yaml
transforms:
  step1:
    type: sql
    primary_key: id
    sql: SELECT * FROM source WHERE amount > 100
  step2:
    type: sql
    primary_key: id
    from: step1
    sql: SELECT *, 'processed' as status FROM step1
```
Sink Configuration
Common Sink Fields
| Field | Required | Description |
|---|---|---|
| `type` | Yes | Sink type |
| `from` | Yes | Source or transform to read from |
| `secret_name` | Varies | Secret for credentials (most sinks) |
| `primary_key` | Varies | Column for upserts (database sinks) |
Blackhole Sink (Testing)
```yaml
sinks:
  test_output:
    type: blackhole
    from: my_transform
```
PostgreSQL Sink
```yaml
sinks:
  postgres_output:
    type: postgres
    from: my_transform
    schema: public
    table: my_table
    secret_name: MY_POSTGRES_SECRET
    primary_key: id
```

Secret format: PostgreSQL connection string: `postgres://username:password@host:port/database`
sinks:
postgres_output:
type: postgres
from: my_transform
schema: public
table: my_table
secret_name: MY_POSTGRES_SECRET
primary_key: id密钥格式: PostgreSQL连接字符串:
postgres://username:password@host:port/databasePostgreSQL Aggregate Sink
Real-time aggregations in PostgreSQL using database triggers. Data flows into a landing table, and a trigger maintains aggregated values in a separate table.

```yaml
sinks:
  balances:
    type: postgres_aggregate
    from: transfers
    schema: public
    landing_table: transfer_log
    agg_table: account_balances
    primary_key: transfer_id
    secret_name: MY_POSTGRES
    group_by:
      account:
        type: text
    aggregate:
      balance:
        from: amount
        fn: sum
```

Supported aggregation functions: `sum`, `count`, `avg`, `min`, `max`
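What the trigger maintains in `agg_table` is equivalent to a grouped fold over the landed rows. A TypeScript sketch of the `fn: sum` over `group_by: account` semantics from the example above:

```typescript
// One entry per group_by key; balance = sum(amount) over all landed rows.
type Transfer = { transfer_id: string; account: string; amount: number };

function aggregateBalances(rows: Transfer[]): Map<string, number> {
  const balances = new Map<string, number>();
  for (const { account, amount } of rows) {
    balances.set(account, (balances.get(account) ?? 0) + amount);
  }
  return balances;
}
```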
ClickHouse Sink
```yaml
sinks:
  clickhouse_output:
    type: clickhouse
    from: my_transform
    table: my_table
    secret_name: MY_CLICKHOUSE_SECRET
    primary_key: id
```

Secret format: ClickHouse connection string: `https://username:password@host:port/database`
sinks:
clickhouse_output:
type: clickhouse
from: my_transform
table: my_table
secret_name: MY_CLICKHOUSE_SECRET
primary_key: id密钥格式: ClickHouse连接字符串:
https://username:password@host:port/databaseKafka Sink
```yaml
sinks:
  kafka_output:
    type: kafka
    from: my_transform
    topic: my-topic
    topic_partitions: 10
    data_format: avro   # or: json
    schema_registry_url: http://schema-registry:8081   # required for avro
```
sinks:
kafka_output:
type: kafka
from: my_transform
topic: my-topic
topic_partitions: 10
data_format: avro # 或:json
schema_registry_url: http://schema-registry:8081 # 使用avro时必填Webhook Sink
Note: Turbo webhook sinks do not support Goldsky's native secrets management. Include auth headers directly in the pipeline config.

```yaml
sinks:
  webhook_output:
    type: webhook
    from: my_transform
    url: https://api.example.com/webhook
    one_row_per_request: true
    headers:
      Authorization: Bearer your-token
      Content-Type: application/json
```
S3 Sink
```yaml
sinks:
  s3_output:
    type: s3_sink
    from: my_transform
    endpoint: https://s3.amazonaws.com
    bucket: my-bucket
    prefix: data/
    secret_name: MY_S3_SECRET
```

Secret format: `access_key_id:secret_access_key` (or `access_key_id:secret_access_key:session_token` for temporary credentials)
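The secret is a simple colon-delimited string; a TypeScript sketch of how the parts decompose (illustrative only — the pipeline parses this for you):

```typescript
// session_token is present only for temporary credentials.
function parseS3Secret(secret: string) {
  const [access_key_id, secret_access_key, session_token] = secret.split(":");
  return { access_key_id, secret_access_key, session_token };
}
```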
sinks:
s3_output:
type: s3_sink
from: my_transform
endpoint: https://s3.amazonaws.com
bucket: my-bucket
prefix: data/
secret_name: MY_S3_SECRET密钥格式: (临时凭证格式:)
access_key_id:secret_access_keyaccess_key_id:secret_access_key:session_tokenS2 Sink
Publish to S2.dev streams — a serverless alternative to Kafka.

```yaml
sinks:
  s2_output:
    type: s2_sink
    from: my_transform
    access_token: your_access_token
    basin: your-basin-name
    stream: your-stream-name
```
Starter Templates
Template files are available in the `templates/` folder. Copy and customize these for your pipelines.

| Template | Description | Use Case |
|---|---|---|
| `minimal-erc20-blackhole.yaml` | Simplest pipeline, no credentials | Quick testing |
| | Filter by contract address | USDC, specific tokens |
| | Write to PostgreSQL | Production data storage |
| | Combine multiple chains | Cross-chain analytics |
| | Solana SPL tokens | Non-EVM chains |
| | Multiple outputs | Archive + alerts + streaming |
To use a template:

```bash
# Copy template to your project
cp templates/minimal-erc20-blackhole.yaml my-pipeline.yaml

# Customize as needed, then validate
goldsky turbo validate my-pipeline.yaml

# Deploy
goldsky turbo apply my-pipeline.yaml -i
```

**Template location:** `templates/` (relative to this skill's directory)
Common Update Patterns
Adding a SQL Transform
Before:

```yaml
transforms: {}
sinks:
  output:
    type: blackhole
    from: transfers
```

After:

```yaml
transforms:
  filtered:
    type: sql
    primary_key: id
    sql: |
      SELECT * FROM transfers WHERE amount > 1000000
sinks:
  output:
    type: blackhole
    from: filtered   # Changed from 'transfers'
```
Adding a PostgreSQL Sink
```yaml
sinks:
  existing_sink:
    type: blackhole
    from: my_transform
  # Add new sink
  postgres_output:
    type: postgres
    from: my_transform
    schema: public
    table: my_data
    secret_name: MY_POSTGRES_SECRET
    primary_key: id
```
sinks:
existing_sink:
type: blackhole
from: my_transform
# 添加新接收器
postgres_output:
type: postgres
from: my_transform
schema: public
table: my_data
secret_name: MY_POSTGRES_SECRET
primary_key: idChanging Resource Size
```yaml
resource_size: m   # was: s
```
Adding a New Source
```yaml
sources:
  eth_transfers:
    type: dataset
    dataset_name: ethereum.erc20_transfers
    version: 1.0.0
    start_at: latest
  # Add new source
  base_transfers:
    type: dataset
    dataset_name: base.erc20_transfers
    version: 1.2.0
    start_at: latest
```
sources:
eth_transfers:
type: dataset
dataset_name: ethereum.erc20_transfers
version: 1.0.0
start_at: latest
# 添加新数据源
base_transfers:
type: dataset
dataset_name: base.erc20_transfers
version: 1.2.0
start_at: latestCheckpoint Behavior
Understanding Checkpoints
When you update a pipeline:
- Checkpoints are preserved by default - Processing continues from where it left off
- Source checkpoints are tied to source names - Renaming a source resets its checkpoint
- Pipeline checkpoints are tied to pipeline names - Renaming the pipeline resets all checkpoints
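A way to picture this: checkpoints are stored under a key derived from the pipeline and source names, so renaming either produces a lookup miss and processing starts over. A TypeScript sketch (the storage shape is an assumption, for intuition only):

```typescript
// ASSUMPTION: illustrative storage shape. The point is only that the key
// includes both names, so a rename misses the old checkpoint.
const checkpoints = new Map<string, number>(); // key -> last processed block
const key = (pipeline: string, source: string) => `${pipeline}/${source}`;

checkpoints.set(key("my-pipeline", "transfers"), 19_000_000);

const resumed = checkpoints.get(key("my-pipeline", "transfers"));        // hit: resume
const afterRename = checkpoints.get(key("my-pipeline", "transfers_v2")); // miss: start over
```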
Resetting Checkpoints
Option 1: Rename the source

```yaml
sources:
  transfers_v2:   # Changed from 'transfers'
    type: dataset
    dataset_name: base.erc20_transfers
    version: 1.2.0
    start_at: earliest   # Will process from beginning
```

Option 2: Rename the pipeline

```yaml
name: my-pipeline-v2   # Changed from 'my-pipeline'
```

Warning: Resetting checkpoints means reprocessing all historical data.
Troubleshooting
See `references/troubleshooting.md` for:
- CLI hanging / Turbo binary not found fixes
- Common validation errors (unknown dataset, missing primary_key, bad source reference)
- Common runtime errors (auth failed, connection refused, Neon size limit)
- Quick troubleshooting table

Also see `/turbo-monitor-debug` for error patterns and log analysis.
Related
- `/turbo-builder` — Interactive wizard to build pipelines step-by-step
- `/turbo-doctor` — Diagnose and fix pipeline issues
- `/datasets` — Dataset names and chain prefixes