turbo-pipelines

Turbo Pipeline Configuration Reference

YAML configuration reference for Turbo pipelines. This is a lookup reference: for interactive pipeline building, use `/turbo-builder`; for pipeline troubleshooting, use `/turbo-doctor`.

CRITICAL: Always validate YAML with `goldsky turbo validate <file.yaml>` before showing complete pipeline YAML to the user or deploying.


Quick Start

Deploy a minimal pipeline:

```yaml
name: my-first-pipeline
resource_size: s
sources:
  transfers:
    type: dataset
    dataset_name: base.erc20_transfers
    version: 1.2.0
    start_at: latest
transforms: {}
sinks:
  output:
    type: blackhole
    from: transfers
```

Validate first:

```bash
goldsky turbo validate pipeline.yaml
```

Then deploy:

```bash
goldsky turbo apply pipeline.yaml -i
```

---

Prerequisites

  • Goldsky CLI installed: `curl https://goldsky.com | sh`
  • Turbo CLI extension installed (SEPARATE binary!): `curl https://install-turbo.goldsky.com | sh`
    • Note: Run `goldsky turbo list`; if you see "The turbo binary is not installed", install it first
  • Logged in: `goldsky login`
  • Pipeline YAML file ready
  • Secrets created for sinks (if using PostgreSQL, ClickHouse, Kafka, etc.)
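Secrets are created once with the Goldsky CLI and then referenced by `secret_name` in sink configs. A hedged sketch only; the exact subcommand and flag names are an assumption here, so confirm with `goldsky secret create --help` before relying on it:

```bash
# Hypothetical invocation; verify flags with `goldsky secret create --help`
goldsky secret create MY_POSTGRES_SECRET \
  --value 'postgres://username:password@host:5432/database'
```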

Discovering Available Data Sources

For dataset discovery, invoke the `datasets` skill.

Quick reference for common datasets:

| What They Want | Dataset to Use |
|---|---|
| Token transfers (fungible) | `<chain>.erc20_transfers` |
| NFT transfers | `<chain>.erc721_transfers` |
| All contract events | `<chain>.logs` |
| Block data | `<chain>.blocks` |
| Transaction data | `<chain>.transactions` |

For full chain prefixes, dataset types, and version discovery, use `/datasets`.


Quick Reference

Installation Commands

| Action | Command |
|---|---|
| Install Goldsky CLI | `curl https://goldsky.com \| sh` |
| Install Turbo extension | `curl https://install-turbo.goldsky.com \| sh` |
| Verify Turbo installed | `goldsky turbo list` |

Pipeline Commands

| Action | Command | Notes |
|---|---|---|
| List datasets | `goldsky dataset list` | ⚠️ Slow (30-60s) |
| Validate (REQUIRED) | `goldsky turbo validate pipeline.yaml` | Fast (3s) |
| Deploy/Update | `goldsky turbo apply pipeline.yaml` | |
| Deploy + Inspect | `goldsky turbo apply pipeline.yaml -i` | |
| List pipelines | `goldsky turbo list` | |
| View live data | `goldsky turbo inspect <name>` | |
| Inspect node | `goldsky turbo inspect <name> -n <node>` | |
| View logs | `goldsky turbo logs <name>` | |
| Follow logs | `goldsky turbo logs <name> --follow` | |
| List secrets | `goldsky secret list` | |

For pause, resume, restart, and delete commands, see `/turbo-lifecycle`.


Configuration Reference

Pipeline Structure

Every Turbo pipeline YAML has this structure:

```yaml
name: my-pipeline # Required: unique identifier
resource_size: s # Required: s, m, or l
description: "Optional desc" # Optional: what the pipeline does

sources:
  source_name: # Define data inputs
    type: dataset
    # ... source config

transforms: # Optional: process data
  transform_name:
    type: sql
    # ... transform config

sinks:
  sink_name: # Define data outputs
    type: postgres
    # ... sink config
```

Top-Level Fields

| Field | Required | Description |
|---|---|---|
| `name` | Yes | Unique pipeline identifier (lowercase, hyphens) |
| `resource_size` | Yes | Worker allocation: `s`, `m`, or `l` |
| `description` | No | Human-readable description |
| `job` | No | `true` for one-time batch jobs (default: `false` = streaming) |
| `sources` | Yes | Data input definitions |
| `transforms` | No | Data processing definitions |
| `sinks` | Yes | Data output definitions |

Job Mode

Set `job: true` for one-time batch processing (historical backfills, data exports):

```yaml
name: backfill-usdc-history
resource_size: l
job: true

sources:
  logs:
    type: dataset
    dataset_name: ethereum.raw_logs
    version: 1.0.0
    start_at: earliest
    end_block: 19000000
    filter: >-
      address = '0xa0b86991c6218b36c1d19d4a2e9eb0ce3606eb48'
transforms: {}
sinks:
  output:
    type: s3_sink
    from: logs
    endpoint: https://s3.amazonaws.com
    bucket: my-backfill-bucket
    prefix: usdc/
    secret_name: MY_S3
```

Job mode rules:
  • Runs to completion and auto-cleans up ~1 hour after finishing
  • Must `goldsky turbo delete` before redeploying; cannot update in-place
  • Cannot use `restart`; use delete + apply instead
  • Use `end_block` to bound the range (otherwise processes to chain tip and stops)
  • Best with `resource_size: l` for faster backfills

For architecture guidance on when to use job vs streaming mode, see `/turbo-architecture`.
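Because a job pipeline cannot be updated in place, a redeploy is a delete followed by an apply. A sketch using the pipeline above (the filename `backfill-usdc-history.yaml` is illustrative):

```bash
goldsky turbo delete backfill-usdc-history
goldsky turbo apply backfill-usdc-history.yaml -i
```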

Resource Sizes

| Size | Workers | Use Case |
|---|---|---|
| `s` | 1 | Testing, low-volume data |
| `m` | 2 | Production, moderate volume |
| `l` | 4 | High-volume, multi-chain pipelines |


Source Configuration

Dataset Source

```yaml
sources:
  my_source:
    type: dataset
    dataset_name: <chain>.<dataset_type>
    version: <version>
    start_at: latest | earliest # EVM chains
    # OR
    start_block: <slot_number> # Solana only
```

Source Fields

| Field | Required | Description |
|---|---|---|
| `type` | Yes | `dataset` for blockchain data |
| `dataset_name` | Yes | Format: `<chain>.<dataset_type>` |
| `version` | Yes | Dataset version (e.g., `1.2.0`) |
| `start_at` | EVM | `latest` or `earliest` |
| `start_block` | Solana | Specific slot number (omit for latest) |
| `end_block` | No | Stop processing at this block (for bounded backfills) |
| `filter` | No | SQL WHERE clause to pre-filter at source level (efficient) |

Source-Level Filtering

Use `filter` to reduce data volume before it reaches transforms. This is significantly more efficient than filtering in SQL transforms because it eliminates data at the ingestion layer:

```yaml
sources:
  usdc_logs:
    type: dataset
    dataset_name: base.raw_logs
    version: 1.0.0
    start_at: earliest
    filter: >-
      address = lower('0x833589fCD6eDb6E08f4c7C32D4f71b54bdA02913')
      AND block_number >= 10000000
```

Best practices:
  • Use `filter` for contract addresses and block ranges (coarse pre-filtering)
  • Use transform `WHERE` for event types, parameter values, exclusions (fine-grained)
  • `filter` uses standard SQL WHERE syntax (same as DataFusion)
  • Combine `filter` with `start_at: earliest` + `end_block` for precise bounded backfills

Chains and Dataset Types

For the full list of chains, prefixes, and dataset types, see `/datasets`. Key points:
  • EVM chains: `ethereum`, `base`, `matic` (Polygon, not `polygon`), `arbitrum`, `optimism`, `bsc`, `avalanche`
  • Non-EVM: `solana` (uses `start_block`, not `start_at`), `bitcoin.raw`, `stellar_mainnet`, `sui`, `near`, `starknet`, `fogo`
  • EVM dataset types: `raw_logs`, `raw_transactions` (not `transactions`), `blocks`, `raw_traces`, `erc20_transfers`, `erc721_transfers`, `decoded_logs`


Transform Configuration

Transform Types

| Type | Use Case |
|---|---|
| `sql` | Filtering, projections, SQL functions |
| `script` | Custom TypeScript/WASM logic |
| `handler` | Call external HTTP APIs to enrich data |
| `dynamic_table` | Lookup tables backed by a database |

SQL Transform

Most common transform type:

```yaml
transforms:
  filtered:
    type: sql
    primary_key: id
    sql: |
      SELECT
        id,
        sender,
        recipient,
        amount
      FROM source_name
      WHERE amount > 1000
```

| Field | Required | Description |
|---|---|---|
| `type` | Yes | `sql` |
| `primary_key` | Yes | Column for uniqueness/ordering |
| `sql` | Yes | SQL query (reference sources by name) |
| `from` | No | Override default source (for chaining) |

TypeScript Transform

For complex logic that SQL can't handle (runs in a WASM sandbox):

```yaml
transforms:
  custom:
    type: script
    primary_key: id
    language: typescript
    from: source_name
    schema:
      id: string
      sender: string
      amount: string
      processed_at: string
    script: |
      function invoke(data) {
        if (data.amount < 1000) return null;  // Filter out
        return {
          id: data.id,
          sender: data.sender,
          amount: data.amount,
          processed_at: new Date().toISOString()
        };
      }
```

For full TypeScript transform documentation, schema types, and examples, see `/turbo-transforms`.
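Because `invoke` is plain logic, it can be sanity-checked locally in Node.js before deploying, outside the Goldsky runtime. A sketch (the sample records are made up):

```javascript
// The invoke() transform from the pipeline above, exercised locally.
function invoke(data) {
  if (data.amount < 1000) return null; // filter out small transfers
  return {
    id: data.id,
    sender: data.sender,
    amount: data.amount,
    processed_at: new Date().toISOString(),
  };
}

const kept = invoke({ id: "1", sender: "0xabc", amount: 5000 });
const dropped = invoke({ id: "2", sender: "0xdef", amount: 10 });
console.log(kept.id, dropped === null); // → 1 true
```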

Dynamic Table Transform

Updatable lookup tables for runtime filtering (allowlists, blocklists, enrichment):

```yaml
transforms:
  tracked_wallets:
    type: dynamic_table
    backend_type: Postgres        # or: InMemory
    backend_entity_name: tracked_wallets
    secret_name: MY_DB            # required for Postgres
```

Use with `dynamic_table_check()` in SQL transforms:

```sql
WHERE dynamic_table_check('tracked_wallets', sender)
```

For full dynamic table documentation, backend options, and examples, see `/turbo-transforms`.

Handler Transform

Call external HTTP APIs to enrich data:

```yaml
transforms:
  enriched:
    type: handler
    primary_key: id
    from: my_source
    url: https://my-api.example.com/enrich
    headers:
      Authorization: Bearer my-token
    batch_size: 100
    timeout_ms: 5000
```

For full handler transform documentation, see `/turbo-transforms`.

Transform Chaining

Chain transforms using `from`:

```yaml
transforms:
  step1:
    type: sql
    primary_key: id
    sql: SELECT * FROM source WHERE amount > 100

  step2:
    type: sql
    primary_key: id
    from: step1
    sql: SELECT *, 'processed' as status FROM step1
```


Sink Configuration

Common Sink Fields

| Field | Required | Description |
|---|---|---|
| `type` | Yes | Sink type |
| `from` | Yes | Source or transform to read from |
| `secret_name` | Varies | Secret for credentials (most sinks) |
| `primary_key` | Varies | Column for upserts (database sinks) |

Blackhole Sink (Testing)

```yaml
sinks:
  test_output:
    type: blackhole
    from: my_transform
```

PostgreSQL Sink

```yaml
sinks:
  postgres_output:
    type: postgres
    from: my_transform
    schema: public
    table: my_table
    secret_name: MY_POSTGRES_SECRET
    primary_key: id
```

Secret format: PostgreSQL connection string: `postgres://username:password@host:port/database`

PostgreSQL Aggregate Sink

Real-time aggregations in PostgreSQL using database triggers. Data flows into a landing table, and a trigger maintains aggregated values in a separate table.

```yaml
sinks:
  balances:
    type: postgres_aggregate
    from: transfers
    schema: public
    landing_table: transfer_log
    agg_table: account_balances
    primary_key: transfer_id
    secret_name: MY_POSTGRES
    group_by:
      account:
        type: text
    aggregate:
      balance:
        from: amount
        fn: sum
```

Supported aggregation functions: `sum`, `count`, `avg`, `min`, `max`
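With the configuration above, current values live in the `agg_table` and can be read with plain SQL. A sketch; the column names are an assumption derived from the `group_by` and `aggregate` keys shown above:

```sql
-- Top accounts by balance, as maintained by the sink's trigger
SELECT account, balance
FROM public.account_balances
ORDER BY balance DESC
LIMIT 10;
```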

ClickHouse Sink

```yaml
sinks:
  clickhouse_output:
    type: clickhouse
    from: my_transform
    table: my_table
    secret_name: MY_CLICKHOUSE_SECRET
    primary_key: id
```

Secret format: ClickHouse connection string: `https://username:password@host:port/database`

Kafka Sink

```yaml
sinks:
  kafka_output:
    type: kafka
    from: my_transform
    topic: my-topic
    topic_partitions: 10
    data_format: avro          # or: json
    schema_registry_url: http://schema-registry:8081  # required for avro
```

Webhook Sink

Note: Turbo webhook sinks do not support Goldsky's native secrets management. Include auth headers directly in the pipeline config.

```yaml
sinks:
  webhook_output:
    type: webhook
    from: my_transform
    url: https://api.example.com/webhook
    one_row_per_request: true
    headers:
      Authorization: Bearer your-token
      Content-Type: application/json
```

S3 Sink

```yaml
sinks:
  s3_output:
    type: s3_sink
    from: my_transform
    endpoint: https://s3.amazonaws.com
    bucket: my-bucket
    prefix: data/
    secret_name: MY_S3_SECRET
```

Secret format: `access_key_id:secret_access_key` (or `access_key_id:secret_access_key:session_token` for temporary credentials)

S2 Sink

Publish to S2.dev streams, a serverless alternative to Kafka.

```yaml
sinks:
  s2_output:
    type: s2_sink
    from: my_transform
    access_token: your_access_token
    basin: your-basin-name
    stream: your-stream-name
```


Starter Templates

Template files are available in the `templates/` folder. Copy and customize these for your pipelines.

| Template | Description | Use Case |
|---|---|---|
| `minimal-erc20-blackhole.yaml` | Simplest pipeline, no credentials | Quick testing |
| `filtered-transfers-sql.yaml` | Filter by contract address | USDC, specific tokens |
| `postgres-output.yaml` | Write to PostgreSQL | Production data storage |
| `multi-chain-pipeline.yaml` | Combine multiple chains | Cross-chain analytics |
| `solana-transfers.yaml` | Solana SPL tokens | Non-EVM chains |
| `multi-sink-pipeline.yaml` | Multiple outputs | Archive + alerts + streaming |

To use a template:

Copy template to your project:

```bash
cp templates/minimal-erc20-blackhole.yaml my-pipeline.yaml
```

Customize as needed, then validate:

```bash
goldsky turbo validate my-pipeline.yaml
```

Deploy:

```bash
goldsky turbo apply my-pipeline.yaml -i
```

**Template location:** `templates/` (relative to this skill's directory)

---

Common Update Patterns

Adding a SQL Transform

Before:

```yaml
transforms: {}
sinks:
  output:
    type: blackhole
    from: transfers
```

After:

```yaml
transforms:
  filtered:
    type: sql
    primary_key: id
    sql: |
      SELECT * FROM transfers WHERE amount > 1000000
sinks:
  output:
    type: blackhole
    from: filtered # Changed from 'transfers'
```

Adding a PostgreSQL Sink

```yaml
sinks:
  existing_sink:
    type: blackhole
    from: my_transform
  # Add new sink
  postgres_output:
    type: postgres
    from: my_transform
    schema: public
    table: my_data
    secret_name: MY_POSTGRES_SECRET
    primary_key: id
```

Changing Resource Size

```yaml
resource_size: m # was: s
```

Adding a New Source

```yaml
sources:
  eth_transfers:
    type: dataset
    dataset_name: ethereum.erc20_transfers
    version: 1.0.0
    start_at: latest
  # Add new source
  base_transfers:
    type: dataset
    dataset_name: base.erc20_transfers
    version: 1.2.0
    start_at: latest
```
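With two sources defined, a SQL transform can merge them into a single stream. A sketch; the column names (`id`, `sender`, `recipient`, `amount`) are an assumption based on the transfer examples earlier in this reference, and the chain literals are added only to disambiguate rows downstream:

```yaml
transforms:
  all_transfers:
    type: sql
    primary_key: id
    sql: |
      SELECT id, sender, recipient, amount, 'ethereum' AS chain FROM eth_transfers
      UNION ALL
      SELECT id, sender, recipient, amount, 'base' AS chain FROM base_transfers
```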


Checkpoint Behavior

Understanding Checkpoints

When you update a pipeline:
  • Checkpoints are preserved by default - Processing continues from where it left off
  • Source checkpoints are tied to source names - Renaming a source resets its checkpoint
  • Pipeline checkpoints are tied to pipeline names - Renaming the pipeline resets all checkpoints

Resetting Checkpoints

Option 1: Rename the source

```yaml
sources:
  transfers_v2: # Changed from 'transfers'
    type: dataset
    dataset_name: base.erc20_transfers
    version: 1.2.0
    start_at: earliest # Will process from beginning
```

Option 2: Rename the pipeline

```yaml
name: my-pipeline-v2 # Changed from 'my-pipeline'
```

Warning: Resetting checkpoints means reprocessing all historical data.


Troubleshooting

See `references/troubleshooting.md` for:
  • CLI hanging / Turbo binary not found fixes
  • Common validation errors (unknown dataset, missing primary_key, bad source reference)
  • Common runtime errors (auth failed, connection refused, Neon size limit)
  • Quick troubleshooting table

Also see `/turbo-monitor-debug` for error patterns and log analysis.


Related

  • `/turbo-builder`: Interactive wizard to build pipelines step-by-step
  • `/turbo-doctor`: Diagnose and fix pipeline issues
  • `/datasets`: Dataset names and chain prefixes