turbo-builder

Pipeline Builder

Boundaries

  • Build NEW pipelines. Do not diagnose broken pipelines; that belongs to /turbo-doctor.
  • Do not serve as a YAML reference. If the user only needs to look up a field or syntax, use the /turbo-pipelines skill instead.
  • For dataset lookups, use /datasets.

Walk the user through building a complete pipeline from scratch, step by step. Generate a valid YAML configuration, validate it, and deploy it.

Builder Workflow

Step 1: Verify Authentication

Run `goldsky project list 2>&1` to check login status.
  • If logged in: Note the current project and continue.
  • If not logged in: Use the /auth-setup skill for guidance.

Step 2: Understand the Goal

Ask the user what they want to index. Good questions:
  • What blockchain/chain? (Ethereum, Base, Polygon, Solana, etc.)
  • What data? (transfers, swaps, events from a specific contract, all transactions, etc.)
  • Where should the data go? (PostgreSQL, ClickHouse, Kafka, S3, etc.)
  • Do they need transforms? (filtering, aggregation, enrichment)
  • One-time backfill or continuous streaming?
If the user already described their goal, extract answers from their description.

Step 3: Choose the Dataset

Use the /datasets skill to find the right dataset.
Key points:
  • Common datasets: `<chain>.decoded_logs`, `<chain>.raw_transactions`, `<chain>.erc20_transfers`, `<chain>.raw_traces`
  • For decoded contract events, use `<chain>.decoded_logs` with a filter on `address` and `topic0`
  • For Solana: use `solana.transactions`, `solana.token_transfers`, etc.
Present the dataset choice to the user for confirmation.

Step 4: Configure the Source

Build the source section of the YAML:

```yaml
sources:
  my_source:
    type: dataset
    dataset_name: <chain>.<dataset>
    version: 1.0.0
    start_at: earliest  # or a specific block number
```

Ask about:
  • Start block: `earliest` (from genesis), `latest` (from now), or a specific block number
  • End block: Only for job-mode/backfill pipelines. Omit for streaming.
  • Source-level filter: Optional filter to reduce data at the source (e.g., a specific contract address)
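As a sketch, a source scoped to a single contract might look like the fragment below. The `filter` key and its expression syntax are assumptions, not confirmed fields; verify the real key and expression form with the /turbo-pipelines skill before using:

```yaml
sources:
  my_source:
    type: dataset
    dataset_name: base.decoded_logs
    version: 1.0.0
    start_at: earliest
    # Assumed source-level filter syntax; confirm the actual key and form.
    filter: address = '0x1234...'   # hypothetical contract address
```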

Step 5: Configure Transforms (if needed)

If the user needs transforms, use the /turbo-transforms skill to help:
  • SQL transforms: filter, aggregate, join, or reshape data using DataFusion SQL
  • TypeScript transforms: custom logic, external API calls, complex processing
  • Dynamic tables: join with a PostgreSQL table or in-memory allowlist
Build the transforms section:

```yaml
transforms:
  my_transform:
    type: sql
    primary_key: id
    sql: |
      SELECT * FROM my_source
      WHERE <conditions>
```
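For instance, a SQL transform that keeps only large transfers might be sketched as follows. The column names (`amount`, `from_address`, `to_address`) are assumptions about the dataset schema; check the actual columns with the /datasets skill:

```yaml
transforms:
  large_transfers:
    type: sql
    primary_key: id
    sql: |
      -- Column names below are assumed; confirm against the dataset schema.
      SELECT id, from_address, to_address, amount
      FROM my_source
      WHERE amount > 1000000
```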

Step 6: Configure the Sink

Ask where the data should go. Use the /turbo-pipelines skill for sink configuration:

| Sink       | Key config                                      |
|------------|-------------------------------------------------|
| PostgreSQL | `secret_name`, `schema`, `table`, `primary_key` |
| ClickHouse | `secret_name`, `table`, `order_by`              |
| Kafka      | `secret_name`, `topic`                          |
| S3         | `bucket`, `region`, `prefix`, `format`          |
| Webhook    | `url`, `format`                                 |

For sinks requiring `secret_name`, check if the secret exists:

```bash
goldsky secret list
```

If it doesn't exist, help create it using the /secrets skill.
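Using the PostgreSQL keys listed above, a sink section might be sketched as follows. The `type` value and the overall `sinks` shape are assumptions; confirm them with the /turbo-pipelines skill:

```yaml
sinks:
  my_sink:
    type: postgres            # assumed type identifier
    secret_name: MY_PG_SECRET # must exist in `goldsky secret list`
    schema: public
    table: erc20_transfers
    primary_key: id
```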

Step 7: Choose Mode

Use the /turbo-pipelines skill for guidance:
  • Streaming (default): continuous processing, no `end_block`, runs indefinitely
  • Job mode: one-time backfill; set `job: true` and `end_block`
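A minimal job-mode sketch, assuming `job: true` sits at the pipeline's top level and `end_block` on the source (both placements are assumptions; confirm with the /turbo-pipelines skill):

```yaml
job: true                 # run once, then stop
sources:
  my_source:
    type: dataset
    dataset_name: base.erc20_transfers
    version: 1.0.0
    start_at: earliest
    end_block: 20000000   # assumed placement; backfill stops here
```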

Step 8: Generate, Validate, and Present

Assemble the complete pipeline YAML. Use a descriptive name following the convention `<chain>-<data>-<sink>` (e.g., `base-erc20-transfers-postgres`).
  1. Write the YAML file to disk (e.g., `<pipeline-name>.yaml`).
  2. Run validation BEFORE showing the YAML to the user:

     ```bash
     goldsky turbo validate -f <pipeline-name>.yaml
     ```

  3. If validation fails, fix the issues and re-validate. Do NOT present the YAML until validation passes. Common fixes:
     • Missing `version` field on the dataset source
     • Invalid dataset name (check the chain prefix)
     • Missing `secret_name` for database sinks
     • SQL syntax errors in transforms
  4. Once validation passes, present the full YAML to the user for review.
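Putting the steps together, an assembled pipeline could look like the sketch below. The `name` field, the sink `type`, and the sink's `from` wiring are assumptions; the dataset and column names are illustrative:

```yaml
name: base-erc20-transfers-postgres
sources:
  my_source:
    type: dataset
    dataset_name: base.erc20_transfers
    version: 1.0.0
    start_at: earliest
transforms:
  my_transform:
    type: sql
    primary_key: id
    sql: |
      SELECT * FROM my_source
      WHERE amount > 0      -- column name is assumed
sinks:
  my_sink:
    type: postgres          # assumed type identifier
    from: my_transform      # assumed field linking the sink to the transform
    secret_name: MY_PG_SECRET
    schema: public
    table: erc20_transfers
    primary_key: id
```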

Step 9: Deploy

After the user confirms the YAML looks good:

```bash
goldsky turbo apply <pipeline-name>.yaml
```

Step 10: Verify

After deployment:

```bash
goldsky turbo list
```

Suggest running inspect to verify data flow:

```bash
goldsky turbo inspect <pipeline-name>
```

Present a summary:

Pipeline Deployed

Name: [name]
Chain: [chain]
Dataset: [dataset]
Sink: [sink type]
Mode: [streaming/job]

Next steps:
  • Monitor with `goldsky turbo inspect <name>`
  • Check logs with `goldsky turbo logs <name>`
  • Use /turbo-doctor if you run into issues

Important Rules

  • Always validate before presenting complete YAML to the user. Never show unvalidated complete pipeline YAML.
  • Always validate before deploying.
  • Always show the user the complete YAML before deploying.
  • For job-mode pipelines, remind the user that they auto-clean up about an hour after completion.
  • Use the `blackhole` sink for testing pipelines without writing to a real destination.
  • If the user wants to modify an existing pipeline, check whether it is streaming (update in place) or job-mode (must delete first).
  • Default to `start_at: earliest` unless the user specifies otherwise.
  • Always include `version: 1.0.0` on dataset sources.
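For testing, a blackhole sink might be sketched like this. Only the `blackhole` name comes from the rules above; the exact config shape is an assumption to confirm with the /turbo-pipelines skill:

```yaml
sinks:
  test_sink:
    type: blackhole   # discards output; swap in a real sink once validated
```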

Related

  • /turbo-pipelines: YAML configuration and architecture reference
  • /turbo-doctor: Diagnose and fix pipeline issues
  • /turbo-operations: Lifecycle commands and monitoring reference
  • /turbo-transforms: SQL and TypeScript transform reference
  • /datasets: Dataset names and chain prefixes
  • /secrets: Sink credential management