turbo-builder
Pipeline Builder
Boundaries
- Build NEW pipelines. Do not diagnose broken pipelines — that belongs to `/turbo-doctor`.
- Do not serve as a YAML reference. If the user only needs to look up a field or syntax, use the `/turbo-pipelines` skill instead.
- For dataset lookups, use `/datasets`.
Walk the user through building a complete pipeline from scratch, step by step. Generate a valid YAML configuration, validate it, and deploy it.
Builder Workflow
Step 1: Verify Authentication
Run `goldsky project list 2>&1` to check login status.

- If logged in: Note the current project and continue.
- If not logged in: Use the `/auth-setup` skill for guidance.
Step 2: Understand the Goal
Ask the user what they want to index. Good questions:
- What blockchain/chain? (Ethereum, Base, Polygon, Solana, etc.)
- What data? (transfers, swaps, events from a specific contract, all transactions, etc.)
- Where should the data go? (PostgreSQL, ClickHouse, Kafka, S3, etc.)
- Do they need transforms? (filtering, aggregation, enrichment)
- One-time backfill or continuous streaming?
If the user already described their goal, extract answers from their description.
Step 3: Choose the Dataset
Use the `/datasets` skill to find the right dataset.

Key points:

- Common datasets: `<chain>.decoded_logs`, `<chain>.raw_transactions`, `<chain>.erc20_transfers`, `<chain>.raw_traces`
- For decoded contract events, use `<chain>.decoded_logs` with a filter on `address` and `topic0`
- For Solana: use `solana.transactions`, `solana.token_transfers`, etc.

Present the dataset choice to the user for confirmation.
Step 4: Configure the Source
Build the source section of the YAML:

```yaml
sources:
  my_source:
    type: dataset
    dataset_name: <chain>.<dataset>
    version: 1.0.0
    start_at: earliest # or a specific block number
```

Ask about:

- Start block: `earliest` (from genesis), `latest` (from now), or a specific block number
- End block: Only for job-mode/backfill pipelines. Omit for streaming.
- Source-level filter: Optional filter to reduce data at the source (e.g., specific contract address)
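As a concrete sketch of the template above, a source reading Base ERC-20 transfers from genesis might look like this; the dataset name simply follows the `<chain>.erc20_transfers` convention from Step 3:

```yaml
# Illustrative example: adjust chain, dataset, and start_at to the user's goal.
sources:
  base_erc20:
    type: dataset
    dataset_name: base.erc20_transfers
    version: 1.0.0
    start_at: earliest
```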
Step 5: Configure Transforms (if needed)
If the user needs transforms, use the `/turbo-transforms` skill to help:

- SQL transforms — filter, aggregate, join, or reshape data using DataFusion SQL
- TypeScript transforms — custom logic, external API calls, complex processing
- Dynamic tables — join with a PostgreSQL table or in-memory allowlist

Build the transforms section:

```yaml
transforms:
  my_transform:
    type: sql
    primary_key: id
    sql: |
      SELECT * FROM my_source
      WHERE <conditions>
```
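Filling in the template above, a transform that narrows a decoded-logs source to a single contract and event (the `address` and `topic0` filter from Step 3) might look like this; the contract address and event hash are placeholders, not real values:

```yaml
# Hypothetical example: filters a decoded_logs source with DataFusion SQL.
transforms:
  filter_events:
    type: sql
    primary_key: id
    sql: |
      SELECT * FROM my_source
      WHERE address = '0x...'   -- target contract (placeholder)
        AND topic0 = '0x...'    -- event signature hash (placeholder)
```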
Step 6: Configure the Sink
Ask where the data should go. Use the `/turbo-pipelines` skill for sink configuration:

| Sink | Key config |
|---|---|
| PostgreSQL | |
| ClickHouse | |
| Kafka | |
| S3 | |
| Webhook | |

For sinks requiring `secret_name`, check if the secret exists:

```bash
goldsky secret list
```

If it doesn't exist, help create it using the `/secrets` skill.
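A minimal sink sketch, assuming a PostgreSQL destination. Only `secret_name` is confirmed by this document; the other field names (`type`, `from`, `schema`, `table`) are illustrative and should be verified against the `/turbo-pipelines` reference:

```yaml
# Illustrative sketch only — verify field names with /turbo-pipelines.
sinks:
  my_sink:
    type: postgres             # assumed sink type identifier
    from: my_transform         # assumed: upstream transform or source to read from
    secret_name: MY_PG_SECRET  # credential created via /secrets
    schema: public
    table: erc20_transfers
```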
Step 7: Choose Mode
Use the `/turbo-pipelines` skill for guidance:

- Streaming (default) — continuous processing, no `end_block`, runs indefinitely
- Job mode — one-time backfill, set `job: true` and `end_block`
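For a one-time backfill, the two job-mode settings above might combine with the source from Step 4 like this. Both placements — `job: true` at the top level and `end_block` on the source — are assumptions; confirm with `/turbo-pipelines`:

```yaml
# Illustrative job-mode sketch — confirm key placement with /turbo-pipelines.
job: true                 # assumed top-level flag: run once, then stop
sources:
  my_source:
    type: dataset
    dataset_name: <chain>.<dataset>
    version: 1.0.0
    start_at: earliest
    end_block: 18000000   # hypothetical stop block for the backfill
```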
Step 8: Generate, Validate, and Present
Assemble the complete pipeline YAML. Use a descriptive name following the convention: `<chain>-<data>-<sink>` (e.g., `base-erc20-transfers-postgres`).

- Write the YAML file to disk (e.g., `<pipeline-name>.yaml`).
- Run validation BEFORE showing the YAML to the user:

```bash
goldsky turbo validate -f <pipeline-name>.yaml
```

- If validation fails, fix the issues and re-validate. Do NOT present the YAML until validation passes. Common fixes:
  - Missing `version` field on dataset source
  - Invalid dataset name (check chain prefix)
  - Missing `secret_name` for database sinks
  - SQL syntax errors in transforms
  - Missing
- Once validation passes, present the full YAML to the user for review.
Step 9: Deploy
After user confirms the YAML looks good:

```bash
goldsky turbo apply <pipeline-name>.yaml
```

Step 10: Verify
After deployment:

```bash
goldsky turbo list
```

Suggest running inspect to verify data flow:

```bash
goldsky turbo inspect <pipeline-name>
```

Present a summary:

Pipeline Deployed
Name: [name]
Chain: [chain]
Dataset: [dataset]
Sink: [sink type]
Mode: [streaming/job]
Next steps:

- Monitor with `goldsky turbo inspect <name>`
- Check logs with `goldsky turbo logs <name>`
- Use `/turbo-doctor` if you run into issues
Important Rules
- Always validate before presenting complete YAML to the user. Never show unvalidated complete pipeline YAML.
- Always validate before deploying.
- Always show the user the complete YAML before deploying.
- For job-mode pipelines, remind the user that the pipeline auto-cleans up ~1 hour after completion.
- Use the `blackhole` sink for testing pipelines without writing to a real destination.
- If the user wants to modify an existing pipeline, check if it's streaming (update in place) or job-mode (must delete first).
- Default to `start_at: earliest` unless the user specifies otherwise.
- Always include `version: 1.0.0` on dataset sources.
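The `blackhole` test sink mentioned above might be sketched like this. Only the sink name `blackhole` comes from this document; the surrounding field names are assumptions to confirm with `/turbo-pipelines`:

```yaml
# Illustrative test sink — discards all output; field names assumed.
sinks:
  test_sink:
    type: blackhole
    from: my_transform   # assumed: transform or source under test
```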
Related
- `/turbo-pipelines` — YAML configuration and architecture reference
- `/turbo-doctor` — Diagnose and fix pipeline issues
- `/turbo-operations` — Lifecycle commands and monitoring reference
- `/turbo-transforms` — SQL and TypeScript transform reference
- `/datasets` — Dataset names and chain prefixes
- `/secrets` — Sink credential management