parallel-data-enrichment

Data Enrichment

Enrich: $ARGUMENTS

Before starting

Inform the user that enrichment may take several minutes depending on the number of rows and fields requested.

Step 1: Start the enrichment

Use ONE of these command patterns (substitute user's actual data):
For inline data:
bash
parallel-cli enrich run --data '[{"company": "Google"}, {"company": "Microsoft"}]' --intent "CEO name and founding year" --target "output.csv" --no-wait
For CSV file:
bash
parallel-cli enrich run --source-type csv --source "input.csv" --target "output.csv" --source-columns '[{"name": "company", "description": "Company name"}]' --intent "CEO name and founding year" --no-wait
IMPORTANT: Always include --no-wait so the command returns immediately instead of blocking.
Parse the output to extract the taskgroup_id and monitoring URL. Immediately tell the user:
  • Enrichment has been kicked off
  • The monitoring URL where they can track progress
Tell them they can background the polling step to continue working while it runs.

Step 2: Poll for results

bash
parallel-cli enrich poll "$TASKGROUP_ID" --timeout 540
Important:
  • Use --timeout 540 (9 minutes) to stay within tool execution limits

If the poll times out

Enrichment of large datasets can take longer than 9 minutes. If the poll exits without completing:
  1. Tell the user the enrichment is still running server-side
  2. Re-run the same parallel-cli enrich poll command to continue waiting
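The re-run step above can be sketched as a small retry wrapper. This is only an illustration: poll_until_done is a hypothetical helper, not part of parallel-cli, and it assumes the poll command exits with a non-zero status when it times out before the enrichment completes.

```bash
#!/usr/bin/env bash
# Hypothetical helper: re-run a polling command until it succeeds
# or the attempt budget is used up.
# Assumption: the command exits non-zero when it times out.
poll_until_done() {
  local max_attempts="$1"
  shift
  local attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    if "$@"; then
      return 0
    fi
    # Reaching here means the poll returned without completing.
    echo "Attempt $attempt did not finish; the enrichment may still be running server-side." >&2
    attempt=$((attempt + 1))
  done
  return 1
}

# Usage (not executed here):
#   poll_until_done 3 parallel-cli enrich poll "$TASKGROUP_ID" --timeout 540
```

Capping the number of attempts keeps the wrapper from looping indefinitely if the task never completes; the cap of 3 in the usage line is arbitrary.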

Response format

After step 1: Share the monitoring URL (for tracking progress).
After step 2:
  1. Report number of rows enriched
  2. Preview first few rows of the output CSV
  3. Tell user the full path to the output CSV file
Do NOT re-share the monitoring URL after completion; the results are in the output file.
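One possible way to gather the three items reported after step 2 is sketched below. The helper names (count_rows, preview_rows, full_path) are hypothetical, and the sketch assumes the output CSV has a header on its first line and that no quoted field contains a newline.

```bash
#!/usr/bin/env bash
# Hypothetical helpers for summarizing an output CSV.
# Assumptions: line 1 is a header; no field contains an embedded newline.

# Number of data rows (every line after the header).
count_rows() {
  awk 'NR > 1 { n++ } END { print n + 0 }' "$1"
}

# Header plus the first N data rows (N defaults to 5).
preview_rows() {
  head -n "$(( ${2:-5} + 1 ))" "$1"
}

# Absolute path of the file, built without relying on realpath.
full_path() {
  printf '%s/%s\n' "$(cd "$(dirname "$1")" && pwd)" "$(basename "$1")"
}

# Usage (not executed here):
#   count_rows output.csv
#   preview_rows output.csv 3
#   full_path output.csv
```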

Setup

If parallel-cli is not found, install and authenticate:
bash
curl -fsSL https://parallel.ai/install.sh | bash
If unable to install that way, install via pipx instead:
bash
pipx install "parallel-web-tools[cli]"
pipx ensurepath
Then authenticate:
bash
parallel-cli login
Or set an API key:
export PARALLEL_API_KEY="your-key"
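A quick pre-flight check along these lines can tell whether setup is needed at all. The function names have_cmd and has_api_key are hypothetical; the sketch relies only on the shell built-in command -v and the PARALLEL_API_KEY variable mentioned above.

```bash
#!/usr/bin/env bash
# Hypothetical pre-flight helpers; neither is part of parallel-cli.

# True if the named command is on PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

# True if PARALLEL_API_KEY is set to a non-empty value.
has_api_key() {
  [ -n "${PARALLEL_API_KEY:-}" ]
}

if have_cmd parallel-cli; then
  echo "parallel-cli found"
else
  echo "parallel-cli not found; install it with one of the methods above" >&2
fi
```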