Prepare Inputs for MTHDS methods

Prepare input data for running MTHDS method bundles. This skill is the single entry point for all input preparation needs: extracting a placeholder template, generating synthetic test data, integrating user-provided files, or any combination.

Mode Selection

See Mode Selection for general mode behavior.

Default: Automatic.

Input strategy detection heuristics (evaluated in order):

| Signal | Strategy |
| --- | --- |
| User provides file paths, folder paths, or mentions "my data" / "this file" / "use these images" / "here's my PDF" | User Data (or Mixed if some inputs remain unfilled) |
| User says "test data" / "generate inputs" / "synthesize" / "fake data" / "sample data" | Synthetic |
| User says "template" / "schema" / "placeholder" / "what inputs does it need?" | Template |
| No clear signal (e.g., called after `/build` with no further context) | Template, then offer to populate |

Interactive additions: Ask about:
  • Which user files map to which inputs (when ambiguous)
  • Domain/industry context for realistic synthetic data
  • Whether to generate edge cases or happy-path data
  • Specific values or constraints for certain fields
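
The ordered heuristics can be sketched as a small function. This is only an illustration, not the skill's actual implementation: the name `detect_strategy`, the phrase tuples, and the rough path-matching regular expression are assumptions chosen to mirror the table. Deciding between User Data and Mixed needs the schema as well, so it is left out here.

```python
import re

# Phrases copied from the heuristics table (lowercased for matching).
USER_DATA_PHRASES = ("my data", "this file", "use these images", "here's my pdf")
SYNTHETIC_PHRASES = ("test data", "generate inputs", "synthesize", "fake data", "sample data")
TEMPLATE_PHRASES = ("template", "schema", "placeholder", "what inputs does it need")

# Assumed, deliberately rough pattern: relative/home paths such as ./dir/ or ~/f.pdf,
# or a bare filename with a common extension. Absolute paths are not covered.
PATH_PATTERN = re.compile(
    r"(?:~|\.{1,2})/[\w.\-/]*|\b[\w\-]+\.(?:pdf|docx|xlsx|png|jpe?g|txt|md|json|csv)\b"
)


def detect_strategy(message: str) -> str:
    """Return 'user_data', 'synthetic', or 'template' for a user message."""
    text = message.lower()
    # Rules are evaluated in order, so user-provided data takes precedence.
    if PATH_PATTERN.search(text) or any(p in text for p in USER_DATA_PHRASES):
        return "user_data"
    if any(p in text for p in SYNTHETIC_PHRASES):
        return "synthetic"
    if any(p in text for p in TEMPLATE_PHRASES):
        return "template"
    # No clear signal: default to the template, then offer to populate it.
    return "template"
```

Keeping the checks in a fixed order makes the precedence explicit: a message that both names a file and asks for "sample data" is treated as user data first.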

Prerequisites

See CLI Prerequisites.

Process

Step 1: Identify the Target Method

Determine the `.mthds` bundle and its output directory (`<output_dir>`). This is usually the directory containing `bundle.mthds` (e.g., `mthds-wip/pipeline_01/`).

The `inputs.json` file is saved directly in this directory (next to `bundle.mthds`):
  • `<output_dir>/inputs.json`

If data files need to be generated or copied (images, PDFs, etc.), they go in a subdirectory:
  • `<output_dir>/inputs/`

The `inputs/` subdirectory is only created when there are actual data files to store. Paths to these files are referenced from within `inputs.json`.

Step 2: Get Input Schema

Extract the input template from the method:
bash
mthds-agent pipelex inputs <bundle.mthds> -L <bundle-dir>/ [--pipe specific_pipe]
Output format:
json
{
  "success": true,
  "pipe_code": "process_document",
  "inputs": {
    "document": {
      "concept": "native.Document",
      "content": {"url": "url_value"}
    },
    "context": {
      "concept": "native.Text",
      "content": {"text": "text_value"}
    }
  }
}
For error handling, see Error Handling Reference.

Step 3: Choose Input Strategy

Based on the heuristics above and what the user has provided, follow the appropriate strategy:

Template Strategy

The fastest path. Produces a placeholder `inputs.json` that the user can fill in manually.
  1. Take the `inputs` object from Step 2's output
  2. Save it to `<output_dir>/inputs.json` (next to `bundle.mthds`)
  3. Report the saved file path and show the template content
  4. Offer: "To populate this with realistic test data, re-run /inputs and ask for synthetic data. Or provide your own files."
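
A minimal sketch of steps 1 and 2, assuming the CLI's JSON output from Step 2 has already been captured as a string. The helper name `save_template` is hypothetical and not part of the CLI.

```python
import json
from pathlib import Path


def save_template(cli_output: str, output_dir: str) -> Path:
    """Write the `inputs` object from the CLI's JSON output to inputs.json."""
    result = json.loads(cli_output)
    if not result.get("success"):
        # Recovery is out of scope here; see the error handling reference.
        raise ValueError("CLI reported failure")
    target = Path(output_dir) / "inputs.json"  # saved next to bundle.mthds
    target.write_text(json.dumps(result["inputs"], indent=2), encoding="utf-8")
    return target
```

The returned path can be reported to the user together with the template content.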

Synthetic Strategy

Generate realistic fake data tailored to the method's purpose.

Identify Input Types

Parse the schema to identify what types of synthetic data are needed:

| Concept | Content Fields | Synthesis Method |
| --- | --- | --- |
| `native.Text` | `text` | Generate realistic text matching the method context |
| `native.Number` | `number` | Generate appropriate numeric values |
| `native.Image` | `url`, `caption?`, `mime_type?` | Use `synthesize_image` pipeline |
| `native.Document` | `url`, `mime_type?` | Use document generation skills or Python |
| `native.Page` | `text_and_images`, `page_view?` | Composite: text + optional images |
| `native.TextAndImages` | `text?`, `images?` | Composite: text + image list |
| `native.JSON` | `json_obj` | Generate structured JSON matching context |
| Custom structured | Per-field types | Recurse through structure fields |

List types (`Type[]` or `Type[N]`): Generate multiple items. Variable lists typically need 2-5 items; fixed lists need exactly N items.
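
The list-type rule can be illustrated with a small parser. This is a sketch under assumptions: the function name `parse_list_type` and the default of 3 items for variable lists (picked from the 2-5 range) are illustrative, not taken from the tool.

```python
import re

# Matches "Type[]" (variable length) and "Type[N]" (fixed length).
LIST_PATTERN = re.compile(r"^(?P<base>[\w.]+)\[(?P<count>\d*)\]$")


def parse_list_type(type_spec: str, variable_default: int = 3):
    """Return (base_type, item_count); item_count is None for non-list types."""
    spec = type_spec.strip()
    match = LIST_PATTERN.match(spec)
    if match is None:
        return spec, None
    count = match.group("count")
    # Fixed lists need exactly N items; variable lists fall back to the default.
    item_count = int(count) if count else variable_default
    return match.group("base"), item_count
```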

Generate Text Content

Create realistic text that matches the method's purpose:
  • If the method processes invoices, generate invoice-like text
  • If it analyzes reports, generate report-style content
  • Match expected length (short prompts vs long documents)

Generate Numeric Content

Generate sensible values within expected ranges based on the method context.

Generate Structured Concepts

Fill each field according to its type and description.

Generate File Inputs

When inputs require actual files (Image, Document), use the appropriate generation method. See Image Generation and Document Generation below.

Assemble and Save

Create the complete `inputs.json` and save to `<output_dir>/inputs.json` (next to `bundle.mthds`). Any generated data files go in `<output_dir>/inputs/`.

User Data Strategy

Integrate the user's own files into the method's input schema.

Step A: Inventory User Files

Collect all files the user has provided (explicit paths, folders, or files mentioned earlier in conversation). For each file, determine its type:

| Extension(s) | Detected Type | Maps To |
| --- | --- | --- |
| `.pdf` | PDF document | `native.Document` (mime: `application/pdf`) |
| `.docx`, `.doc` | Word document | `native.Document` (mime: `application/vnd.openxmlformats-officedocument.wordprocessingml.document`; legacy `.doc` is `application/msword`) |
| `.xlsx`, `.xls` | Spreadsheet | `native.Document` (mime: `application/vnd.openxmlformats-officedocument.spreadsheetml.sheet`; legacy `.xls` is `application/vnd.ms-excel`) |
| `.pptx`, `.ppt` | Presentation | `native.Document` (mime: `application/vnd.openxmlformats-officedocument.presentationml.presentation`; legacy `.ppt` is `application/vnd.ms-powerpoint`) |
| `.jpg`, `.jpeg` | JPEG image | `native.Image` (mime: `image/jpeg`) |
| `.png` | PNG image | `native.Image` (mime: `image/png`) |
| `.webp` | WebP image | `native.Image` (mime: `image/webp`) |
| `.gif` | GIF image | `native.Image` (mime: `image/gif`) |
| `.svg` | SVG image | `native.Image` (mime: `image/svg+xml`) |
| `.tiff`, `.tif` | TIFF image | `native.Image` (mime: `image/tiff`) |
| `.bmp` | BMP image | `native.Image` (mime: `image/bmp`) |
| `.txt` | Plain text | `native.Text` (read file content) |
| `.md` | Markdown text | `native.Text` (read file content) |
| `.json` | JSON data | `native.JSON` or custom structured concept |
| `.csv` | CSV data | `native.Text` (read as text) or `native.JSON` (parse to objects) |
| `.html`, `.htm` | HTML | `native.Html` |
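
The extension table translates naturally into a lookup. The sketch below assumes a plain dictionary keyed by lowercase extension; `EXTENSION_MAP` and `classify_file` are illustrative names, and the legacy `.doc`/`.xls`/`.ppt` formats are left out for brevity.

```python
from pathlib import Path

OOXML = "application/vnd.openxmlformats-officedocument."

# Extension -> (concept, mime type or None), following the table above.
EXTENSION_MAP = {
    ".pdf": ("native.Document", "application/pdf"),
    ".docx": ("native.Document", OOXML + "wordprocessingml.document"),
    ".xlsx": ("native.Document", OOXML + "spreadsheetml.sheet"),
    ".pptx": ("native.Document", OOXML + "presentationml.presentation"),
    ".jpg": ("native.Image", "image/jpeg"),
    ".jpeg": ("native.Image", "image/jpeg"),
    ".png": ("native.Image", "image/png"),
    ".webp": ("native.Image", "image/webp"),
    ".gif": ("native.Image", "image/gif"),
    ".svg": ("native.Image", "image/svg+xml"),
    ".tiff": ("native.Image", "image/tiff"),
    ".tif": ("native.Image", "image/tiff"),
    ".bmp": ("native.Image", "image/bmp"),
    ".txt": ("native.Text", None),   # file content is read as text
    ".md": ("native.Text", None),
    ".json": ("native.JSON", None),  # or a custom structured concept
    ".csv": ("native.Text", None),   # could also be parsed into native.JSON
    ".html": ("native.Html", None),
    ".htm": ("native.Html", None),
}


def classify_file(path):
    """Return (concept, mime_type) for a supported file, or None otherwise."""
    return EXTENSION_MAP.get(Path(path).suffix.lower())
```

Lowercasing the suffix means `scan.JPG` and `scan.jpg` are treated alike; unsupported files return `None` so they can be reported as unmatched.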

Step B: Expand Folders

When the user provides a folder path:
  1. List all files in the folder (non-recursive by default, recursive if user requests)
  2. Filter to supported file types
  3. Group files by detected type
  4. Match to list-type inputs (`Image[]`, `Document[]`, etc.)

Example: User provides `./invoices/` containing 5 PDFs. The method expects `documents: Document[]`. Map all 5 PDFs to that list input.
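
Steps 1 to 3 of the folder expansion could look like the following sketch. The function name `expand_folder` and the small `CONCEPT_BY_EXTENSION` subset are assumptions made to keep the example short.

```python
from collections import defaultdict
from pathlib import Path

# Assumed subset of the supported extensions, keyed to the concept they map to.
CONCEPT_BY_EXTENSION = {
    ".pdf": "native.Document",
    ".jpg": "native.Image",
    ".jpeg": "native.Image",
    ".png": "native.Image",
    ".txt": "native.Text",
}


def expand_folder(folder, recursive=False):
    """Group supported files in `folder` by concept (non-recursive by default)."""
    root = Path(folder)
    candidates = root.rglob("*") if recursive else root.iterdir()
    groups = defaultdict(list)
    for path in sorted(candidates):
        concept = CONCEPT_BY_EXTENSION.get(path.suffix.lower())
        # Skip directories and unsupported file types.
        if path.is_file() and concept is not None:
            groups[concept].append(path)
    return dict(groups)
```

A folder that yields a single group is a natural fit for one list-type input such as `documents: Document[]`.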

Step C: Match Files to Inputs

For each input variable in the schema, attempt to match user-provided files.

Matching rules (applied in order):
  1. Exact name match: Input variable `invoice` matches a file named `invoice.pdf`
  2. Type match (single candidate): If only one input expects `native.Image` and the user provided exactly one image file, match them
  3. Type match (multiple candidates): If multiple inputs of the same type exist:
    • In automatic mode: match by name similarity (variable name vs filename)
    • In interactive mode: ask the user which file goes where
  4. Folder to list: If a folder contains files of a single type and an input expects a list of that type, map the folder contents to that input
  5. Unmatched files: Report them and ask if they should be ignored or mapped to a specific input
  6. Unfilled inputs: After matching, any inputs still without data can be left as placeholders or filled with synthetic data (see Mixed Strategy)
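
Rules 1 to 3 (automatic mode) can be sketched with the standard library's `difflib`. This is one possible reading of the rules, not the skill's actual algorithm: `match_files`, its argument shapes, the greedy assignment order, and the extra type check on exact name matches are all assumptions.

```python
from difflib import SequenceMatcher
from pathlib import Path


def match_files(inputs, files):
    """Match single-file inputs to user files.

    inputs: {variable_name: concept}; files: {file_path: concept}.
    Returns (matches, unmatched_files, unfilled_inputs).
    """
    matches = {}
    remaining = dict(files)

    # Rule 1: exact name match (file stem equals the variable name).
    # Requiring the concepts to agree is an added safety check.
    for name, concept in inputs.items():
        for path, file_concept in list(remaining.items()):
            if Path(path).stem == name and file_concept == concept:
                matches[name] = path
                del remaining[path]
                break

    # Rules 2 and 3: type match; name similarity breaks ties between candidates.
    for name, concept in inputs.items():
        if name in matches:
            continue
        candidates = [p for p, c in remaining.items() if c == concept]
        if not candidates:
            continue
        best = max(
            candidates,
            key=lambda p: SequenceMatcher(None, name, Path(p).stem).ratio(),
        )
        matches[name] = best
        del remaining[best]

    unfilled = [name for name in inputs if name not in matches]
    return matches, list(remaining), unfilled
```

The second and third return values correspond to rules 5 and 6: unmatched files to report, and unfilled inputs to leave as placeholders or synthesize.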

Step D: Copy Files to Output Directory

Copy (or symlink) user files into `<output_dir>/inputs/` so `inputs.json` uses paths relative to the output directory. This keeps the pipeline directory self-contained. Only create the `inputs/` subdirectory if there are actual files to copy.

Use descriptive filenames: if the input variable is `invoice`, copy to `<output_dir>/inputs/invoice.pdf` (preserving original extension).
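
A minimal sketch of the copy step, using only `shutil` and `pathlib`. The helper name `copy_input_file` is hypothetical; the `inputs/` directory is created only at the moment a file is actually copied.

```python
import shutil
from pathlib import Path


def copy_input_file(source, output_dir, variable_name):
    """Copy `source` to <output_dir>/inputs/<variable_name><ext>; return the new path."""
    source = Path(source).expanduser()
    inputs_dir = Path(output_dir) / "inputs"
    # Created lazily: the directory only exists once there is a file to store.
    inputs_dir.mkdir(parents=True, exist_ok=True)
    # Name the copy after the input variable, preserving the original extension.
    destination = inputs_dir / f"{variable_name}{source.suffix}"
    shutil.copy2(source, destination)
    return destination
```

For example, `copy_input_file("~/documents/invoice_march.pdf", "mthds-wip/pipeline_01", "invoice")` would produce `mthds-wip/pipeline_01/inputs/invoice.pdf`.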

Step E: Build Content Objects

For each matched file, construct the proper content object:
Document input:
json
{
  "concept": "native.Document",
  "content": {
    "url": "<output_dir>/inputs/invoice.pdf",
    "mime_type": "application/pdf"
  }
}
Image input:
json
{
  "concept": "native.Image",
  "content": {
    "url": "<output_dir>/inputs/photo.jpg",
    "mime_type": "image/jpeg"
  }
}
Text input (from a `.txt` or `.md` file; read the file content):
json
{
  "concept": "native.Text",
  "content": {
    "text": "<actual file content read from the .txt/.md file>"
  }
}
Image list input (from folder):
json
{
  "concept": "native.Image",
  "content": [
    {"url": "<output_dir>/inputs/img_001.jpg", "mime_type": "image/jpeg"},
    {"url": "<output_dir>/inputs/img_002.jpg", "mime_type": "image/jpeg"},
    {"url": "<output_dir>/inputs/img_003.png", "mime_type": "image/png"}
  ]
}
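
The four shapes above share one wrapper: a `concept` plus a `content` object or list. The following sketch builds them programmatically; the helper names (`file_content`, `build_input`, `text_input_from_file`) are illustrative and not part of any library.

```python
from pathlib import Path


def file_content(url, mime_type):
    """Content object for a Document or Image file."""
    return {"url": str(url), "mime_type": mime_type}


def build_input(concept, content):
    """Wrap a content object (or a list of them) with its concept."""
    return {"concept": concept, "content": content}


def text_input_from_file(path):
    """Read a .txt/.md file and wrap its content as native.Text."""
    text = Path(path).read_text(encoding="utf-8")
    return build_input("native.Text", {"text": text})
```

A list input reuses the same wrapper with a list of content objects, for example `build_input("native.Image", [file_content(p, "image/jpeg") for p in paths])`.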

Step F: Assemble and Save

Combine all content objects into a single `inputs.json` and save to `<output_dir>/inputs.json` (next to `bundle.mthds`).

Step G: Report

Show the user:
  • Which files were matched to which inputs
  • Any unfilled inputs (offer synthetic or placeholder)
  • The final `inputs.json` content
  • Path to the saved file

Mixed Strategy

Combines user data with synthetic generation for any remaining gaps.
  1. Follow User Data Strategy Steps A-E to match user files
  2. For each unfilled input, apply Synthetic Strategy
  3. Assemble the complete `inputs.json` combining both sources
  4. Report which inputs came from user data and which were synthesized

Image Generation

Use the `synthesize_image` Pipelex pipeline to generate test images.

Command:

First, create an input file (e.g., `<output_dir>/image_request.json`):
json
{
  "request": {
    "concept": "synthetic_data.ImageRequest",
    "content": {
      "category": "<category>",
      "description": "<optional description>"
    }
  }
}
Then run:
bash
mthds-agent pipelex run pipe pipelex/builder/synthetic_inputs/synthesize_image.mthds --inputs <output_dir>/image_request.json
Image Categories:

| Category | Use For | Example Description |
| --- | --- | --- |
| `photograph` | Real-world photos, product images, portraits | "A professional headshot of a business person" |
| `screenshot` | UI mockups, app screens, web pages | "A mobile banking app dashboard showing account balance" |
| `chart` | Data visualizations, graphs, plots | "A bar chart showing quarterly sales by region" |
| `diagram` | Technical diagrams, flowcharts, architecture | "A system architecture diagram with microservices" |
| `document_scan` | Scanned papers, receipts, forms | "A scanned invoice from a hardware store" |
| `handwritten` | Handwritten notes, signatures | "Handwritten meeting notes on lined paper" |

Output: The pipeline saves the generated image to `<output_dir>/inputs/` and returns the file path.

For image synthesis error handling, see Error Handling Reference.
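
Writing the request file can be scripted. The sketch below validates the category against the six listed above and omits `description` when none is given; the helper name `write_image_request` is an assumption, and running the pipeline itself is still done with the CLI command shown earlier.

```python
import json
from pathlib import Path

# Categories taken from the Image Categories table.
IMAGE_CATEGORIES = {
    "photograph", "screenshot", "chart", "diagram", "document_scan", "handwritten",
}


def write_image_request(output_dir, category, description=None):
    """Write <output_dir>/image_request.json for the synthesize_image pipeline."""
    if category not in IMAGE_CATEGORIES:
        raise ValueError(f"Unknown image category: {category!r}")
    content = {"category": category}
    if description:
        content["description"] = description  # optional field
    request = {
        "request": {"concept": "synthetic_data.ImageRequest", "content": content}
    }
    path = Path(output_dir) / "image_request.json"
    path.write_text(json.dumps(request, indent=2), encoding="utf-8")
    return path
```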

Document Generation

Generate test documents based on the document type needed.

PDF Documents

`reportlab` is a dependency of `pipelex`: always available, no additional installation needed.

Basic PDF (Canvas API)

python
from reportlab.lib.pagesizes import letter
from reportlab.pdfgen import canvas

c = canvas.Canvas("<output_dir>/inputs/test_document.pdf", pagesize=letter)
width, height = letter

# Add text
c.drawString(100, height - 100, "Hello World!")
c.drawString(100, height - 120, "This is a PDF created with reportlab")

# Add a line
c.line(100, height - 140, 400, height - 140)

# Save
c.save()

Multi-Page PDF (Platypus)

python
from reportlab.lib.pagesizes import letter
from reportlab.platypus import SimpleDocTemplate, Paragraph, Spacer, PageBreak
from reportlab.lib.styles import getSampleStyleSheet

doc = SimpleDocTemplate("<output_dir>/inputs/test_report.pdf", pagesize=letter)
styles = getSampleStyleSheet()
story = []

# Add content
title = Paragraph("Report Title", styles['Title'])
story.append(title)
story.append(Spacer(1, 12))

body = Paragraph("This is the body of the report. " * 20, styles['Normal'])
story.append(body)
story.append(PageBreak())

# Page 2
story.append(Paragraph("Page 2", styles['Heading1']))
story.append(Paragraph("Content for page 2", styles['Normal']))

# Build PDF
doc.build(story)

Professional Reports with Tables

python
from reportlab.platypus import SimpleDocTemplate, Table, TableStyle, Paragraph
from reportlab.lib.styles import getSampleStyleSheet
from reportlab.lib import colors

# Sample data
data = [
    ['Product', 'Q1', 'Q2', 'Q3', 'Q4'],
    ['Widgets', '120', '135', '142', '158'],
    ['Gadgets', '85', '92', '98', '105']
]

# Create PDF with table
doc = SimpleDocTemplate("<output_dir>/inputs/test_report.pdf")
elements = []

# Add title
styles = getSampleStyleSheet()
title = Paragraph("Quarterly Sales Report", styles['Title'])
elements.append(title)

# Add table with advanced styling
table = Table(data)
table.setStyle(TableStyle([
    ('BACKGROUND', (0, 0), (-1, 0), colors.grey),
    ('TEXTCOLOR', (0, 0), (-1, 0), colors.whitesmoke),
    ('ALIGN', (0, 0), (-1, -1), 'CENTER'),
    ('FONTNAME', (0, 0), (-1, 0), 'Helvetica-Bold'),
    ('FONTSIZE', (0, 0), (-1, 0), 14),
    ('BOTTOMPADDING', (0, 0), (-1, 0), 12),
    ('BACKGROUND', (0, 1), (-1, -1), colors.beige),
    ('GRID', (0, 0), (-1, -1), 1, colors.black)
]))
elements.append(table)

doc.build(elements)

**Last resort**: use a public test PDF URL:
```json
{
  "url": "https://www.w3.org/WAI/WCAG21/Techniques/pdf/img/table-word.pdf",
  "mime_type": "application/pdf"
}
```

Word Documents (DOCX)

If the `example-skills:docx` skill is available:

Use the /docx skill to create a Word document with the following content:
[Describe the document content, structure, and formatting]
Save to: <output_dir>/inputs/<filename>.docx

If the skill is NOT available, create using Python:
python
# Requires: pip install python-docx
from docx import Document

doc = Document()
doc.add_heading('Test Document', 0)
doc.add_paragraph('This is synthetic test content for method testing.')

# Add more content as needed
doc.save('<output_dir>/inputs/test_document.docx')

Spreadsheets (XLSX)

If the `example-skills:xlsx` skill is available:

Use the /xlsx skill to create a spreadsheet with the following data:
[Describe columns, rows, and sample data]
Save to: <output_dir>/inputs/<filename>.xlsx

If the skill is NOT available, create using Python:
python
# Requires: pip install openpyxl
from openpyxl import Workbook

wb = Workbook()
ws = wb.active
ws['A1'] = 'Column1'
ws['B1'] = 'Column2'
ws['A2'] = 'Value1'
ws['B2'] = 'Value2'
wb.save('<output_dir>/inputs/test_spreadsheet.xlsx')

---

**Fallback Strategy:**
1. For PDFs: use `reportlab` (always available via pipelex) with the patterns above
2. For DOCX/XLSX: use the `/docx` or `/xlsx` skill (install from the Anthropic example-skills marketplace if not available)
3. For any format: use public test file URLs as fallback
4. As last resort, ask user to provide test files

---

Validate & Run

After assembling the inputs, confirm readiness:

Inputs are ready. `inputs.json` has been saved with real values; no placeholders remain.

Then offer to run:
bash
# Dry run with the prepared inputs (directory mode auto-detects bundle, inputs, library dir)
mthds-agent pipelex run pipe <bundle-dir>/ --dry-run

# Full run (uses actual AI/extraction models)
mthds-agent pipelex run pipe <bundle-dir>/

---
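
Before confirming readiness, leftover placeholders can be detected mechanically. The sketch below assumes that template placeholders are strings ending in `_value` (as in the `url_value` and `text_value` entries of the Step 2 sample output). That pattern is inferred from the sample rather than documented, and `find_placeholders` is a hypothetical helper.

```python
def find_placeholders(value, path=""):
    """Return JSON paths whose string value looks like an unfilled placeholder."""
    found = []
    if isinstance(value, dict):
        for key, item in value.items():
            found += find_placeholders(item, f"{path}.{key}" if path else key)
    elif isinstance(value, list):
        for index, item in enumerate(value):
            found += find_placeholders(item, f"{path}[{index}]")
    elif isinstance(value, str) and value.endswith("_value"):
        # Assumed placeholder convention, e.g. "url_value" or "text_value".
        found.append(path)
    return found
```

An empty result suggests the file is ready for the dry run; any returned paths point at inputs that still need user data or synthetic content.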

Native Concept Content Structures

Text

json
{"text": "The actual text content"}

Number

json
{"number": 42}

Image

json
{
  "url": "/path/to/image.jpg",
  "caption": "Optional description",
  "mime_type": "image/jpeg"
}

Document

json
{
  "url": "/path/to/document.pdf",
  "mime_type": "application/pdf"
}

TextAndImages

json
{
  "text": {"text": "Main text content"},
  "images": [
    {"url": "/path/to/img1.png", "caption": "Figure 1"}
  ]
}

Page

json
{
  "text_and_images": {
    "text": {"text": "Page content..."},
    "images": []
  },
  "page_view": null
}

JSON

json
{"json_obj": {"key": "value", "nested": {"data": 123}}}

Complete Examples

Example 1: Template for a Haiku writer

Method: Haiku pipeline expecting `theme: Text`
bash
mthds-agent pipelex inputs mthds-wip/pipeline_01/bundle.mthds -L mthds-wip/pipeline_01/
Save the `inputs` from the output directly to `mthds-wip/pipeline_01/inputs.json`.

Example 2: Synthetic data for an image analysis pipeline

Method: Image analyzer expecting `image: Image` and `analysis_prompt: Text`
  1. Get schema, identify needs: test photograph + instruction text
  2. Generate image via `synthesize_image.mthds` with category `photograph`
  3. Write analysis prompt text matching the method context
  4. Assemble:
json
{
  "image": {
    "concept": "native.Image",
    "content": {
      "url": "mthds-wip/pipeline_01/inputs/city_street.jpg",
      "mime_type": "image/jpeg"
    }
  },
  "analysis_prompt": {
    "concept": "native.Text",
    "content": {
      "text": "Analyze this street scene. Count visible people and describe the atmosphere."
    }
  }
}

Example 3: User-provided invoice PDF

Method: Invoice processor expecting `invoice: Document` and `instructions: Text`

User says: "Use my file `~/documents/invoice_march.pdf`"
  1. Get schema: needs `invoice` (Document) + `instructions` (Text)
  2. Inventory: user provided `invoice_march.pdf` (PDF = Document type)
  3. Match: `invoice_march.pdf` maps to `invoice` input (name similarity + type match)
  4. Copy: `cp ~/documents/invoice_march.pdf mthds-wip/pipeline_01/inputs/invoice.pdf`
  5. Unfilled: `instructions` has no user file. Generate synthetic text: "Extract all line items, totals, and vendor information from this invoice."
  6. Assemble:
json
{
  "invoice": {
    "concept": "native.Document",
    "content": {
      "url": "mthds-wip/pipeline_01/inputs/invoice.pdf",
      "mime_type": "application/pdf"
    }
  },
  "instructions": {
    "concept": "native.Text",
    "content": {
      "text": "Extract all line items, totals, and vendor information from this invoice."
    }
  }
}

Example 4: Folder of images for batch processing

Method: Batch image captioner expecting `images: Image[]`

User says: "Use the photos in `./product-photos/`"
  1. Get schema: needs `images` (Image[])
  2. Expand folder: `./product-photos/` contains `shoe.jpg`, `hat.png`, `bag.jpg`
  3. Copy all to `<output_dir>/inputs/`
  4. Assemble:
json
{
  "images": {
    "concept": "native.Image",
    "content": [
      {"url": "mthds-wip/pipeline_01/inputs/shoe.jpg", "mime_type": "image/jpeg"},
      {"url": "mthds-wip/pipeline_01/inputs/hat.png", "mime_type": "image/png"},
      {"url": "mthds-wip/pipeline_01/inputs/bag.jpg", "mime_type": "image/jpeg"}
    ]
  }
}

Reference

  • CLI Prerequisites — read at skill start to check CLI availability
  • Error Handling — read when CLI returns an error to determine recovery
  • MTHDS Agent Guide — read for CLI command syntax or output format details
  • MTHDS Language Reference — read for concept definitions and syntax
  • Native Content Types — read for the full attribute reference of each native content type when assembling input JSON