generate-import-html
Generate Import HTML
Create a plain HTML file with block structure from the authoring analysis.
When to Use This Skill
Use this skill when:
- You have complete authoring analysis (all sequences have decisions)
- You have section styling validation (from authoring-analysis)
- You are ready to generate the HTML file for preview
Invoked by: page-import skill (Step 4)
Prerequisites
From previous skills, you need:
- ✅ Authoring analysis with block selections (from authoring-analysis)
- ✅ Section styling decisions (from authoring-analysis Step 3e)
- ✅ metadata.json with paths and metadata (from scrape-webpage)
- ✅ cleaned.html with content (from scrape-webpage)
- ✅ Block structures fetched (from authoring-analysis Step 3d)
Related Skills
- page-import - Orchestrator that invokes this skill
- authoring-analysis - Provides authoring decisions and styling validation
- scrape-webpage - Provides metadata, paths, cleaned HTML, images
- preview-import - Uses this skill's HTML output
⚠️ CRITICAL REQUIREMENT: Complete Content Import
YOU MUST IMPORT ALL CONTENT FROM THE PAGE. PARTIAL IMPORT IS UNACCEPTABLE.
- ❌ NEVER truncate or skip sections due to length concerns
- ❌ NEVER summarize or abbreviate content
- ❌ NEVER use placeholders like "<!-- rest of content -->"
- ❌ NEVER omit content because the page is "too long"
- ✅ ALWAYS import every section from authoring analysis
- ✅ ALWAYS include all text, images, and structure from cleaned.html
- ✅ If you encounter length issues, generate the FULL HTML anyway
Validation requirement: You MUST verify that the number of sections in your HTML matches the number of sections from identify-page-structure. If they don't match, you have made an error.
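The section-count check can be automated. The following is a minimal sketch, assuming the generated file is well-formed HTML in which each section is a top-level `<div>`; `count_sections` and `expected_sections` are hypothetical names, and the expected number is whatever identify-page-structure reported. A page metadata block appended as its own top-level `<div>` is counted too, so adjust the comparison if one is present.

```python
from html.parser import HTMLParser

# Void elements have no closing tag, so they must not change the nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class TopLevelDivCounter(HTMLParser):
    """Counts <div> elements that open at nesting depth 0."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.top_level_divs = 0

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if tag == "div" and self.depth == 0:
            self.top_level_divs += 1
        self.depth += 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        self.depth = max(0, self.depth - 1)


def count_sections(html_text: str) -> int:
    """Return the number of top-level <div> elements in an HTML fragment."""
    parser = TopLevelDivCounter()
    parser.feed(html_text)
    parser.close()
    return parser.top_level_divs


if __name__ == "__main__":
    expected_sections = 2  # hypothetical value taken from identify-page-structure
    sample = '<div><h1>Title</h1></div>\n<div><div class="hero"></div></div>'
    if count_sections(sample) != expected_sections:
        raise SystemExit("Section count mismatch: content is missing or extra")
```

The counter relies on matching open and close tags, so unclosed `<p>` or `<li>` elements would throw the depth off.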
HTML Generation Workflow
Structure Requirements
IMPORTANT CHANGE: The AEM CLI now automatically wraps HTML content with headful structure (head, header, footer). You MUST generate ONLY the section content.
What to generate:
- ✅ Section divs with content: `<div>...</div>` (one per section)
- ✅ Blocks as `<div class="block-name">` with nested divs
- ✅ Default content (headings, paragraphs, links, images)
- ✅ Section metadata blocks where validated in authoring-analysis
What NOT to generate:
- ❌ NO `<html>`, `<head>`, or `<body>` tags
- ❌ NO `<header>` or `<footer>` elements
- ❌ NO `<main>` wrapper element
- ❌ NO head content (meta tags, title, etc. - this comes from project's head.html)
Structure format:
```html
<div>
  <!-- Section 1 content -->
</div>
<div>
  <!-- Section 2 content with section-metadata if needed -->
  <div class="section-metadata">
    <div>
      <div>Style</div>
      <div>grey</div>
    </div>
  </div>
  <!-- Section 2 blocks/content -->
</div>
<div>
  <!-- Section 3 content -->
</div>
```
For detailed block structure patterns: See `../page-import/resources/html-structure.md`
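A quick guard against the "What NOT to generate" list can be sketched as follows. This is an illustrative check rather than part of the skill itself; `find_forbidden_tags` is a hypothetical helper name, and the tag set simply mirrors the elements named above.

```python
from html.parser import HTMLParser

# Elements the structure requirements say must not appear in the generated file.
FORBIDDEN_TAGS = {"html", "head", "body", "header", "footer", "main"}


class ForbiddenTagFinder(HTMLParser):
    """Records every forbidden element encountered while parsing."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names before calling this handler.
        if tag in FORBIDDEN_TAGS:
            self.found.append(tag)


def find_forbidden_tags(html_text: str) -> list:
    """Return the sorted, de-duplicated forbidden tags present in html_text."""
    finder = ForbiddenTagFinder()
    finder.feed(html_text)
    finder.close()
    return sorted(set(finder.found))


if __name__ == "__main__":
    generated = "<div><h1>About</h1><p>Intro</p></div>"
    problems = find_forbidden_tags(generated)
    print("OK" if not problems else f"Remove these wrappers: {problems}")
```

An empty result means the file contains only section content, which is what the AEM CLI is described as wrapping.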
Section Metadata Application
Apply validated decisions from authoring-analysis Step 3e:
WITH section-metadata (section provides container styling):
```html
<div>
  <div class="section-metadata">
    <div>
      <div>Style</div>
      <div>dark</div>
    </div>
  </div>
  <div class="tabs">
    <!-- Tabs block content -->
  </div>
</div>
```
WITHOUT section-metadata (background is block-specific):
```html
<div>
  <div class="hero">
    <!-- Hero block content with its own dark background -->
  </div>
</div>
```
Important:
- Only migrate visible body content sections (skip header, navigation, footer - auto-generated)
- Use consistent style names from identify-page-structure
- Apply validated decisions from authoring-analysis Step 3e - Skip section-metadata for single-block sections where background is block-specific
- Place the `section-metadata` div at the start of each section that needs styling
- The metadata div will be processed and removed by the platform
- Each section is a separate top-level `<div>` element
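The two cases above can be expressed as one small helper. This is a sketch under stated assumptions: `render_section` is a hypothetical function, `content_html` is the already-built block or default content for one section, and `style` is the validated style name from authoring-analysis Step 3e (or `None` when the background is block-specific).

```python
from html import escape
from typing import Optional


def render_section(content_html: str, style: Optional[str] = None) -> str:
    """Wrap one section's content in a top-level <div>.

    When a style is given, a section-metadata block is placed at the start
    of the section; otherwise the section contains only its content.
    """
    parts = ["<div>"]
    if style:
        parts += [
            '  <div class="section-metadata">',
            "    <div>",
            "      <div>Style</div>",
            f"      <div>{escape(style)}</div>",
            "    </div>",
            "  </div>",
        ]
    parts.append(content_html)
    parts.append("</div>")
    return "\n".join(parts)


if __name__ == "__main__":
    # Section-level styling: metadata block comes first, then the tabs block.
    print(render_section('  <div class="tabs"></div>', style="dark"))
    # Block-specific background: no section-metadata at all.
    print(render_section('  <div class="hero"></div>'))
```

Keeping the decision in a single optional argument makes it hard to emit a section-metadata block for a section that was validated as not needing one.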
Page Metadata Block
Unless the user explicitly asked to skip metadata, use the metadata extracted by scrape-webpage to generate a metadata block.
Process:
1. Review extracted metadata from metadata.json
2. Map each property to standard format:
Title:
- Compare source `title` (or `og:title`) with first H1 on page
- If matches first H1 → Omit (platform defaults to H1)
- If differs → Include as `title` property
Description:
- Compare source `description` (or `og:description`) with first paragraph
- If matches first paragraph → Consider omitting (platform defaults to first paragraph)
- If differs OR more descriptive → Include as `description` property
- Check: 150-160 characters ideal
Image:
- Check source `og:image`
- If matches first content image → Consider omitting (platform defaults to first image)
- If custom social image → Include as `image` property
- Ensure absolute URL or correct relative path
- Check: 1200x630 pixels recommended
Canonical:
- If points to same page URL → Omit (platform auto-generates)
- If points to different page → Include as `canonical` property
Tags:
- Map `article:tag` or `keywords` → comma-separated `tags` property
Properties to SKIP (platform auto-populates):
- `og:url`, `og:title`, `og:description`, `twitter:title`, `twitter:description`, `twitter:image`
- `viewport`, `charset`, `X-UA-Compatible` (belong in head.html)
3. Generate metadata block HTML:
```html
<div>
  <div class="metadata">
    <div>
      <div>title</div>
      <div>[Your mapped title]</div>
    </div>
    <div>
      <div>description</div>
      <div>[Your mapped description]</div>
    </div>
    <!-- Only include image if custom -->
    <!-- Only include canonical if differs from page URL -->
    <!-- Only include tags if present -->
  </div>
</div>
```
Append the metadata block as the last section div at the end of the HTML file.
Detailed guidance: See `resources/metadata-extraction.md` and `resources/metadata-mapping.md`
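The mapping rules for title, description, canonical, and tags can be sketched as below. Several things here are assumptions rather than facts from the skill: the flat dictionary shape of the extracted metadata, the function names `map_page_metadata` and `render_metadata_block`, and the exact-match comparison used for "matches". The `image` property is left out because the source does not show how it is rendered inside the block.

```python
from html import escape


def map_page_metadata(source: dict, first_h1: str, first_paragraph: str,
                      page_url: str) -> dict:
    """Keep only the properties that differ from what the platform would default to."""
    mapped = {}

    title = source.get("title") or source.get("og:title")
    if title and title.strip() != first_h1.strip():
        mapped["title"] = title.strip()

    description = source.get("description") or source.get("og:description")
    if description and description.strip() != first_paragraph.strip():
        mapped["description"] = description.strip()

    canonical = source.get("canonical")
    if canonical and canonical.rstrip("/") != page_url.rstrip("/"):
        mapped["canonical"] = canonical

    tags = source.get("article:tag") or source.get("keywords")
    if tags:
        if isinstance(tags, str):
            tags = [tag.strip() for tag in tags.split(",")]
        mapped["tags"] = ", ".join(tag for tag in tags if tag)

    return mapped


def render_metadata_block(mapped: dict) -> str:
    """Render the mapped properties as the final section div, or '' if none remain."""
    if not mapped:
        return ""
    rows = [
        "    <div>\n"
        f"      <div>{escape(name)}</div>\n"
        f"      <div>{escape(value)}</div>\n"
        "    </div>"
        for name, value in mapped.items()
    ]
    return '<div>\n  <div class="metadata">\n' + "\n".join(rows) + "\n  </div>\n</div>"


if __name__ == "__main__":
    extracted = {"title": "About Us | Example", "keywords": "team, company"}
    mapped = map_page_metadata(extracted, "About Us", "We build things.",
                               "https://example.com/about")
    print(render_metadata_block(mapped))
```

Exact string comparison is the strictest reading of "matches"; a real import may want a looser comparison, and the "more descriptive" judgment for descriptions still needs a human or model decision.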
Images Folder Management (CRITICAL)
The images are currently in `./import-work/images/` and the HTML references them as `./images/...`. You MUST handle the images folder correctly:
Step 1: Determine the correct images folder location
Based on `paths.htmlFilePath` from metadata.json:
- HTML file: `us/en/about.plain.html` → Images should be at: `us/en/images/`
- HTML file: `products/widget.plain.html` → Images should be at: `products/images/`
- HTML file: `index.plain.html` → Images should be at: `images/`
Rule: Images folder goes in the same directory as the HTML file.
Step 2: Copy the images folder
```bash
# Example: If HTML is at us/en/about.plain.html
mkdir -p us/en/images
cp -r ./import-work/images/* us/en/images/
```
Step 3: Verify image paths in HTML are correct
The HTML should already reference images as `./images/...`, which is correct for files in the same directory. No path changes are needed in the HTML.
Example:
```
HTML location: us/en/about.plain.html
Images location: us/en/images/
Image reference in HTML: <img src="./images/abc123.jpg">
Result: ✅ Correct - browser resolves to us/en/images/abc123.jpg
```
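The "same directory as the HTML file" rule and the copy step can also be scripted. This is a minimal sketch, assuming Python 3.8 or later; `images_dir_for` and `copy_images` are hypothetical helper names, and the paths are the ones used in the examples above.

```python
import shutil
from pathlib import Path


def images_dir_for(html_file_path: str) -> Path:
    """Return the images folder that sits next to the given HTML file."""
    return Path(html_file_path).parent / "images"


def copy_images(source_dir: str, html_file_path: str) -> Path:
    """Copy every scraped image into the images folder beside the HTML file."""
    destination = images_dir_for(html_file_path)
    # copytree creates missing parent folders; dirs_exist_ok allows re-runs.
    shutil.copytree(source_dir, destination, dirs_exist_ok=True)
    return destination


if __name__ == "__main__":
    for path in ("us/en/about.plain.html",
                 "products/widget.plain.html",
                 "index.plain.html"):
        print(path, "->", images_dir_for(path).as_posix())
```

Deriving the folder from the HTML path keeps the `./images/...` references valid without editing the HTML.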
Save HTML File
Save to: Use `paths.htmlFilePath` from metadata.json (e.g., `us/en/about.plain.html`)
Read the metadata.json file from scrape-webpage to get the correct file path.
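Reading the path and writing the file might look like the sketch below. It assumes metadata.json is a JSON object with a nested `paths` object holding `htmlFilePath`, which is inferred from the `paths.htmlFilePath` notation; `save_html`, `metadata_path`, and `project_root` are hypothetical names.

```python
import json
from pathlib import Path


def save_html(metadata_path: str, html_text: str, project_root: str = ".") -> Path:
    """Write html_text to the location named by paths.htmlFilePath in metadata.json."""
    metadata = json.loads(Path(metadata_path).read_text(encoding="utf-8"))
    target = Path(project_root) / metadata["paths"]["htmlFilePath"]
    # Create us/en/ (or similar) if it does not exist yet.
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(html_text, encoding="utf-8")
    return target
```

Taking the path from metadata.json rather than retyping it avoids the HTML file and its images folder drifting into different directories.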
Validation Checklist (MANDATORY)
Before proceeding to preview-import skill, verify:
- ✅ Section count: HTML has the same number of top-level `<div>` sections as identified in identify-page-structure
- ✅ All sequences: Every content sequence from authoring-analysis appears in the HTML
- ✅ No truncation: No "..." or "<!-- more content -->" or similar placeholders
- ✅ Complete text: All headings, paragraphs, and text from cleaned.html are present
- ✅ All images: Every image reference from the scraped page is included
- ✅ HTML file saved: HTML file written to disk at the correct path
- ✅ Images folder copied: Images folder exists in the same directory as the HTML file
- ✅ Images accessible: Verify that at least one image file exists in the copied images folder
If any validation check fails, STOP and fix before proceeding.
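Two of these checks, truncation placeholders and image availability, are easy to script. The sketch below is an aid, not a replacement for reading the output: `find_placeholder_lines` and `missing_local_images` are hypothetical helpers, the placeholder patterns cover only the examples named in this document, and a literal "..." in genuine page copy will be flagged for manual review.

```python
import re
from pathlib import Path

# Matches the placeholder comments named in this skill plus bare ellipses.
PLACEHOLDER = re.compile(
    r"<!--\s*(?:rest of|more)\s+content\s*-->|\.\.\.|…", re.IGNORECASE
)
# Captures the src attribute of each <img> element.
IMG_SRC = re.compile(r"<img\b[^>]*?\bsrc=[\"']([^\"']+)[\"']", re.IGNORECASE)


def find_placeholder_lines(html_text: str) -> list:
    """Return 1-based line numbers that look like truncation placeholders."""
    return [
        number
        for number, line in enumerate(html_text.splitlines(), start=1)
        if PLACEHOLDER.search(line)
    ]


def missing_local_images(html_file_path: str) -> list:
    """Return ./images/... references whose files are absent next to the HTML file."""
    html_file = Path(html_file_path)
    html_text = html_file.read_text(encoding="utf-8")
    return [
        src
        for src in IMG_SRC.findall(html_text)
        if src.startswith("./images/") and not (html_file.parent / src).is_file()
    ]
```

Both functions return empty lists when the file passes, so a non-empty result is the signal to stop and fix before moving on to preview-import.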
Output
This skill provides:
- ✅ HTML file at correct path (e.g., `us/en/about.plain.html`)
- ✅ Images folder in same directory (e.g., `us/en/images/`)
- ✅ Complete content import (all sections)
- ✅ Proper block structure
- ✅ Section metadata applied per validation
- ✅ Page metadata block included
Next step: Pass HTML file path to preview-import skill