generate-import-html

Generate Import HTML

Create plain HTML file with block structure from authoring analysis.

When to Use This Skill

Use this skill when:

- You have a complete authoring analysis (all sequences have decisions)
- You have section styling validation (from authoring-analysis)
- You are ready to generate the HTML file for preview

Invoked by: page-import skill (Step 4)

Prerequisites

From previous skills, you need:

✅ Authoring analysis with block selections (from authoring-analysis)
✅ Section styling decisions (from authoring-analysis Step 3e)
✅ metadata.json with paths and metadata (from scrape-webpage)
✅ cleaned.html with content (from scrape-webpage)
✅ Block structures fetched (from authoring-analysis Step 3d)

Related Skills

⚠️ CRITICAL REQUIREMENT: Complete Content Import

YOU MUST IMPORT ALL CONTENT FROM THE PAGE. PARTIAL IMPORT IS UNACCEPTABLE.

❌ NEVER truncate or skip sections due to length concerns
❌ NEVER summarize or abbreviate content
❌ NEVER use placeholders like ""
❌ NEVER omit content because the page is "too long"
✅ ALWAYS import every section from authoring analysis
✅ ALWAYS include all text, images, and structure from cleaned.html
✅ If you encounter length issues, generate the FULL HTML anyway

Validation requirement: You MUST verify that the number of sections in your HTML matches the number of sections from identify-page-structure. If they don't match, you have made an error.
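
The section-count check can be automated. The following is a minimal sketch, assuming the generated file is well-formed and contains only top-level section `<div>` elements; `count_top_level_divs` is a hypothetical helper name, and the expected count has to come from your own identify-page-structure output.

```python
from html.parser import HTMLParser

# Void elements never get a closing tag, so they must not change the depth.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}


class TopLevelDivCounter(HTMLParser):
    """Counts <div> elements that open at nesting depth 0."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.count = 0

    def handle_starttag(self, tag, attrs):
        if tag in VOID:
            return
        if tag == "div" and self.depth == 0:
            self.count += 1
        self.depth += 1

    def handle_endtag(self, tag):
        if tag not in VOID and self.depth > 0:
            self.depth -= 1


def count_top_level_divs(html: str) -> int:
    """Hypothetical helper: number of top-level section divs in the HTML."""
    parser = TopLevelDivCounter()
    parser.feed(html)
    parser.close()
    return parser.count
```

If the page metadata block is appended as the last section div, the count will be one higher than the number of content sections, so compare accordingly.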

HTML Generation Workflow

Structure Requirements

IMPORTANT CHANGE: The AEM CLI now automatically wraps HTML content with headful structure (head, header, footer). You MUST generate ONLY the section content.

What to generate:

✅ Section divs with content: `<div>...</div>` (one per section)
✅ Blocks as `<div class="block-name">` with nested divs
✅ Default content (headings, paragraphs, links, images)
✅ Section metadata blocks where validated in authoring-analysis

What NOT to generate:

❌ NO `<html>`, `<head>`, or `<body>` tags
❌ NO `<header>` or `<footer>` elements
❌ NO `<main>` wrapper element
❌ NO head content (meta tags, title, etc.; this comes from the project's head.html)

Structure format:

```html
<div>
  <!-- Section 1 content -->
</div>
<div>
  <!-- Section 2 content with section-metadata if needed -->
  <div class="section-metadata">
    <div>
      <div>Style</div>
      <div>grey</div>
    </div>
  </div>
  <!-- Section 2 blocks/content -->
</div>
<div>
  <!-- Section 3 content -->
</div>
```

For detailed block structure patterns: See `../page-import/resources/html-structure.md`
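
One way to guard against the "What NOT to generate" list is to scan the generated HTML for the forbidden wrapper tags before saving. This is a minimal sketch; `find_forbidden_tags` is a hypothetical helper, and it checks only the six wrapper tags named above, not head content such as meta tags.

```python
from html.parser import HTMLParser

# Wrapper tags that the structure requirements say must not appear.
FORBIDDEN = {"html", "head", "body", "header", "footer", "main"}


class ForbiddenTagFinder(HTMLParser):
    """Records each forbidden tag once, in order of first appearance."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names before calling this method.
        if tag in FORBIDDEN and tag not in self.found:
            self.found.append(tag)


def find_forbidden_tags(html: str) -> list:
    """Hypothetical helper: list the forbidden tags present in the HTML."""
    finder = ForbiddenTagFinder()
    finder.feed(html)
    finder.close()
    return finder.found
```

An empty result means the file contains section content only, which is what the AEM CLI expects to wrap.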

Section Metadata Application

Apply validated decisions from authoring-analysis Step 3e:

WITH section-metadata (section provides container styling):

```html
<div>
  <div class="section-metadata">
    <div>
      <div>Style</div>
      <div>dark</div>
    </div>
  </div>
  <div class="tabs">
    <!-- Tabs block content -->
  </div>
</div>
```

WITHOUT section-metadata (background is block-specific):

```html
<div>
  <div class="hero">
    <!-- Hero block content with its own dark background -->
  </div>
</div>
```

Important:

- Only migrate visible body content sections (skip header, navigation, and footer, which are auto-generated)
- Use consistent style names from identify-page-structure
- Skip section-metadata for single-block sections where the background is block-specific, per the validated decisions from authoring-analysis Step 3e
- Place the `section-metadata` div at the start of each section that needs styling
- The metadata div will be processed and removed by the platform
- Each section is a separate top-level `<div>` element
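
The two patterns above can be produced by one small function. This is a sketch only, assuming a single `Style` value per section; `section_metadata` and `build_section` are hypothetical helper names, and whether a section gets a style at all is decided in authoring-analysis Step 3e, not here.

```python
from html import escape
from typing import Optional


def section_metadata(style: str) -> str:
    """Return a section-metadata block carrying a single Style value."""
    return (
        '  <div class="section-metadata">\n'
        "    <div>\n"
        "      <div>Style</div>\n"
        f"      <div>{escape(style)}</div>\n"
        "    </div>\n"
        "  </div>\n"
    )


def build_section(content: str, style: Optional[str] = None) -> str:
    """Wrap content in a top-level section div, metadata first if styled."""
    meta = section_metadata(style) if style else ""
    return f"<div>\n{meta}{content.rstrip()}\n</div>\n"
```

For example, `build_section('  <div class="tabs"></div>', "dark")` yields the WITH form, while omitting `style` yields the WITHOUT form used for blocks that carry their own background.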

Page Metadata Block

Unless user explicitly requested to skip metadata, use the metadata extracted from scrape-webpage to generate a metadata block.

Process:

1. Review extracted metadata from metadata.json

2. Map each property to standard format:

Title:

- Compare source `title` (or `og:title`) with first H1 on page
- If matches first H1 → Omit (platform defaults to H1)
- If differs → Include as `title` property

Description:

- Compare source `description` (or `og:description`) with first paragraph
- If matches first paragraph → Consider omitting (platform defaults to first paragraph)
- If differs OR more descriptive → Include as `description` property
- Check: 150-160 characters ideal

Image:

- Check source `og:image`
- If matches first content image → Consider omitting (platform defaults to first image)
- If custom social image → Include as `image` property
- Ensure absolute URL or correct relative path
- Check: 1200x630 pixels recommended

Canonical:

- If points to same page URL → Omit (platform auto-generates)
- If points to different page → Include as `canonical` property

Tags:

- Map `article:tag` or `keywords` → comma-separated `tags` property

Properties to SKIP (platform auto-populates):

- `og:url`, `og:title`, `og:description`
- `twitter:title`, `twitter:description`, `twitter:image`
- `viewport`, `charset`, `X-UA-Compatible` (belong in head.html)

3. Generate metadata block HTML:

```html
<div>
  <div class="metadata">
    <div>
      <div>title</div>
      <div>[Your mapped title]</div>
    </div>
    <div>
      <div>description</div>
      <div>[Your mapped description]</div>
    </div>
    <!-- Only include image if custom -->
    <!-- Only include canonical if differs from page URL -->
    <!-- Only include tags if present -->
  </div>
</div>
```

Append metadata block as the last section div at the end of the HTML file.

Detailed guidance: See `resources/metadata-extraction.md` and `resources/metadata-mapping.md`
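
The mapping rules above can be expressed as a short function. This is a sketch under stated assumptions: the flat `source` dictionary with keys such as `title` and `og:image` is a hypothetical shape, not the documented layout of metadata.json, and `map_page_metadata` and `render_metadata_block` are illustrative names. Where the rules say "consider omitting", the sketch always omits; treat that as a default to review, not a decision.

```python
from html import escape


def map_page_metadata(source, first_h1, first_paragraph, first_image, page_url):
    """Apply the omit/include rules and return the properties to emit."""
    block = {}

    # Title: emit only when it differs from the first H1.
    title = source.get("title") or source.get("og:title")
    if title and title.strip() != first_h1.strip():
        block["title"] = title.strip()

    # Description: emit only when it differs from the first paragraph.
    description = source.get("description") or source.get("og:description")
    if description and description.strip() != first_paragraph.strip():
        block["description"] = description.strip()

    # Image: emit only a custom social image.
    image = source.get("og:image")
    if image and image != first_image:
        block["image"] = image

    # Canonical: emit only when it points to a different page.
    canonical = source.get("canonical")
    if canonical and canonical != page_url:
        block["canonical"] = canonical

    # Tags: accept a list or a comma-separated string.
    tags = source.get("article:tag") or source.get("keywords")
    if tags:
        if isinstance(tags, str):
            tags = [t.strip() for t in tags.split(",")]
        block["tags"] = ", ".join(t for t in tags if t)
    return block


def render_metadata_block(block):
    """Render the properties as the last section div of the page."""
    rows = "".join(
        f"    <div>\n      <div>{escape(k)}</div>\n"
        f"      <div>{escape(v)}</div>\n    </div>\n"
        for k, v in block.items()
    )
    return f'<div>\n  <div class="metadata">\n{rows}  </div>\n</div>\n'
```

Properties in the skip list are never read from `source` as output keys, so they cannot leak into the block.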

Images Folder Management (CRITICAL)

The images are currently in `./import-work/images/` and the HTML references them as `./images/...`. You MUST handle the images folder correctly:

Step 1: Determine the correct images folder location

Based on `paths.htmlFilePath` from metadata.json:

- HTML file: `us/en/about.plain.html` → Images should be at: `us/en/images/`
- HTML file: `products/widget.plain.html` → Images should be at: `products/images/`
- HTML file: `index.plain.html` → Images should be at: `images/`

Rule: Images folder goes in the same directory as the HTML file.
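
The rule can be sketched as a one-line path computation. `images_dir_for` is a hypothetical helper name; it assumes `paths.htmlFilePath` is a relative path with forward slashes, as in the examples above.

```python
import posixpath


def images_dir_for(html_file_path: str) -> str:
    """Return the images folder that sits beside the given HTML file."""
    parent = posixpath.dirname(html_file_path)
    return posixpath.join(parent, "images") + "/"
```

Calling it with the three example paths returns `us/en/images/`, `products/images/`, and `images/` respectively.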

Step 2: Copy the images folder

```bash
# Example: If HTML is at us/en/about.plain.html
mkdir -p us/en/images
cp -r ./import-work/images/* us/en/images/
```

Step 3: Verify image paths in HTML are correct

The HTML should already reference images as `./images/...`, which is correct for files in the same directory. No path changes are needed in the HTML.

Example:

```
HTML location: us/en/about.plain.html
Images location: us/en/images/
Image reference in HTML: <img src="./images/abc123.jpg">
Result: ✅ Correct - browser resolves to us/en/images/abc123.jpg
```
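
To confirm that each `./images/...` reference resolves to a copied file, something like the following could be run after the copy step. It is a sketch, not part of the skill: `missing_images` is a hypothetical helper, and it checks only `src` attributes on `<img>` tags that start with `./images/`.

```python
from html.parser import HTMLParser
from pathlib import Path


class ImgSrcCollector(HTMLParser):
    """Collects the src attribute of every <img> tag."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def missing_images(html_path):
    """Return ./images/ references that have no file next to the HTML."""
    html_path = Path(html_path)
    collector = ImgSrcCollector()
    collector.feed(html_path.read_text(encoding="utf-8"))
    collector.close()
    return [
        src for src in collector.sources
        if src.startswith("./images/")
        and not (html_path.parent / src).is_file()
    ]
```

An empty list means every local image reference in the file has a matching file in the sibling images folder.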

---

Save HTML File

Save to: Use `paths.htmlFilePath` from metadata.json (e.g., `us/en/about.plain.html`)

Read the metadata.json file from scrape-webpage to get the correct file path.
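
Reading the path and writing the file might look like the sketch below. It assumes metadata.json holds a top-level `paths` object with an `htmlFilePath` key, which is inferred from the dotted name above, not from a documented schema; `html_output_path`, `save_html`, and the `project_root` parameter are hypothetical.

```python
import json
from pathlib import Path


def html_output_path(metadata_file):
    """Read paths.htmlFilePath from metadata.json (assumed layout)."""
    metadata = json.loads(Path(metadata_file).read_text(encoding="utf-8"))
    return Path(metadata["paths"]["htmlFilePath"])


def save_html(metadata_file, html, project_root="."):
    """Write the generated HTML to the path named in metadata.json."""
    target = Path(project_root) / html_output_path(metadata_file)
    # Create intermediate folders such as us/en/ if they do not exist yet.
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(html, encoding="utf-8")
    return target
```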

Validation Checklist (MANDATORY)

Before proceeding to preview-import skill, verify:

✅ Section count: HTML has the same number of top-level `<div>` sections as identified in identify-page-structure
✅ All sequences: Every content sequence from authoring-analysis appears in the HTML
✅ No truncation: No "..." or "" or similar placeholders
✅ Complete text: All headings, paragraphs, and text from cleaned.html are present
✅ All images: Every image reference from the scraped page is included
✅ HTML file saved: HTML file written to disk at the correct path
✅ Images folder copied: Images folder exists in the same directory as the HTML file
✅ Images accessible: Verify that at least one image file exists in the copied images folder

If any validation check fails, STOP and fix before proceeding.
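
The three file-system items on the checklist (file saved, folder copied, images accessible) lend themselves to a quick scripted check. This is a minimal sketch with a hypothetical helper name, `check_output`; the content-completeness items still need to be compared against cleaned.html and the authoring analysis.

```python
from pathlib import Path


def check_output(html_file_path):
    """Report the file-system checks from the validation checklist."""
    html = Path(html_file_path)
    images = html.parent / "images"
    return {
        # HTML file written to disk and not empty.
        "html_saved": html.is_file() and html.stat().st_size > 0,
        # Images folder exists beside the HTML file.
        "images_copied": images.is_dir(),
        # At least one file exists in the copied images folder.
        "images_accessible": images.is_dir()
        and any(p.is_file() for p in images.iterdir()),
    }
```

Any `False` value in the result corresponds to a failed checklist item and should stop the workflow before preview-import.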

Output

This skill provides:

✅ HTML file at correct path (e.g., `us/en/about.plain.html`)
✅ Images folder in same directory (e.g., `us/en/images/`)
✅ Complete content import (all sections)
✅ Proper block structure
✅ Section metadata applied per validation
✅ Page metadata block included

Next step: Pass HTML file path to preview-import skill
