identify-page-structure

Identify Page Structure

Analyze webpage structure using two-level hierarchy: sections, then content sequences within each section.

When to Use This Skill

Use this skill when:
  • You have scraped webpage output (screenshot, HTML, metadata)
  • You need to identify section boundaries and content sequences
  • You want to understand the page structure before making authoring decisions
Invoked by: page-import skill (Step 2)

Prerequisites

From the scrape-webpage skill, you need:
  • ✅ screenshot.png showing full page
  • ✅ cleaned.html with page content
  • ✅ metadata.json with paths

Related Skills

  • page-import - Orchestrator that invokes this skill
  • scrape-webpage - Provides input (screenshot, HTML)
  • page-decomposition - This skill invokes it for EACH section
  • block-inventory - This skill invokes it to survey available blocks
  • authoring-analysis - Uses this skill's output to make authoring decisions

Key Concepts

CRITICAL: Content follows a strict two-level hierarchy:
DOCUMENT
├── SECTION (top-level container with optional metadata)
│   ├── Content Sequence 1 (default content OR block)
│   ├── Content Sequence 2 (default content OR block)
│   └── ...
├── SECTION
│   └── Content Sequence 1
└── ...
This skill analyzes BOTH levels:
  • Level 1: Section boundaries (Step 2a)
  • Level 2: Content sequences within EACH section (Step 2b per section)
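The hierarchy above can be modeled with three small record types. This is only a sketch: the skill does not define a schema, so every class and field name below (`Document`, `Section`, `ContentSequence`, `kind`, `metadata`) is a hypothetical choice made for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class ContentSequence:
    # Neutral description of what is visible; no block names at this stage.
    description: str
    # Hypothetical field: "default" or "block" is decided later, so it starts undecided.
    kind: str = "undecided"

@dataclass
class Section:
    number: int                     # sequential: 1, 2, 3...
    style: str                      # e.g. "light", "dark", "grey", "accent"
    sequences: list = field(default_factory=list)
    metadata: dict = field(default_factory=dict)   # optional section metadata

@dataclass
class Document:
    sections: list = field(default_factory=list)

# Level 1 is the list of sections; Level 2 is the list inside each section.
doc = Document(sections=[
    Section(1, "light", [
        ContentSequence("Large centered heading, paragraph, two buttons"),
        ContentSequence("Two images displayed side-by-side"),
    ]),
    Section(2, "dark", [
        ContentSequence("Tab navigation with three switchable content panels"),
    ]),
])

for section in doc.sections:
    print(f"Section {section.number} ({section.style}): {len(section.sequences)} sequence(s)")
```

Keeping `kind` undecided reflects that this skill only describes structure; the default-content-versus-block choice belongs to a later skill.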

Structure Identification Workflow

Step 2a: Identify Section Boundaries (Level 1)

Examine the screenshot to find visual/thematic breaks that indicate new sections.
Visual cues for section boundaries:
  • Background color changes (white → grey → dark → white)
  • Spacing/padding changes (tight → wide → normal)
  • Clear horizontal breaks or dividers
  • Thematic content shifts
What to exclude:
  • Header/navigation (auto-populated)
  • Footer (auto-populated)
  • Cookie banners, popups
For each section, note:
  • Section number (sequential: 1, 2, 3...)
  • Visual style (light, dark, grey, accent)
  • Brief overview of what's in it
Example output:
Section 1: light background, hero content
Section 2: light background, grid of features
Section 3: grey background, article cards
Section 4: dark background, tabs
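The bookkeeping in Step 2a (drop excluded regions, number what remains, record style and overview) might look like the following sketch. It assumes the visual inspection has already produced a top-to-bottom list of regions; the `Region` type, its `role` labels, and the `EXCLUDED_ROLES` set are hypothetical and not part of the skill.

```python
from dataclasses import dataclass

# Hypothetical role labels for regions that are excluded from the section list.
EXCLUDED_ROLES = {"header", "navigation", "footer", "cookie-banner", "popup"}

@dataclass
class Region:
    role: str       # assumed label assigned during visual inspection
    style: str      # "light", "dark", "grey", "accent"
    overview: str   # brief overview of what is in the region

def number_sections(regions):
    """Drop excluded regions and number the rest sequentially from 1."""
    kept = [r for r in regions if r.role not in EXCLUDED_ROLES]
    return [
        f"Section {i}: {r.style} background, {r.overview}"
        for i, r in enumerate(kept, start=1)
    ]

regions = [
    Region("header", "light", "site navigation"),
    Region("content", "light", "hero content"),
    Region("content", "light", "grid of features"),
    Region("content", "grey", "article cards"),
    Region("footer", "dark", "site links"),
]
section_lines = number_sections(regions)
print("\n".join(section_lines))
```

Numbering happens after filtering, so an excluded header never consumes a section number.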

Step 2b: Analyze Content Sequences Within Each Section (Level 2)

For EACH section identified in Step 2a, analyze its internal content sequences.
What is a "content sequence"? A vertical flow of related content that will become EITHER:
  • Default content (headings, paragraphs, lists, inline images)
  • A block (structured, repeating, or interactive component)
Breaking points between sequences:
  • Change from default content → block
  • Change from block → different block
  • Change from block → default content
INVOKE page-decomposition skill FOR EACH SECTION to get neutral descriptions.
For each section, get:
  • Sequence 1: [Neutral description - NO block names yet]
  • Sequence 2: [Neutral description]
  • ...
Example output:
Section 1 (light):
  - Sequence 1: Large centered heading, paragraph, two buttons
  - Sequence 2: Two images displayed side-by-side

Section 2 (light):
  - Sequence 1: Centered heading
  - Sequence 2: Grid of 8 items, each with icon and short text
  - Sequence 3: Two centered buttons

Section 3 (grey):
  - Sequence 1: Eyebrow text, heading, paragraph, button
  - Sequence 2: Four items in grid, each with image, category tag, heading, description

Section 4 (dark):
  - Sequence 1: Tab navigation with three switchable content panels
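The breaking-point rule for Step 2b can be sketched as grouping consecutive elements that share the same structural pattern. This is an illustration, not the page-decomposition skill itself: the pattern labels (`"flow"`, `"grid"`) are hypothetical neutral tags chosen here, deliberately not block names, and the input list is assumed to be in top-to-bottom order.

```python
from itertools import groupby

def split_into_sequences(elements):
    """Group consecutive (pattern, description) pairs that share a pattern.

    A new sequence starts whenever the pattern changes, which covers
    default -> structured, structured -> different structure, and
    structured -> default transitions.
    """
    sequences = []
    for pattern, group in groupby(elements, key=lambda element: element[0]):
        sequences.append((pattern, [description for _, description in group]))
    return sequences

# Elements of one section, roughly matching "Section 2" in the example output.
section_2 = [
    ("flow", "centered heading"),
    ("grid", "item with icon and short text"),
    ("grid", "item with icon and short text"),
    ("flow", "button"),
    ("flow", "button"),
]

sequences = split_into_sequences(section_2)
for index, (pattern, items) in enumerate(sequences, start=1):
    print(f"Sequence {index}: {pattern}, {len(items)} element(s)")
```

For this input the grouping yields three sequences (heading, grid, buttons), the same count as Section 2 in the example output.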

Step 2.5: Survey Available Blocks

STOP: Before making any authoring decisions, understand what blocks are available.
INVOKE block-inventory skill to catalog available blocks.
Why this matters: Real authors see a block library and choose from available options. You need the same context to make authentic authoring decisions following David's Model.
What this provides:
  • Local blocks already in the project
  • Common Block Collection blocks that can be added
  • Purpose/description for each block
  • Live example URLs
Example output:
Available Blocks:

LOCAL BLOCKS:
- custom-banner: Special promotional banner
- testimonial-slider: Customer testimonials carousel

BLOCK COLLECTION AVAILABLE:
- hero: Large heading, text, buttons for page intro
- cards: Grid of items with images/text
- columns: Side-by-side content layout
- accordion: Expandable Q&A sections
- tabs: Switchable content panels
- carousel: Rotating image/content displays
- quote: Highlighted testimonials
- fragment: Reusable content sections
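If the inventory arrives as plain text shaped like the example above, it can be turned into a lookup table with a few lines of parsing. This sketch assumes exactly that layout (group headings ending in a colon, items written as `- name: purpose`); the real output of block-inventory may differ, and `parse_block_inventory` is a hypothetical helper name.

```python
def parse_block_inventory(text):
    """Parse '- name: purpose' items under 'GROUP:' headings into nested dicts."""
    palette = {}
    group = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line == "Available Blocks:":
            continue  # skip blank lines and the overall title
        if line.startswith("- "):
            if group is not None:
                name, _, purpose = line[2:].partition(": ")
                palette[group][name] = purpose
        elif line.endswith(":"):
            group = line[:-1]          # e.g. "LOCAL BLOCKS"
            palette[group] = {}
    return palette

inventory_text = """Available Blocks:

LOCAL BLOCKS:
- custom-banner: Special promotional banner
- testimonial-slider: Customer testimonials carousel

BLOCK COLLECTION AVAILABLE:
- hero: Large heading, text, buttons for page intro
- tabs: Switchable content panels
"""

palette = parse_block_inventory(inventory_text)
print(sorted(palette["LOCAL BLOCKS"]))
```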

Output Format

This skill provides complete page structure:
1. Section boundaries with styling:
Section 1: light background
Section 2: light background
Section 3: grey background (#f5f5f5)
Section 4: dark background (#1a1a1a)
2. Content sequences per section (neutral descriptions):
Section 1 (light):
  - Sequence 1: Large centered heading, paragraph, two call-to-action buttons
  - Sequence 2: Two images displayed side-by-side

Section 2 (light):
  - Sequence 1: Single centered heading
  - Sequence 2: Grid of 8 items, each with icon and short text
  - Sequence 3: Two centered buttons

[Continue for all sections...]
3. Block palette:
LOCAL BLOCKS: [list]
BLOCK COLLECTION AVAILABLE: [list with purposes]
Next step: Pass these outputs to the authoring-analysis skill
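As a rough illustration, the three outputs can be assembled into one text report that mirrors the layout above. The function name `render_page_structure` and its input shapes (a list of `(style, descriptions)` pairs plus two lists of block names) are assumptions made for this sketch, not a format the skill prescribes.

```python
def render_page_structure(sections, local_blocks, collection_blocks):
    """Render section styles, per-section sequences, and the block palette as text."""
    lines = ["1. Section boundaries with styling:"]
    for number, (style, _) in enumerate(sections, start=1):
        lines.append(f"Section {number}: {style} background")

    lines.append("2. Content sequences per section (neutral descriptions):")
    for number, (style, descriptions) in enumerate(sections, start=1):
        lines.append(f"Section {number} ({style}):")
        for index, description in enumerate(descriptions, start=1):
            lines.append(f"  - Sequence {index}: {description}")

    lines.append("3. Block palette:")
    lines.append("LOCAL BLOCKS: " + ", ".join(local_blocks))
    lines.append("BLOCK COLLECTION AVAILABLE: " + ", ".join(collection_blocks))
    return "\n".join(lines)

report = render_page_structure(
    sections=[
        ("light", ["Large centered heading, paragraph, two call-to-action buttons",
                   "Two images displayed side-by-side"]),
        ("dark", ["Tab navigation with three switchable content panels"]),
    ],
    local_blocks=["custom-banner"],
    collection_blocks=["hero", "cards", "tabs"],
)
print(report)
```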

Key Principles

Two-level analysis is mandatory:
  • You MUST identify sections first (2a)
  • Then analyze each section's content sequences (2b)
  • Don't skip levels or combine them
Stay neutral at this stage:
  • Describe WHAT you see, not WHAT it should be
  • "Grid of items with images" not "Cards block"
  • Authoring decisions come in the next skill (authoring-analysis)
Block inventory before decisions:
  • Survey blocks BEFORE making any authoring choices
  • Authors see a library and choose from it; you need the same context
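One way to spot-check the "stay neutral" principle is to scan sequence descriptions for words that match known block names. This is a crude heuristic offered as a sketch: `find_block_names` is a hypothetical helper, it only catches exact whole-word matches, and a match is a prompt to re-read the description rather than proof of a problem.

```python
import re

def find_block_names(description, block_names):
    """Return the block names that appear as whole words in a description."""
    words = set(re.findall(r"[a-z][a-z-]*", description.lower()))
    return sorted(name for name in block_names if name in words)

block_names = ["hero", "cards", "columns", "accordion", "tabs", "carousel", "quote"]

neutral = "Grid of items with images"
leaky = "Cards block"

print(find_block_names(neutral, block_names))
print(find_block_names(leaky, block_names))
```

Here the neutral phrasing produces no matches, while "Cards block" is flagged because it names the `cards` block directly.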