seo-cluster

Semantic Topic Clustering (v1.9.0)

SERP-overlap-driven keyword clustering for content architecture. Groups keywords by how Google actually ranks them (shared top-10 results), not by text similarity. Designs hub-and-spoke content clusters with internal link matrices and generates interactive cluster map visualizations.

Scripts: Located at the plugin root `scripts/` directory.

Quick Reference

Command	What it does
`/seo cluster plan <seed-keyword>`	Full planning workflow: expand, cluster, architect, visualize
`/seo cluster plan --from strategy`	Import from existing `/seo plan` output
`/seo cluster execute`	Execute plan: create content via claude-blog or output briefs
`/seo cluster map`	Regenerate the interactive cluster visualization

Planning Workflow

Step 1: Seed Keyword Expansion

Expand the seed keyword into 30-50 variants using WebSearch:

Related searches — Search the seed, extract "related searches" and "people also search for"
People Also Ask (PAA) — Extract all PAA questions from SERP results
Long-tail modifiers — Append common modifiers: "best", "how to", "vs", "for beginners", "tools", "examples", "guide", "template", "mistakes", "checklist"
Question mining — Generate who/what/when/where/why/how variants
Intent modifiers — Add commercial modifiers: "pricing", "review", "alternative", "comparison", "free", "top"

Deduplication: Normalize variants (lowercase, strip articles), remove exact duplicates. Target: 30-50 unique keyword variants. If under 30, run a second expansion pass with the top PAA questions as seeds.
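The deduplication rule above can be sketched in a few lines. This is a minimal illustration, not the plugin's implementation: the function names and the exact article list (`a`, `an`, `the`) are assumptions.

```python
import re

# Assumption: "strip articles" means the English articles a/an/the.
ARTICLES = {"a", "an", "the"}


def normalize(keyword: str) -> str:
    """Lowercase, collapse whitespace, and drop articles."""
    words = re.findall(r"\S+", keyword.lower())
    return " ".join(word for word in words if word not in ARTICLES)


def dedupe(variants: list[str]) -> list[str]:
    """Return normalized variants with exact duplicates removed, order preserved."""
    seen: set[str] = set()
    unique: list[str] = []
    for variant in variants:
        key = normalize(variant)
        if key and key not in seen:
            seen.add(key)
            unique.append(key)
    return unique


def needs_second_pass(unique: list[str], minimum: int = 30) -> bool:
    """True when the expansion fell short of the 30-variant floor."""
    return len(unique) < minimum
```

For example, `dedupe(["The Best CRM", "best crm"])` keeps a single entry, `"best crm"`.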

Step 2: SERP Overlap Clustering

This is the core differentiator. Load `references/serp-overlap-methodology.md` for the full algorithm.

Process:

Group keywords by initial intent guess (reduces pairwise comparisons)
For each candidate pair within a group, WebSearch both keywords
Count shared URLs in the top 10 organic results (ignore ads, featured snippets, PAA)
Apply thresholds:

Shared Results	Relationship	Action
7-10	Same post	Merge into single target page
4-6	Same cluster	Group under same spoke cluster
2-3	Interlink	Place in adjacent clusters, add cross-links
0-1	Separate	Assign to different clusters or exclude
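The threshold table maps directly onto a small function. The sketch below assumes each SERP is already a list of organic result URLs (ads, featured snippets, and PAA filtered out). The `canonical` helper is a hypothetical normalization step that is not described in the source; the full algorithm lives in the referenced methodology file.

```python
from urllib.parse import urlsplit


def canonical(url: str) -> str:
    """Hypothetical normalization: ignore scheme, 'www.', and trailing slash."""
    parts = urlsplit(url.strip())
    host = parts.netloc.lower().removeprefix("www.")
    return host + parts.path.rstrip("/")


def shared_results(serp_a: list[str], serp_b: list[str]) -> int:
    """Count URLs that appear in the top 10 organic results of both SERPs."""
    top_a = {canonical(url) for url in serp_a[:10]}
    top_b = {canonical(url) for url in serp_b[:10]}
    return len(top_a & top_b)


def relationship(shared: int) -> str:
    """Apply the thresholds from the table above."""
    if shared >= 7:
        return "same post"
    if shared >= 4:
        return "same cluster"
    if shared >= 2:
        return "interlink"
    return "separate"
```

Two keywords whose top-10 lists share five URLs would come back as `"same cluster"`.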

Optimization: With 40 keywords, full pairwise = 780 comparisons. Instead:

Pre-group by intent (4 groups of ~10 = 4 x 45 = 180 comparisons)
Only cross-check group boundary keywords
Skip pairs where both are long-tail variants of the same head term (assume same cluster)
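The comparison counts above follow from the pairwise formula n(n-1)/2. A short check, using the same 40-keyword example and four equal groups:

```python
from math import comb


def pairwise(n: int) -> int:
    """Number of unordered keyword pairs among n keywords."""
    return comb(n, 2)


full = pairwise(40)                                    # all 40 keywords compared
grouped = sum(pairwise(size) for size in (10, 10, 10, 10))  # within-group pairs only
saved = full - grouped

print(full, grouped, saved)  # 780 180 600
```

Cross-checks for group boundary keywords add to the 180 figure, so the real total sits somewhere between the two numbers.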

DataForSEO integration: If DataForSEO MCP is available, use `serp_organic_live_advanced` instead of WebSearch for SERP data. Run `python scripts/dataforseo_costs.py check serp_organic_live_advanced --count N` before each batch. If `"status": "needs_approval"`, show cost estimate and ask user. If `"status": "blocked"`, fall back to WebSearch.
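The decision logic around the cost check can be sketched as below. It assumes the script prints a JSON object containing a `status` field and that any status other than the two named above means the batch may proceed; neither assumption is confirmed here, and the action names are made up for illustration.

```python
import json


def next_action(cost_check_output: str) -> str:
    """Map the cost-check JSON (assumed format) to the next step."""
    status = json.loads(cost_check_output).get("status")
    if status == "needs_approval":
        return "ask_user"          # show the cost estimate and wait for approval
    if status == "blocked":
        return "use_websearch"     # fall back to WebSearch
    return "use_dataforseo"        # assumption: any other status allows the call
```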

Step 3: Intent Classification

Classify each keyword into one of four intent categories:

Intent	Signals	Include in Clusters?
Informational	how, what, why, guide, tutorial, learn	Yes
Commercial	best, top, review, comparison, vs, alternative	Yes
Transactional	buy, price, discount, coupon, order, sign up	Yes
Navigational	brand names, specific product names, login	No (exclude)

Remove navigational keywords from clustering. Flag borderline cases for manual review. Keywords can have mixed intent (e.g., "best CRM software" is both commercial and informational) -- classify by dominant intent.
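A signal-matching classifier along these lines is one way to get a first-pass intent guess. It is a rough sketch: the scoring, the tie-break order (transactional, then commercial, then informational), the `brands` parameter, and the `"unclassified"` label for manual review are all assumptions rather than the plugin's behavior.

```python
import re

# Signal words from the table above. Dict order doubles as the tie-break order.
SIGNALS = {
    "transactional": ["buy", "price", "discount", "coupon", "order", "sign up"],
    "commercial": ["best", "top", "review", "comparison", "vs", "alternative"],
    "informational": ["how", "what", "why", "guide", "tutorial", "learn"],
}


def classify(keyword: str, brands: frozenset = frozenset()) -> str:
    """Return the dominant intent, or 'unclassified' to flag for manual review."""
    text = keyword.lower()
    tokens = set(re.findall(r"[a-z0-9]+", text))
    # Navigational: a login query or a known brand name (brands is caller-supplied).
    if "login" in tokens or tokens & set(brands):
        return "navigational"
    scores = {
        intent: sum(1 for s in signals if re.search(rf"\b{re.escape(s)}\b", text))
        for intent, signals in SIGNALS.items()
    }
    best = max(scores, key=scores.get)  # first intent wins on ties
    return best if scores[best] > 0 else "unclassified"
```

Keywords returned as `"navigational"` would be dropped before clustering; `"unclassified"` ones are candidates for manual review.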

Step 4: Hub-and-Spoke Architecture

Load `references/hub-spoke-architecture.md` for full specifications.

Design the cluster structure:

Select the pillar keyword — Highest volume, broadest intent, most SERP overlap with other keywords
Group spokes into clusters — Each cluster is a subtopic area (2-5 clusters per pillar)
Assign posts to clusters — Each cluster gets 2-4 spoke posts
Select templates per post — Based on intent classification:

Intent Pattern	Template Options
Informational (broad)	ultimate-guide
Informational (how)	how-to
Informational (list)	listicle
Informational (concept)	explainer
Commercial (compare)	comparison
Commercial (evaluate)	review
Commercial (rank)	best-of
Transactional	landing-page

Set word count targets:
- Pillar page: 2500-4000 words
- Spoke posts: 1200-1800 words
Cannibalization check — No two posts share the same primary keyword. If SERP overlap is 7+, merge those keywords into a single post targeting both.

Step 5: Internal Link Matrix

Design the bidirectional linking structure:

Link Type	Direction	Requirement
Spoke to pillar	spoke -> pillar	Mandatory (every spoke)
Pillar to spoke	pillar -> spoke	Mandatory (every spoke)
Spoke to spoke (within cluster)	spoke <-> spoke	2-3 links per post
Cross-cluster	spoke -> spoke (other cluster)	0-1 links per post

Rules:

Every post must have minimum 3 incoming internal links
No orphan pages (every post reachable from pillar in 2 clicks)
Anchor text must use target keyword or close variant (no "click here")
Link placement: within body content, not just navigation/sidebar

Generate the link matrix as a JSON adjacency list:

json

{
  "links": [
    { "from": "pillar", "to": "cluster-0-post-0", "type": "mandatory", "anchor": "keyword" },
    { "from": "cluster-0-post-0", "to": "pillar", "type": "mandatory", "anchor": "keyword" }
  ]
}
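Two of the rules above (at least 3 incoming links, every post within 2 clicks of the pillar) can be checked mechanically against the adjacency list. The following sketch assumes the `links` array uses the `from`/`to` fields shown and that the pillar's identifier is `"pillar"`; the function names are illustrative.

```python
from collections import defaultdict, deque


def incoming_counts(links: list[dict]) -> dict:
    """Count incoming internal links per page."""
    counts = defaultdict(int)
    for link in links:
        counts[link["to"]] += 1
    return counts


def clicks_from_pillar(links: list[dict], pillar: str = "pillar") -> dict:
    """Breadth-first search: minimum number of clicks from the pillar to each page."""
    graph = defaultdict(list)
    for link in links:
        graph[link["from"]].append(link["to"])
    depth = {pillar: 0}
    queue = deque([pillar])
    while queue:
        page = queue.popleft()
        for target in graph[page]:
            if target not in depth:
                depth[target] = depth[page] + 1
                queue.append(target)
    return depth


def violations(posts: list[str], links: list[dict],
               min_incoming: int = 3, max_clicks: int = 2) -> dict:
    """Return {post: [issues]} for posts that break either rule."""
    counts = incoming_counts(links)
    depth = clicks_from_pillar(links)
    problems = {}
    for post in posts:
        issues = []
        if counts[post] < min_incoming:
            issues.append(f"fewer than {min_incoming} incoming links")
        if depth.get(post, max_clicks + 1) > max_clicks:
            issues.append(f"not reachable from pillar in {max_clicks} clicks")
        if issues:
            problems[post] = issues
    return problems
```

An empty result means the matrix satisfies both rules; anchor text and link placement still need a separate check.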

Step 6: Interactive Cluster Map

Generate `cluster-map.html` using the template at `templates/cluster-map.html`:

Read the template file
Build the `CLUSTER_DATA` JSON object from the cluster plan:

javascript

{
  pillar: { title, keyword, volume, template, wordCount, url },
  clusters: [{ name, color, posts: [{ title, keyword, volume, template, wordCount, url, status }] }],
  links: [{ from, to, type }],
  meta: { totalPosts, totalClusters, totalLinks, estimatedWords }
}

Replace the `CLUSTER_DATA` placeholder in the template with the actual JSON
Write the completed HTML file to the output directory
Inform user: "Open `cluster-map.html` in a browser to explore the interactive cluster map."
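The build-and-replace steps can be sketched as follows. The template here is passed in as a string so the example stays self-contained; how the real template embeds the placeholder is not shown in this document. Whether `totalPosts` counts the pillar is also an assumption.

```python
import json


def with_meta(plan: dict) -> dict:
    """Derive the meta block from the pillar, clusters, and links."""
    posts = [post for cluster in plan["clusters"] for post in cluster["posts"]]
    meta = {
        "totalPosts": len(posts) + 1,  # assumption: spokes plus the pillar
        "totalClusters": len(plan["clusters"]),
        "totalLinks": len(plan["links"]),
        "estimatedWords": plan["pillar"]["wordCount"]
        + sum(post["wordCount"] for post in posts),
    }
    return {**plan, "meta": meta}


def render(template_html: str, plan: dict, placeholder: str = "CLUSTER_DATA") -> str:
    """Replace the first placeholder occurrence with the serialized plan."""
    if placeholder not in template_html:
        raise ValueError(f"placeholder {placeholder!r} not found in template")
    # Escape "</" so a title containing "</script>" cannot end the script block.
    payload = json.dumps(with_meta(plan)).replace("</", "<\\/")
    return template_html.replace(placeholder, payload, 1)
```

The result would then be written to `cluster-map.html` in the output directory.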

Strategy Import

When invoked with `--from strategy`:

Look for the most recent `/seo plan` output in the current directory (search for files matching `*SEO*Plan*`, `*strategy*`, `*content-strategy*`)
Parse markdown tables for: keywords, page types, content pillars, URL structures
Validate extracted data: check for duplicates, missing keywords, incomplete entries
Enrich with SERP data: run SERP overlap analysis on extracted keywords
Build cluster plan using the imported keywords as the starting set (skip Step 1)

If no strategy file is found, prompt the user: "No existing SEO plan found in the current directory. Run `/seo plan` first, or provide a seed keyword for fresh clustering."
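Locating the strategy file might look like the sketch below. It assumes "most recent" means the latest modification time, which the text above does not specify; the function name is illustrative.

```python
from pathlib import Path
from typing import Optional

# Glob patterns listed in the import steps above.
PATTERNS = ("*SEO*Plan*", "*strategy*", "*content-strategy*")


def find_strategy_file(directory: str = ".") -> Optional[Path]:
    """Return the newest matching file, or None if nothing matches."""
    candidates = {
        path
        for pattern in PATTERNS
        for path in Path(directory).glob(pattern)
        if path.is_file()
    }
    if not candidates:
        return None  # caller should show the "No existing SEO plan found" prompt
    # Assumption: recency is judged by modification time.
    return max(candidates, key=lambda path: path.stat().st_mtime)
```

Note that glob matching is case-sensitive on most Linux filesystems, so `seo-plan.md` would not match `*SEO*Plan*` there.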

Execution Workflow

When `/seo cluster execute` is invoked:

Check for claude-blog

Test: Does ~/.claude/skills/blog/SKILL.md exist?

If claude-blog IS installed:

Load `references/execution-workflow.md` for the full algorithm
Read `cluster-plan.json` from the current directory
Check for resume state: scan output directory for already-written posts
Execute in priority order: pillar first, then spokes by volume (highest first)
For each post, invoke the `blog-write` skill with cluster context:
- Cluster role (pillar or spoke)
- Position in cluster (cluster index, post index)
- Target keyword and secondary keywords
- Template type and word count target
- Internal links to include (with anchors)
- Links to receive from future posts (placeholder markers)
After each post is written, scan previous posts for backward link placeholders and inject the new post's URL
After all posts are written, generate the cluster scorecard

If claude-blog is NOT installed:

Generate detailed content briefs for each post in the cluster plan
Each brief includes:
- Title and meta description
- Primary keyword and secondary keywords
- Template type and suggested structure (H2/H3 outline)
- Word count target
- Internal links to include (with anchor text)
- Key points to cover
- Competing pages to differentiate from
Write briefs to `cluster-briefs/` directory as individual markdown files
Inform user: "Install claude-blog to auto-create content. Briefs saved to `cluster-briefs/`."
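The install check and the priority ordering can be sketched together. The plan structure is assumed to mirror the `CLUSTER_DATA` shape (`pillar`, `clusters[].posts[]` with `keyword` and `volume`). Identifying already-written posts by keyword is an illustrative choice, not the documented resume mechanism.

```python
from pathlib import Path
from typing import Optional


def claude_blog_installed(home: Optional[Path] = None) -> bool:
    """Test whether ~/.claude/skills/blog/SKILL.md exists."""
    root = home or Path.home()
    return (root / ".claude" / "skills" / "blog" / "SKILL.md").exists()


def execution_order(plan: dict, written: frozenset = frozenset()) -> list[dict]:
    """Pillar first, then spokes by volume (highest first), skipping written posts."""
    spokes = [post for cluster in plan["clusters"] for post in cluster["posts"]]
    spokes.sort(key=lambda post: post["volume"], reverse=True)
    queue = [plan["pillar"], *spokes]
    # Assumption: resume state is a set of keywords whose posts already exist.
    return [post for post in queue if post["keyword"] not in written]
```

When `claude_blog_installed()` is false, the same ordering could drive brief generation instead of `blog-write` calls.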

Cluster Scorecard

Post-execution quality report. Run automatically after `/seo cluster execute` or on demand via analysis of the output directory.

Metric	Target	How Measured
Coverage	100%	Posts written / posts planned
Link Density	3+ per post	Count internal links per post
Orphan Pages	0	Posts with < 1 incoming link
Cannibalization	0 conflicts	Check for duplicate primary keywords
Image Count	1+ per post	Posts with at least one image
Pillar Links	100%	All spokes link to pillar and vice versa
Cross-Links	80%+	Recommended spoke-to-spoke links implemented
Content Gaps	0	Planned posts that were skipped or incomplete
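Several of these metrics reduce to simple counts. The sketch below computes four of them (coverage, orphan pages, cannibalization, content gaps) from hypothetical inputs: a list of planned post IDs, the set of written IDs, the link list, and a map of post ID to primary keyword. The input shapes and result keys are assumptions.

```python
from collections import Counter


def scorecard(planned: list[str], written: set[str],
              links: list[dict], primary_keyword: dict) -> dict:
    """Compute a subset of the scorecard metrics from the table above."""
    done = [post for post in planned if post in written]
    incoming = Counter(link["to"] for link in links)
    keyword_uses = Counter(primary_keyword[post].lower() for post in planned)
    return {
        # Coverage: posts written / posts planned, as a percentage.
        "coverage_pct": round(100 * len(done) / len(planned), 1) if planned else 0.0,
        # Orphan pages: written posts with fewer than 1 incoming link.
        "orphan_pages": [post for post in done if incoming[post] < 1],
        # Cannibalization: primary keywords claimed by more than one post.
        "cannibalization": sorted(k for k, n in keyword_uses.items() if n > 1),
        # Content gaps: planned posts that were never written.
        "content_gaps": [post for post in planned if post not in written],
    }
```

Targets from the table would then be applied to the result, for example requiring empty `orphan_pages` and `cannibalization` lists.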

Map Regeneration

When `/seo cluster map` is invoked:

Read `cluster-plan.json` from the current directory
Scan output directory and update post statuses (planned vs written)
Regenerate `cluster-map.html` with updated statuses
Report: posts written vs planned, link completion percentage

Output Files

All outputs are written to the current working directory:

File	Description
`cluster-plan.json`	Machine-readable cluster plan (full data)
`cluster-plan.md`	Human-readable cluster plan summary
`cluster-map.html`	Interactive SVG visualization
`cluster-briefs/`	Content briefs (if no claude-blog)
`cluster-scorecard.md`	Post-execution quality report

Cross-Skill Integration

Skill	Relationship
`seo-plan`	Import source: strategy import reads seo-plan output
`seo-content`	Quality check: E-E-A-T validation of generated content
`seo-schema`	Schema markup: Article, BreadcrumbList, ItemList for cluster pages
`seo-dataforseo`	Data source: SERP data when DataForSEO MCP is available
`seo-google`	Reporting: generate PDF report of cluster plan and scorecard

After cluster planning or execution completes, offer: "Generate a PDF report? Use `/seo google report`"

Error Handling

Error	Cause	Resolution
"No seed keyword provided"	Missing argument	Prompt user for seed keyword or URL
"Insufficient keyword variants"	Expansion yielded < 15 keywords	Run second expansion pass with PAA questions
"SERP data unavailable"	WebSearch and DataForSEO both failing	Retry after 30s; if persistent, use intent-only clustering with warning
"No strategy file found"	`--from strategy` but no plan exists	Prompt user to run `/seo plan` first
"cluster-plan.json not found"	Execute without planning	Prompt user to run `/seo cluster plan` first
"claude-blog not installed"	Execute attempted without blog skill	Generate content briefs instead; suggest installation
"DataForSEO budget exceeded"	Cost check returned "blocked"	Fall back to WebSearch; inform user
"Duplicate primary keywords"	Cannibalization detected	Merge affected posts or reassign keywords
"Orphan page detected"	Post missing incoming links	Add links from nearest cluster siblings
"Resume state corrupted"	Mismatch between plan and output	Rebuild state from output directory scan

Security

All URLs fetched via `python scripts/fetch_page.py` (SSRF protection via `validate_url()`)
No credentials stored or transmitted
Output files contain no PII or API keys
DataForSEO cost checks run before every API call
