seo-cluster
Semantic Topic Clustering (v1.9.0)
SERP-overlap-driven keyword clustering for content architecture. Groups keywords
by how Google actually ranks them (shared top-10 results), not by text similarity.
Designs hub-and-spoke content clusters with internal link matrices and generates
interactive cluster map visualizations.
Scripts: Located in the `scripts/` directory at the plugin root.
Quick Reference
| Command | What it does |
|---|---|
| | Full planning workflow: expand, cluster, architect, visualize |
| `--from strategy` | Import from existing `/seo plan` output |
| `/seo cluster execute` | Execute plan: create content via claude-blog or output briefs |
| `/seo cluster map` | Regenerate the interactive cluster visualization |
Planning Workflow
Step 1: Seed Keyword Expansion
Expand the seed keyword into 30-50 variants using WebSearch:
- Related searches — Search the seed, extract "related searches" and "people also search for"
- People Also Ask (PAA) — Extract all PAA questions from SERP results
- Long-tail modifiers — Append common modifiers: "best", "how to", "vs", "for beginners", "tools", "examples", "guide", "template", "mistakes", "checklist"
- Question mining — Generate who/what/when/where/why/how variants
- Intent modifiers — Add commercial modifiers: "pricing", "review", "alternative", "comparison", "free", "top"
Deduplication: Normalize variants (lowercase, strip articles), remove exact duplicates.
Target: 30-50 unique keyword variants. If under 30, run a second expansion pass
with the top PAA questions as seeds.
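The deduplication rule above can be sketched as follows. This is a minimal illustration, not the plugin's code: the article list, the `normalize`, `dedupe`, and `needs_second_pass` names, and the whitespace handling are all assumptions.

```python
import re

# Assumption: only English articles are stripped.
ARTICLES = {"a", "an", "the"}


def normalize(keyword: str) -> str:
    """Lowercase, collapse whitespace, and drop articles."""
    words = re.findall(r"\S+", keyword.lower())
    return " ".join(w for w in words if w not in ARTICLES)


def dedupe(keywords: list[str]) -> list[str]:
    """Keep the first occurrence of each normalized keyword, in order."""
    seen: set[str] = set()
    unique: list[str] = []
    for kw in keywords:
        norm = normalize(kw)
        if norm and norm not in seen:
            seen.add(norm)
            unique.append(norm)
    return unique


def needs_second_pass(keywords: list[str], minimum: int = 30) -> bool:
    """True when fewer than `minimum` unique variants remain."""
    return len(dedupe(keywords)) < minimum
```

Only exact duplicates after normalization are removed, so near-duplicates such as singular and plural forms survive to the clustering step.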
Step 2: SERP Overlap Clustering
This is the core differentiator. Load `references/serp-overlap-methodology.md` for the full algorithm.
Process:
- Group keywords by initial intent guess (reduces pairwise comparisons)
- For each candidate pair within a group, WebSearch both keywords
- Count shared URLs in the top 10 organic results (ignore ads, featured snippets, PAA)
- Apply thresholds:
| Shared Results | Relationship | Action |
|---|---|---|
| 7-10 | Same post | Merge into single target page |
| 4-6 | Same cluster | Group under same spoke cluster |
| 2-3 | Interlink | Place in adjacent clusters, add cross-links |
| 0-1 | Separate | Assign to different clusters or exclude |
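As a rough sketch of the threshold table, the helper below counts shared URLs between two top-10 lists and maps the count to a relationship. The URL canonicalization (ignoring scheme, `www.`, and trailing slashes) is an assumption, as are the function names; the input lists are assumed to already exclude ads, featured snippets, and PAA.

```python
from urllib.parse import urlsplit


def canonical(url: str) -> str:
    """Reduce a URL to host + path so trivial variants compare equal (assumption)."""
    parts = urlsplit(url.strip())
    host = parts.netloc.lower().removeprefix("www.")
    return host + parts.path.rstrip("/")


def shared_results(top10_a: list[str], top10_b: list[str]) -> int:
    """Count URLs that appear in both top-10 organic result lists."""
    set_a = {canonical(u) for u in top10_a[:10]}
    set_b = {canonical(u) for u in top10_b[:10]}
    return len(set_a & set_b)


def relationship(shared: int) -> str:
    """Map a shared-result count to the relationship in the threshold table."""
    if shared >= 7:
        return "same post"
    if shared >= 4:
        return "same cluster"
    if shared >= 2:
        return "interlink"
    return "separate"
```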
Optimization: With 40 keywords, full pairwise = 780 comparisons. Instead:
- Pre-group by intent (4 groups of ~10 = 4 x 45 = 180 comparisons)
- Only cross-check group boundary keywords
- Skip pairs where both are long-tail variants of the same head term (assume same cluster)
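The comparison counts quoted above follow from the pair-count formula n(n-1)/2. A short check, with hypothetical function names:

```python
from math import comb


def full_pairwise(n: int) -> int:
    """Number of unordered keyword pairs among n keywords."""
    return comb(n, 2)


def grouped_pairwise(group_sizes: list[int]) -> int:
    """Pairs compared when keywords are only compared within their own group."""
    return sum(comb(size, 2) for size in group_sizes)


print(full_pairwise(40))                    # 780
print(grouped_pairwise([10, 10, 10, 10]))   # 180
```

Cross-checks on group boundary keywords add to the 180, so the real total depends on how many boundary keywords are compared.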
DataForSEO integration: If DataForSEO MCP is available, use `serp_organic_live_advanced` instead of WebSearch for SERP data. Run `python scripts/dataforseo_costs.py check serp_organic_live_advanced --count N` before each batch. If `"status": "needs_approval"`, show cost estimate and ask user. If `"status": "blocked"`, fall back to WebSearch.
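One way to act on the cost check is sketched below. It assumes the script prints a JSON object containing a `status` field, that a declined approval falls back to WebSearch, and that any status other than the two documented ones lets the batch proceed. None of these details are confirmed here, and `serp_source` is a hypothetical helper.

```python
import json


def serp_source(check_output: str, user_approves: bool = False) -> str:
    """Return 'dataforseo' or 'websearch' for the next batch."""
    status = json.loads(check_output).get("status")
    if status == "blocked":
        return "websearch"
    if status == "needs_approval":
        # The cost estimate is shown and the user is asked before this call.
        return "dataforseo" if user_approves else "websearch"
    # Assumption: any other status means the batch may proceed.
    return "dataforseo"
```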
Step 3: Intent Classification
Classify each keyword into one of four intent categories:
| Intent | Signals | Include in Clusters? |
|---|---|---|
| Informational | how, what, why, guide, tutorial, learn | Yes |
| Commercial | best, top, review, comparison, vs, alternative | Yes |
| Transactional | buy, price, discount, coupon, order, sign up | Yes |
| Navigational | brand names, specific product names, login | No (exclude) |
Remove navigational keywords from clustering. Flag borderline cases for
manual review. Keywords can have mixed intent (e.g., "best CRM software" is
both commercial and informational) -- classify by dominant intent.
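A simple signal-count classifier along these lines might look like the sketch below. The scoring scheme, the tie handling, the `brands` parameter, and the "unclassified" label are assumptions; the signal words come from the table above.

```python
import re

SIGNALS = {
    "informational": ["how", "what", "why", "guide", "tutorial", "learn"],
    "commercial": ["best", "top", "review", "comparison", "vs", "alternative"],
    "transactional": ["buy", "price", "discount", "coupon", "order", "sign up"],
}


def classify(keyword: str, brands: frozenset[str] = frozenset()) -> tuple[str, bool]:
    """Return (dominant intent, needs_manual_review)."""
    text = keyword.lower()
    words = set(re.findall(r"[a-z0-9]+", text))
    if "login" in words or words & brands:
        return ("navigational", False)  # excluded from clustering
    scores = {}
    for intent, signals in SIGNALS.items():
        # Multi-word signals match as phrases, single words as whole tokens.
        scores[intent] = sum(
            1 for s in signals if (s in text if " " in s else s in words)
        )
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    (top, top_score), (_, second_score) = ranked[0], ranked[1]
    if top_score == 0:
        return ("unclassified", True)
    # A tie between intents is treated as a borderline case.
    return (top, top_score == second_score)
```

For example, `classify("how to buy a domain")` scores one informational and one transactional signal, so it is returned with the review flag set.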
Step 4: Hub-and-Spoke Architecture
Load `references/hub-spoke-architecture.md` for full specifications.
Design the cluster structure:
- Select the pillar keyword — Highest volume, broadest intent, most SERP overlap with other keywords
- Group spokes into clusters — Each cluster is a subtopic area (2-5 clusters per pillar)
- Assign posts to clusters — Each cluster gets 2-4 spoke posts
- Select templates per post — Based on intent classification:
| Intent Pattern | Template Options |
|---|---|
| Informational (broad) | ultimate-guide |
| Informational (how) | how-to |
| Informational (list) | listicle |
| Informational (concept) | explainer |
| Commercial (compare) | comparison |
| Commercial (evaluate) | review |
| Commercial (rank) | best-of |
| Transactional | landing-page |
- Set word count targets:
  - Pillar page: 2500-4000 words
  - Spoke posts: 1200-1800 words
- Cannibalization check: No two posts share the same primary keyword. If SERP overlap is 7+, merge those keywords into a single post targeting both.
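The merge rule in the cannibalization check can be illustrated with a small union-find pass over the measured overlaps. Treating merges as transitive (if A merges with B and B with C, all three share a post) is an assumption, and `merge_posts` is a hypothetical name.

```python
def merge_posts(
    keywords: list[str],
    overlap: dict[tuple[str, str], int],
    threshold: int = 7,
) -> list[list[str]]:
    """Group keywords whose SERP overlap meets the threshold into one post."""
    parent = {kw: kw for kw in keywords}

    def find(kw: str) -> str:
        while parent[kw] != kw:
            parent[kw] = parent[parent[kw]]  # path halving
            kw = parent[kw]
        return kw

    for (a, b), shared in overlap.items():
        if shared >= threshold:
            parent[find(a)] = find(b)

    groups: dict[str, list[str]] = {}
    for kw in keywords:
        groups.setdefault(find(kw), []).append(kw)
    return list(groups.values())
```

Each returned group corresponds to one post. Choosing which keyword in a group becomes the primary one is left out of this sketch.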
Step 5: Internal Link Matrix
Design the bidirectional linking structure:
| Link Type | Direction | Requirement |
|---|---|---|
| Spoke to pillar | spoke -> pillar | Mandatory (every spoke) |
| Pillar to spoke | pillar -> spoke | Mandatory (every spoke) |
| Spoke to spoke (within cluster) | spoke <-> spoke | 2-3 links per post |
| Cross-cluster | spoke -> spoke (other cluster) | 0-1 links per post |
Rules:
- Every post must have minimum 3 incoming internal links
- No orphan pages (every post reachable from pillar in 2 clicks)
- Anchor text must use target keyword or close variant (no "click here")
- Link placement: within body content, not just navigation/sidebar
Generate the link matrix as a JSON adjacency list:
```json
{
  "links": [
    { "from": "pillar", "to": "cluster-0-post-0", "type": "mandatory", "anchor": "keyword" },
    { "from": "cluster-0-post-0", "to": "pillar", "type": "mandatory", "anchor": "keyword" }
  ]
}
```
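A link list in this shape could be generated and checked roughly as follows. The sketch links every spoke to all siblings in its cluster, omits the `anchor` field for brevity, and uses a `"recommended"` type value that is an assumption (only `"mandatory"` appears above).

```python
from collections import Counter


def build_links(clusters: dict[str, list[str]]) -> list[dict]:
    """Create pillar<->spoke links plus spoke-to-spoke links within each cluster."""
    links = []
    for posts in clusters.values():
        for post in posts:
            links.append({"from": "pillar", "to": post, "type": "mandatory"})
            links.append({"from": post, "to": "pillar", "type": "mandatory"})
            for other in posts:
                if other != post:
                    links.append({"from": post, "to": other, "type": "recommended"})
    return links


def under_linked(links: list[dict], posts: list[str], minimum: int = 3) -> list[str]:
    """Return posts with fewer than `minimum` incoming internal links."""
    incoming = Counter(link["to"] for link in links)
    return [p for p in posts if incoming[p] < minimum]
```

With this scheme a two-post cluster leaves each post at two incoming links, so a cross-cluster link would be needed to reach the minimum of three.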
Step 6: Interactive Cluster Map
Generate `cluster-map.html` using the template at `templates/cluster-map.html`.
- Read the template file
- Build the `CLUSTER_DATA` JSON object from the cluster plan:
```javascript
{
  pillar: { title, keyword, volume, template, wordCount, url },
  clusters: [{ name, color, posts: [{ title, keyword, volume, template, wordCount, url, status }] }],
  links: [{ from, to, type }],
  meta: { totalPosts, totalClusters, totalLinks, estimatedWords }
}
```
- Replace the `CLUSTER_DATA` placeholder in the template with the actual JSON
- Write the completed HTML file to the output directory
- Inform user: "Open `cluster-map.html` in a browser to explore the interactive cluster map."
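The placeholder substitution might be done as in the sketch below. It assumes the template contains the literal token `CLUSTER_DATA` exactly where the JSON should go; the real placeholder syntax is not shown here, and `render_cluster_map` is a hypothetical helper.

```python
import json


def render_cluster_map(
    template_html: str, cluster_data: dict, placeholder: str = "CLUSTER_DATA"
) -> str:
    """Return the template with the placeholder replaced by serialized JSON."""
    if placeholder not in template_html:
        raise ValueError(f"placeholder {placeholder!r} not found in template")
    payload = json.dumps(cluster_data, ensure_ascii=False)
    # Escape "</" so a title containing "</script>" cannot close the script tag.
    payload = payload.replace("</", "<\\/")
    return template_html.replace(placeholder, payload, 1)
```

The `<\/` escape is still valid JSON, so the embedded data parses back to the original values.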
Strategy Import
When invoked with `--from strategy`:
- Look for the most recent `/seo plan` output in the current directory (search for files matching `*SEO*Plan*`, `*strategy*`, `*content-strategy*`)
- Parse markdown tables for: keywords, page types, content pillars, URL structures
- Validate extracted data: check for duplicates, missing keywords, incomplete entries
- Enrich with SERP data: run SERP overlap analysis on extracted keywords
- Build cluster plan using the imported keywords as the starting set (skip Step 1)
If no strategy file is found, prompt the user: "No existing SEO plan found in the current directory. Run `/seo plan` first, or provide a seed keyword for fresh clustering."
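The file lookup and table parsing could be sketched as below. Using modification time to decide which file is "most recent" and the simple pipe-splitting parser are assumptions; both function names are hypothetical.

```python
from pathlib import Path
from typing import Optional

PATTERNS = ("*SEO*Plan*", "*strategy*", "*content-strategy*")


def latest_strategy_file(directory: Path) -> Optional[Path]:
    """Return the most recently modified file matching any pattern, or None."""
    candidates = {
        p for pattern in PATTERNS for p in directory.glob(pattern) if p.is_file()
    }
    return max(candidates, key=lambda p: p.stat().st_mtime, default=None)


def parse_table_rows(markdown: str) -> list[dict[str, str]]:
    """Parse pipe tables into dicts keyed by each table's header row."""
    rows: list[dict[str, str]] = []
    header = None
    for line in markdown.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            header = None  # a non-table line ends the current table
            continue
        cells = [c.strip() for c in line.strip("|").split("|")]
        if header is None:
            header = cells
        elif all(set(c) <= set("-: ") for c in cells):
            continue  # separator row such as |---|---|
        else:
            rows.append(dict(zip(header, cells)))
    return rows
```

When `latest_strategy_file` returns `None`, the prompt quoted above applies.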
Execution Workflow
When `/seo cluster execute` is invoked:
Check for claude-blog
Test: Does `~/.claude/skills/blog/SKILL.md` exist?
If claude-blog IS installed:
- Load `references/execution-workflow.md` for the full algorithm
- Read `cluster-plan.json` from the current directory
- Check for resume state: scan output directory for already-written posts
- Execute in priority order: pillar first, then spokes by volume (highest first)
- For each post, invoke the `blog-write` skill with cluster context:
  - Cluster role (pillar or spoke)
  - Position in cluster (cluster index, post index)
  - Target keyword and secondary keywords
  - Template type and word count target
  - Internal links to include (with anchors)
  - Links to receive from future posts (placeholder markers)
- After each post is written, scan previous posts for backward link placeholders and inject the new post's URL
- After all posts are written, generate the cluster scorecard
If claude-blog is NOT installed:
- Generate detailed content briefs for each post in the cluster plan
- Each brief includes:
  - Title and meta description
  - Primary keyword and secondary keywords
  - Template type and suggested structure (H2/H3 outline)
  - Word count target
  - Internal links to include (with anchor text)
  - Key points to cover
  - Competing pages to differentiate from
- Write briefs to `cluster-briefs/` directory as individual markdown files
- Inform user: "Install claude-blog to auto-create content. Briefs saved to `cluster-briefs/`."
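The installation check and the priority ordering can be sketched as follows. The sketch assumes `cluster-plan.json` follows the same `pillar` / `clusters` / `posts` shape as `CLUSTER_DATA` and that a post's `url` identifies already-written posts when resuming; neither is confirmed here.

```python
from pathlib import Path

BLOG_SKILL = Path.home() / ".claude" / "skills" / "blog" / "SKILL.md"


def blog_installed(skill_file: Path = BLOG_SKILL) -> bool:
    """True when the claude-blog skill file exists."""
    return skill_file.is_file()


def execution_order(plan: dict, written: set[str]) -> list[dict]:
    """Pillar first, then spokes by volume (highest first), skipping written posts."""
    spokes = [post for cluster in plan["clusters"] for post in cluster["posts"]]
    spokes.sort(key=lambda post: post.get("volume", 0), reverse=True)
    queue = [plan["pillar"], *spokes]
    return [post for post in queue if post["url"] not in written]
```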
Cluster Scorecard
Post-execution quality report. Run automatically after `/seo cluster execute` or on demand via analysis of the output directory.
| Metric | Target | How Measured |
|---|---|---|
| Coverage | 100% | Posts written / posts planned |
| Link Density | 3+ per post | Count internal links per post |
| Orphan Pages | 0 | Posts with < 1 incoming link |
| Cannibalization | 0 conflicts | Check for duplicate primary keywords |
| Image Count | 1+ per post | Posts with at least one image |
| Pillar Links | 100% | All spokes link to pillar and vice versa |
| Cross-Links | 80%+ | Recommended spoke-to-spoke links implemented |
| Content Gaps | 0 | Planned posts that were skipped or incomplete |
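A few of these metrics can be computed from the plan and the implemented links, as in this partial sketch. The input shapes (post identifiers, a `links` list of `from`/`to` dicts describing links actually present in written posts, and a map of primary keywords) are assumptions, and only four of the eight metrics are covered.

```python
from collections import Counter


def scorecard(
    planned: list[str],
    written: set[str],
    links: list[dict],
    primary_keywords: dict[str, str],
) -> dict:
    """Compute coverage, orphan pages, cannibalization, and pillar links."""
    incoming = Counter(l["to"] for l in links if l["from"] in written)
    pairs = {(l["from"], l["to"]) for l in links}
    spokes = [p for p in planned if p != "pillar"]
    keyword_counts = Counter(primary_keywords.values())
    return {
        "coverage": len(written & set(planned)) / len(planned),
        "orphan_pages": sorted(p for p in written if incoming[p] < 1),
        "cannibalization": sorted(k for k, c in keyword_counts.items() if c > 1),
        "pillar_links": all(
            ("pillar", s) in pairs and (s, "pillar") in pairs for s in spokes
        ),
    }
```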
Map Regeneration
When `/seo cluster map` is invoked:
- Read `cluster-plan.json` from the current directory
- Scan output directory and update post statuses (planned vs written)
- Regenerate `cluster-map.html` with updated statuses
- Report: posts written vs planned, link completion percentage
Output Files
All outputs are written to the current working directory:
| File | Description |
|---|---|
| `cluster-plan.json` | Machine-readable cluster plan (full data) |
| | Human-readable cluster plan summary |
| `cluster-map.html` | Interactive SVG visualization |
| `cluster-briefs/` | Content briefs (if no claude-blog) |
| | Post-execution quality report |
Cross-Skill Integration
| Skill | Relationship |
|---|---|
| `seo-plan` | Import source: strategy import reads seo-plan output |
| | Quality check: E-E-A-T validation of generated content |
| | Schema markup: Article, BreadcrumbList, ItemList for cluster pages |
| | Data source: SERP data when DataForSEO MCP is available |
| | Reporting: generate PDF report of cluster plan and scorecard |
After cluster planning or execution completes, offer:
"Generate a PDF report? Use `/seo google report`"
Error Handling
| Error | Cause | Resolution |
|---|---|---|
| "No seed keyword provided" | Missing argument | Prompt user for seed keyword or URL |
| "Insufficient keyword variants" | Expansion yielded < 15 keywords | Run second expansion pass with PAA questions |
| "SERP data unavailable" | WebSearch and DataForSEO both failing | Retry after 30s; if persistent, use intent-only clustering with warning |
| "No strategy file found" | `--from strategy` | Prompt user to run `/seo plan` |
| "cluster-plan.json not found" | Execute without planning | Prompt user to run the planning workflow first |
| "claude-blog not installed" | Execute attempted without blog skill | Generate content briefs instead; suggest installation |
| "DataForSEO budget exceeded" | Cost check returned "blocked" | Fall back to WebSearch; inform user |
| "Duplicate primary keywords" | Cannibalization detected | Merge affected posts or reassign keywords |
| "Orphan page detected" | Post missing incoming links | Add links from nearest cluster siblings |
| "Resume state corrupted" | Mismatch between plan and output | Rebuild state from output directory scan |
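The "SERP data unavailable" resolution (retry after 30s, then degrade) could be sketched like this. The `fetch` callable stands in for whichever SERP source is in use, and the injectable `sleep` parameter exists only so the delay can be stubbed out; both are assumptions.

```python
import time
from typing import Callable, Optional


def fetch_with_retry(
    fetch: Callable[[str], list[str]],
    keyword: str,
    retries: int = 1,
    delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[list[str]]:
    """Try to fetch SERP URLs, waiting `delay` seconds between attempts."""
    for attempt in range(retries + 1):
        try:
            return fetch(keyword)
        except Exception:  # broad on purpose: any source failure triggers a retry
            if attempt < retries:
                sleep(delay)
    # None signals the caller to use intent-only clustering with a warning.
    return None
```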
Security
- All URLs fetched via `python scripts/fetch_page.py` (SSRF protection via `validate_url()`)
- No credentials stored or transmitted
- Output files contain no PII or API keys
- DataForSEO cost checks run before every API call
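For readers unfamiliar with SSRF checks, the general idea is sketched below. This is not the plugin's `validate_url()`; `is_safe_url` is a hypothetical function that allows only http(s) URLs whose host resolves exclusively to globally routable addresses.

```python
import ipaddress
import socket
from urllib.parse import urlsplit


def is_safe_url(url: str) -> bool:
    """Reject non-http(s) URLs and hosts that resolve to non-global addresses."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return False
    try:
        infos = socket.getaddrinfo(parts.hostname, None)
    except socket.gaierror:
        return False
    for info in infos:
        address = ipaddress.ip_address(info[4][0])
        if not address.is_global:  # loopback, private, link-local, and so on
            return False
    return True
```

A check like this does not by itself cover redirects or a host that resolves differently at fetch time, so a real fetcher would need to handle those cases as well.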