blog-taxonomy

Blog Taxonomy

Manage tags, categories, and topic clusters across CMS platforms.

Commands

Command	Purpose
`/blog taxonomy suggest <file>`	Extract candidate tags and categories from content
`/blog taxonomy sync <cms>`	Push taxonomy to CMS via authenticated API
`/blog taxonomy audit [directory]`	Check for thin tags, orphan tags, taxonomy bloat

Tag Suggestion Workflow

Step 1: Parse Content Structure

Read the target file and extract:

All H2 and H3 headings (primary topic signals)
Bold and italic phrases (emphasis signals)
Existing frontmatter tags/categories if present

Step 2: Frequency Analysis

Scan the body text for high-frequency phrases:

1-word terms: minimum 4 occurrences (excluding stop words)
2-word phrases: minimum 3 occurrences
3-word phrases: minimum 2 occurrences

Exclude common non-tag words: articles, prepositions, conjunctions, pronouns.
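
The thresholds above can be sketched as a small n-gram counter. This is a minimal illustration, not the skill's actual implementation: the stop-word list is a short assumed sample, and skipping 2- and 3-word phrases that start or end with a stop word is an assumption (the source only states the stop-word exclusion for 1-word terms).

```python
import re
from collections import Counter

# Assumed sample of stop words; a real list would be much longer.
STOP_WORDS = {"a", "an", "the", "and", "or", "but", "of", "in", "on",
              "to", "for", "with", "it", "this", "that", "is", "are"}

# Minimum occurrences per phrase length, as listed in Step 2.
MIN_COUNTS = {1: 4, 2: 3, 3: 2}

def frequent_phrases(text: str) -> dict[str, int]:
    """Return phrases of 1-3 words that meet the minimum occurrence count."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    counts: Counter = Counter()
    for n in MIN_COUNTS:
        for i in range(len(words) - n + 1):
            gram = words[i:i + n]
            # Assumption: drop phrases that begin or end with a stop word.
            if gram[0] in STOP_WORDS or gram[-1] in STOP_WORDS:
                continue
            counts[" ".join(gram)] += 1
    return {p: c for p, c in counts.items()
            if c >= MIN_COUNTS[len(p.split())]}
```

Note that this sketch counts phrases across sentence boundaries; splitting on punctuation first would be a reasonable refinement.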

Step 3: Semantic Grouping

Group related candidates into clusters:

Merge singular/plural variants (keep the more common form)
Merge hyphenated and non-hyphenated forms
Group synonyms under the highest-frequency term
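
As a rough sketch of the first two merge rules, the function below groups candidates by a naive key (lowercase, hyphens treated as spaces, trailing "s" stripped) and keeps the most frequent surface form. The key function and the choice to sum the counts of merged variants are assumptions for illustration; synonym grouping would need a separate synonym map and is not shown.

```python
from collections import defaultdict

def normalize(term: str) -> str:
    """Naive grouping key: lowercase, hyphens to spaces, strip one trailing 's'."""
    key = term.lower().replace("-", " ")
    if key.endswith("s") and not key.endswith("ss"):
        key = key[:-1]
    return key

def merge_variants(counts: dict[str, int]) -> dict[str, int]:
    """Merge singular/plural and hyphenated variants under the most common form."""
    groups: dict[str, list[tuple[str, int]]] = defaultdict(list)
    for term, count in counts.items():
        groups[normalize(term)].append((term, count))
    merged = {}
    for variants in groups.values():
        # Keep the most frequent surface form; assumption: counts are summed.
        best = max(variants, key=lambda v: v[1])[0]
        merged[best] = sum(count for _, count in variants)
    return merged
```

For example, `merge_variants({"backlink": 2, "backlinks": 5})` keeps `backlinks` with a combined count of 7. The trailing-"s" rule is deliberately crude and will mishandle irregular plurals.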

Step 4: Deduplicate and Rank

Fuzzy match on slugified names (Levenshtein distance <= 2)

Score each candidate:

(frequency * 2) + (heading_presence * 5) + (emphasis * 1)

Return top 5-10 ranked suggestions
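
The scoring formula and the fuzzy dedupe can be sketched together. The helper names (`slugify`, `rank`) and the input shape (a mapping from candidate name to a `(frequency, heading_presence, emphasis)` tuple) are hypothetical; the sketch also assumes that when two slugs are within distance 2, the higher-scoring candidate wins.

```python
import re

def slugify(name: str) -> str:
    """Lowercase and replace runs of non-alphanumerics with hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")

def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]

def score(frequency: int, heading_presence: int, emphasis: int) -> int:
    return frequency * 2 + heading_presence * 5 + emphasis * 1

def rank(candidates: dict[str, tuple[int, int, int]], limit: int = 10) -> list[tuple[str, int]]:
    """Score, sort descending, and drop near-duplicate slugs (distance <= 2)."""
    ranked = sorted(((name, score(*signals)) for name, signals in candidates.items()),
                    key=lambda item: -item[1])
    kept: list[tuple[str, int]] = []
    for name, value in ranked:
        slug = slugify(name)
        if all(levenshtein(slug, slugify(other)) > 2 for other, _ in kept):
            kept.append((name, value))
    return kept[:limit]
```

Because candidates are visited in score order, a misspelled or variant slug is dropped in favor of the stronger candidate it resembles.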

Output Format

Tag Suggestions: [Post Title]

Rank	Tag	Score	Source
1	content-marketing	17	H2 + 6 mentions
2	seo-strategy	13	H3 + 4 mentions
3	keyword-research	11	5 mentions + bold

Suggested Categories

Primary: [best-fit category]
Secondary: [optional second category]

CMS Adapters

Adapter Overview

CMS	API Type	Auth Method	Tags Model
WordPress	REST	Application Passwords (base64)	First-class entities with IDs
Shopify	GraphQL (Admin API)	Admin API access token	String array on Article
Ghost	REST (Admin API)	API key with JWT signing	First-class entities
Strapi	REST or GraphQL	API token (Bearer)	User-defined content type
Sanity	GROQ / Mutations	Project token (Bearer)	Document type

WordPress Adapter

List tags:

GET {CMS_URL}/wp-json/wp/v2/tags?per_page=100&search={keyword}
Authorization: Basic {base64(username:app_password)}

Create tag:

POST {CMS_URL}/wp-json/wp/v2/tags
Body: {"name": "Tag Name", "slug": "tag-name", "description": "Optional"}

List categories (hierarchical, supports parent field):

GET {CMS_URL}/wp-json/wp/v2/categories?per_page=100

Create category:

POST {CMS_URL}/wp-json/wp/v2/categories
Body: {"name": "Category", "slug": "category", "parent": 0}

Assign tags to post:

POST {CMS_URL}/wp-json/wp/v2/posts/{id}
Body: {"tags": [1, 2, 3], "categories": [4]}

Pagination: follow the `X-WP-TotalPages` header for a full listing.
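
A minimal sketch of the two WordPress-specific details, the Basic auth header and page-by-page listing. To keep it runnable without a network, the HTTP call is abstracted as `fetch_page`, a hypothetical callable that takes a page number and returns the decoded tag list plus the response headers; a real adapter would issue the GET request with the `page` query parameter.

```python
import base64
from typing import Callable

def basic_auth_header(username: str, app_password: str) -> dict[str, str]:
    """Build the Authorization header from a username and application password."""
    token = base64.b64encode(f"{username}:{app_password}".encode()).decode()
    return {"Authorization": f"Basic {token}"}

def list_all_tags(
    fetch_page: Callable[[int], tuple[list[dict], dict[str, str]]]
) -> list[dict]:
    """Collect tags from every page, using X-WP-TotalPages as the stop condition."""
    tags: list[dict] = []
    page, total_pages = 1, 1
    while page <= total_pages:
        items, headers = fetch_page(page)
        tags.extend(items)
        # The header is only known after the first response arrives.
        total_pages = int(headers.get("X-WP-TotalPages", 1))
        page += 1
    return tags
```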

Shopify Adapter

Tags on Shopify are string arrays on the Article object, not first-class entities.

Update article tags (GraphQL Admin API):

mutation {
  articleUpdate(id: "gid://shopify/Article/123", article: {
    tags: ["tag-one", "tag-two", "tag-three"]
  }) {
    article { id tags }
    userErrors { field message }
  }
}

List all tags in use (GraphQL):

{
  articles(first: 250) {
    edges {
      node { id title tags }
    }
  }
}

Auth header:

X-Shopify-Access-Token: {token}

Note: Shopify marked the REST Admin API as legacy in October 2024. New public apps have been required to use the GraphQL Admin API since April 2025.
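
Because Shopify has no tag entity to list, the set of tags in use has to be derived from the articles themselves. The sketch below assumes the decoded JSON response of the listing query above (wrapped in the standard GraphQL `data` key) and counts how many articles use each tag; the function name is hypothetical.

```python
def tags_in_use(response: dict) -> dict[str, int]:
    """Count articles per tag from an `articles` query response."""
    counts: dict[str, int] = {}
    for edge in response["data"]["articles"]["edges"]:
        for tag in edge["node"]["tags"]:
            counts[tag] = counts.get(tag, 0) + 1
    return dict(sorted(counts.items()))
```

This covers a single page of up to 250 articles. Larger blogs would need cursor-based pagination, which is not shown here.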

Ghost Adapter

List tags:

GET {CMS_URL}/ghost/api/admin/tags/?limit=all
Authorization: Ghost {jwt_token}

Create tag:

POST {CMS_URL}/ghost/api/admin/tags/
Body: {"tags": [{"name": "Tag Name", "slug": "tag-name"}]}

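JWT generation: split the Admin API key (`id:secret` format), sign with the secret using HS256 and put the id in the `kid` header. Set iat = now, exp = at most 5 minutes after iat, audience = `/admin/`.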

Strapi Adapter

Endpoints are auto-generated from content types. A typical setup:

GET {CMS_URL}/api/tags?pagination[pageSize]=100
POST {CMS_URL}/api/tags
Body: {"data": {"name": "Tag Name", "slug": "tag-name"}}
Authorization: Bearer {api_token}

Strapi v4+ uses the `data` wrapper. Check your content type schema for field names.

Sanity Adapter

Query tags (GROQ):

*[_type == "tag"] { _id, name, slug }

Create tag (Mutations API):

POST https://{project_id}.api.sanity.io/v2024-01-01/data/mutate/{dataset}
Body: {"mutations": [{"create": {"_type": "tag", "name": "Tag", "slug": {"current": "tag"}}}]}
Authorization: Bearer {token}

Taxonomy Audit Workflow

Step 1: Inventory

Scan all posts in the target directory (or fetch from CMS). Build a map:

tag_name -> [list of post files/IDs using this tag]
category_name -> [list of post files/IDs]
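
A minimal sketch of the inventory map, assuming each post has already been parsed into a dict with hypothetical `id`, `tags`, and `categories` keys (for example from frontmatter or a CMS response):

```python
from collections import defaultdict

def build_inventory(posts: list[dict]) -> tuple[dict[str, list[str]], dict[str, list[str]]]:
    """Map each tag and each category to the IDs of the posts that use it."""
    tag_map: dict[str, list[str]] = defaultdict(list)
    category_map: dict[str, list[str]] = defaultdict(list)
    for post in posts:
        for tag in post.get("tags", []):
            tag_map[tag].append(post["id"])
        for category in post.get("categories", []):
            category_map[category].append(post["id"])
    return dict(tag_map), dict(category_map)
```

Tags that exist on the CMS but appear on no post will not show up in this map, so orphan detection also needs the CMS's own tag list merged in with empty post lists.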

Step 2: Health Checks

Check	Threshold	Action
Thin tag archives	< 5 posts per tag	Recommend noindex or merge
Orphan tags	0 posts	Recommend deletion
Tag bloat	> 50 total tags	Recommend consolidation
Category depth	> 3 levels	Recommend flattening
Uncategorized posts	No category assigned	Assign to appropriate category
Duplicate slugs	Same slug, different name	Merge into canonical version
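
The tag-related thresholds in the table can be applied to an inventory map along these lines. This is a sketch under assumptions: the function and constant names are hypothetical, and orphan tags (0 posts) are reported separately rather than also being counted as thin.

```python
THIN_TAG_MIN_POSTS = 5   # tags with fewer posts are "thin"
MAX_TOTAL_TAGS = 50      # more tags than this counts as bloat

def check_tags(tag_map: dict[str, list[str]]) -> dict[str, object]:
    """Classify tags as orphan or thin and flag overall tag bloat."""
    orphan = sorted(t for t, posts in tag_map.items() if len(posts) == 0)
    # Assumption: orphans are excluded from the thin list to avoid double-reporting.
    thin = sorted(t for t, posts in tag_map.items()
                  if 0 < len(posts) < THIN_TAG_MIN_POSTS)
    return {
        "orphan": orphan,
        "thin": thin,
        "bloat": len(tag_map) > MAX_TOTAL_TAGS,
    }
```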

Step 3: Recommendations

Group findings by priority:

Critical: orphan tags creating empty archive pages (crawl waste)
High: thin tags with < 3 posts (poor user experience, weak SEO signal)
Medium: tag bloat over 50 (diluted taxonomy, harder to navigate)
Low: naming inconsistencies (mixed case, hyphen vs space)

Output Format

Taxonomy Audit: [Site/Directory]

Total tags: [n] | Total categories: [n]
Healthy: [n] | Thin: [n] | Orphan: [n]

Critical Issues

[orphan tags list]

Recommendations

Merge [tag-a] and [tag-b] (same topic, [n] combined posts)
Delete orphan tags: [list]
Add noindex to tag archives with < 5 posts

Site-Wide Guidelines

Aim for 5-10 main categories per site (broad topics)
Tags should have at least 5 posts before creating an archive page
Use consistent slug format: lowercase, hyphen-separated
Every post needs exactly 1 primary category
Tags per post: 3-8 recommended, never exceed 15

Environment Variables

Variable	Purpose	Example
CMS_TYPE	Platform identifier	wordpress, shopify, ghost, strapi, sanity
CMS_URL	Base URL of the CMS	https://example.com
CMS_API_KEY	Authentication credential	Application password, API token, or key

These must be set in the shell environment. Never store credentials in files or commit them to version control. The skill reads them via `$CMS_TYPE`, `$CMS_URL`, and `$CMS_API_KEY` at runtime.
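
Reading and validating the three variables might look like the sketch below. The function name, the returned dict shape, lowercasing `CMS_TYPE`, and trimming a trailing slash from `CMS_URL` are all assumptions for illustration; the `env` parameter exists only so the function can be exercised without touching the real environment.

```python
import os
from typing import Mapping, Optional

SUPPORTED = ("wordpress", "shopify", "ghost", "strapi", "sanity")
REQUIRED = ("CMS_TYPE", "CMS_URL", "CMS_API_KEY")

def load_cms_config(env: Optional[Mapping[str, str]] = None) -> dict[str, str]:
    """Read CMS settings from the environment, failing loudly if any are missing."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise ValueError("Missing environment variables: " + ", ".join(missing))
    cms_type = env["CMS_TYPE"].lower()  # assumption: matching is case-insensitive
    if cms_type not in SUPPORTED:
        raise ValueError(f"Unsupported CMS_TYPE '{cms_type}'. "
                         f"Valid options: {', '.join(SUPPORTED)}")
    return {
        "type": cms_type,
        "url": env["CMS_URL"].rstrip("/"),
        "api_key": env["CMS_API_KEY"],
    }
```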

Error Handling

Missing environment variables: If CMS_TYPE, CMS_URL, or CMS_API_KEY is unset, report which variable is missing and provide the expected format
Invalid credentials: If the CMS API returns 401/403, report "Authentication failed - check CMS_API_KEY" and do not retry
Connection timeouts: If the CMS endpoint is unreachable after 10 seconds, report the timeout and suggest checking CMS_URL
Duplicate tag slugs: If a tag already exists on the CMS, skip creation and note "Tag already exists: [name]"
Rate limits: If the CMS API returns 429, wait and retry once. Report if the limit persists
Unsupported CMS: If CMS_TYPE is not one of the 5 supported platforms, list the valid options and exit
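
The status-code rules above (no retry on 401/403, a single retry on 429) can be sketched as a small wrapper. Here `send` is a hypothetical callable that performs the request and returns the HTTP status code; the wait duration and the "Unexpected status" message are assumptions, since the source does not specify them.

```python
import time
from typing import Callable

def call_with_policy(send: Callable[[], int], wait_seconds: float = 1.0) -> str:
    """Apply the auth and rate-limit rules to one CMS request."""
    status = send()
    if status == 429:
        # Wait, then retry exactly once.
        time.sleep(wait_seconds)
        status = send()
        if status == 429:
            return "Rate limit persists"
    if status in (401, 403):
        # Credentials problems are never retried.
        return "Authentication failed - check CMS_API_KEY"
    if 200 <= status < 300:
        return "ok"
    return f"Unexpected status: {status}"
```

If the CMS sends a `Retry-After` header with the 429 response, using that value for the wait would be more polite than a fixed delay.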
