Blog Taxonomy

Manage tags, categories, and topic clusters across CMS platforms.

Commands

| Command | Purpose |
| --- | --- |
| /blog taxonomy suggest <file> | Extract candidate tags and categories from content |
| /blog taxonomy sync <cms> | Push taxonomy to CMS via authenticated API |
| /blog taxonomy audit [directory] | Check for thin tags, orphan tags, taxonomy bloat |

Tag Suggestion Workflow

Step 1: Parse Content Structure

Read the target file and extract:
  • All H2 and H3 headings (primary topic signals)
  • Bold and italic phrases (emphasis signals)
  • Existing frontmatter tags/categories if present
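
The extraction above can be sketched in Python. This is a minimal illustration, not the skill's actual parser: it assumes the target file is Markdown with YAML-style frontmatter between `---` lines, handles only the inline `tags: [a, b]` form, and the name `parse_structure` is hypothetical.

```python
import re

def parse_structure(markdown_text: str) -> dict:
    """Extract H2/H3 headings, emphasis phrases, and frontmatter tags."""
    frontmatter_tags = []
    body = markdown_text
    # Assumption: frontmatter is a leading block delimited by '---' lines.
    fm = re.match(r"^---\n(.*?)\n---\n", markdown_text, re.DOTALL)
    if fm:
        body = markdown_text[fm.end():]
        # Only the inline list form `tags: [a, b]` is handled in this sketch.
        tags_line = re.search(r"^tags:\s*\[(.*?)\]\s*$", fm.group(1), re.MULTILINE)
        if tags_line:
            frontmatter_tags = [
                t.strip().strip("'\"")
                for t in tags_line.group(1).split(",")
                if t.strip()
            ]
    # Two or three '#' characters followed by whitespace mark H2 and H3.
    headings = re.findall(r"^(#{2,3})\s+(.+?)\s*$", body, re.MULTILINE)
    bold = re.findall(r"\*\*(.+?)\*\*", body)
    # Remove bold spans first so their asterisks are not read as italics.
    without_bold = re.sub(r"\*\*(.+?)\*\*", " ", body)
    italic = re.findall(r"\*(.+?)\*", without_bold)
    return {
        "headings": [(len(marks), text) for marks, text in headings],
        "emphasis": bold + italic,
        "frontmatter_tags": frontmatter_tags,
    }
```

Regular expressions are enough for a first pass, but they will misread edge cases such as `*` list bullets or underscore-style emphasis. A real Markdown parser would be more robust.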

Step 2: Frequency Analysis

Scan the body text for high-frequency phrases:
  • 1-word terms: minimum 4 occurrences (excluding stop words)
  • 2-word phrases: minimum 3 occurrences
  • 3-word phrases: minimum 2 occurrences
Exclude common non-tag words: articles, prepositions, conjunctions, pronouns.
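
One way to apply these thresholds is an n-gram count, sketched below. The stop-word list is deliberately tiny, and the rule that a phrase may not start or end with a stop word is an assumption of this sketch. `frequent_phrases` is a hypothetical name.

```python
import re
from collections import Counter

# A deliberately small stop-word list; a real run would use a fuller one.
STOP_WORDS = {"a", "an", "the", "and", "or", "but", "of", "in", "on",
              "to", "for", "with", "it", "is", "this", "that", "we", "you"}
# Minimum occurrences per phrase length, as listed above.
MIN_COUNTS = {1: 4, 2: 3, 3: 2}

def frequent_phrases(text: str) -> dict:
    """Return phrases of 1-3 words that meet their minimum occurrence count."""
    words = re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", text.lower())
    results = {}
    for n, minimum in MIN_COUNTS.items():
        grams = Counter(
            " ".join(words[i:i + n])
            for i in range(len(words) - n + 1)
            # Assumption: skip phrases that start or end with a stop word.
            if words[i] not in STOP_WORDS and words[i + n - 1] not in STOP_WORDS
        )
        results.update({g: c for g, c in grams.items() if c >= minimum})
    return results
```

Note that this sketch lets phrases span sentence boundaries. Splitting the text into sentences first would avoid counting such accidental pairs.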

Step 3: Semantic Grouping

Group related candidates into clusters:
  • Merge singular/plural variants (keep the more common form)
  • Merge hyphenated and non-hyphenated forms
  • Group synonyms under the highest-frequency term
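
The three merge rules can be sketched with a shared normalization key. The plural rule here is naive (strip one trailing "s"), the synonym map is supplied by the caller with normalized keys, and summing the counts of merged forms is an assumption of this sketch. The function names are hypothetical.

```python
from collections import defaultdict

def normalize(term: str) -> str:
    """Key shared by hyphen variants and simple singular/plural pairs."""
    key = term.lower().replace("-", " ")
    # Naive plural rule (assumption): strip a trailing 's' but not 'ss'.
    if key.endswith("s") and not key.endswith("ss"):
        key = key[:-1]
    return key

def group_candidates(counts: dict, synonyms: dict = None) -> dict:
    """Merge variants; label each cluster with its most frequent form."""
    synonyms = synonyms or {}
    clusters = defaultdict(dict)
    for term, count in counts.items():
        key = normalize(term)
        # Map a synonym to the cluster key of its preferred term.
        key = normalize(synonyms.get(key, key))
        clusters[key][term] = count
    # Keep the most common surface form and sum the counts of all forms.
    return {max(forms, key=forms.get): sum(forms.values())
            for forms in clusters.values()}
```

For example, `{"backlink": 2, "backlinks": 5}` collapses to `{"backlinks": 7}` because the plural is the more common form.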

Step 4: Deduplicate and Rank

  • Fuzzy match on slugified names (Levenshtein distance <= 2)
  • Score each candidate:
    (frequency * 2) + (heading_presence * 5) + (emphasis * 1)
  • Return top 5-10 ranked suggestions
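
The deduplication and scoring rules can be sketched as follows. The input shape (dicts with `name`, `frequency`, `heading_presence`, and `emphasis` keys) and the function names are assumptions made for illustration. Only the distance threshold and the scoring formula come from the steps above.

```python
import re

def slugify(name: str) -> str:
    """Lowercase and join alphanumeric runs with hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")

def levenshtein(a: str, b: str) -> int:
    """Edit distance computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution
        prev = curr
    return prev[-1]

def score(frequency: int, heading_presence: int, emphasis: int) -> int:
    return frequency * 2 + heading_presence * 5 + emphasis * 1

def rank(candidates: list, limit: int = 10) -> list:
    """Return (slug, score) pairs, best first, with near-duplicates dropped."""
    def cand_score(c):
        return score(c["frequency"], c["heading_presence"], c["emphasis"])
    kept = []
    for cand in sorted(candidates, key=cand_score, reverse=True):
        slug = slugify(cand["name"])
        # Drop a slug within distance 2 of one that already ranks higher.
        if all(levenshtein(slug, seen) > 2 for seen, _ in kept):
            kept.append((slug, cand_score(cand)))
    return kept[:limit]
```

A distance threshold of 2 is generous for short slugs: "seo" and "sem" would be treated as duplicates. Scaling the threshold to slug length is one possible refinement.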

Output Format


Tag Suggestions: [Post Title]

| Rank | Tag | Score | Source |
| --- | --- | --- | --- |
| 1 | content-marketing | 18 | H2 + 6 mentions |
| 2 | seo-strategy | 14 | H3 + 4 mentions |
| 3 | keyword-research | 11 | 5 mentions + bold |

Suggested Categories

  • Primary: [best-fit category]
  • Secondary: [optional second category]

CMS Adapters

Adapter Overview

| CMS | API Type | Auth Method | Tags Model |
| --- | --- | --- | --- |
| WordPress | REST | Application Passwords (base64) | First-class entities with IDs |
| Shopify | GraphQL (Admin API) | Admin API access token | String array on Article |
| Ghost | REST (Admin API) | API key with JWT signing | First-class entities |
| Strapi | REST or GraphQL | API token (Bearer) | User-defined content type |
| Sanity | GROQ / Mutations | Project token (Bearer) | Document type |

WordPress Adapter

List tags:
GET {CMS_URL}/wp-json/wp/v2/tags?per_page=100&search={keyword}
Authorization: Basic {base64(username:app_password)}
Create tag:
POST {CMS_URL}/wp-json/wp/v2/tags
Body: {"name": "Tag Name", "slug": "tag-name", "description": "Optional"}
List categories (hierarchical, supports parent field):
GET {CMS_URL}/wp-json/wp/v2/categories?per_page=100
Create category:
POST {CMS_URL}/wp-json/wp/v2/categories
Body: {"name": "Category", "slug": "category", "parent": 0}
Assign tags to post:
POST {CMS_URL}/wp-json/wp/v2/posts/{id}
Body: {"tags": [1, 2, 3], "categories": [4]}
Pagination: per_page is capped at 100, so read the X-WP-TotalPages response header and request page=2, page=3, and so on until the full listing is retrieved.
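
The auth header and the pagination loop can be sketched with the standard library. The function names are hypothetical, and the injectable `fetch` parameter is a design choice of this sketch so that the loop can be exercised without a live site.

```python
import base64
import json
import urllib.request

def basic_auth_header(username: str, app_password: str) -> dict:
    """Build the Basic auth header from a WordPress application password."""
    raw = f"{username}:{app_password}".encode("utf-8")
    return {"Authorization": "Basic " + base64.b64encode(raw).decode("ascii")}

def _http_fetch(url: str, headers: dict):
    """Default fetcher: returns (parsed JSON body, response headers)."""
    req = urllib.request.Request(url, headers=headers)
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp), resp.headers

def list_all_tags(cms_url: str, headers: dict, fetch=None) -> list:
    """Collect every tag by following the X-WP-TotalPages header."""
    fetch = fetch or _http_fetch
    tags, page, total_pages = [], 1, 1
    while page <= total_pages:
        url = f"{cms_url}/wp-json/wp/v2/tags?per_page=100&page={page}"
        items, resp_headers = fetch(url, headers)
        tags.extend(items)
        # The header is re-read on every page in case the total changes.
        total_pages = int(resp_headers.get("X-WP-TotalPages", 1))
        page += 1
    return tags
```

A call might look like `list_all_tags(cms_url, basic_auth_header(user, app_password))`, where the three values come from the caller's own configuration.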

Shopify Adapter

Tags on Shopify are string arrays on the Article object, not first-class entities.
Update article tags (GraphQL Admin API):
graphql
mutation {
  articleUpdate(id: "gid://shopify/Article/123", article: {
    tags: ["tag-one", "tag-two", "tag-three"]
  }) {
    article { id tags }
    userErrors { field message }
  }
}
List all tags in use (GraphQL):
graphql
{
  articles(first: 250) {
    edges {
      node { id title tags }
    }
  }
}
Auth header:
X-Shopify-Access-Token: {token}
Note: Shopify marked the REST Admin API as legacy on October 1, 2024. New public apps have been required to use the GraphQL Admin API since April 1, 2025.
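
Because Shopify has no tag entity, the set of tags in use has to be derived from the articles themselves. The sketch below tallies tags from the JSON body of the articles query shown above. `count_tags` is a hypothetical name, and the response shape is assumed to match that query exactly.

```python
from collections import Counter

def count_tags(response: dict) -> Counter:
    """Tally tag usage from the JSON body returned by the articles query."""
    counts = Counter()
    for edge in response["data"]["articles"]["edges"]:
        # Each article carries its tags as a plain list of strings.
        counts.update(edge["node"]["tags"])
    return counts
```

A single query returns at most the requested 250 articles. For larger blogs, the query would need to request `pageInfo { hasNextPage endCursor }` and pass the cursor back through the `after` argument, merging the counters from each page.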

Ghost Adapter

List tags:
GET {CMS_URL}/ghost/api/admin/tags/?limit=all
Authorization: Ghost {jwt_token}
Create tag:
POST {CMS_URL}/ghost/api/admin/tags/
Body: {"tags": [{"name": "Tag Name", "slug": "tag-name"}]}
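JWT generation: split the Admin API key (id:secret format) into its two parts, put the id in the token header as kid, and sign with HS256 using the hex-decoded secret. Claims: iat = now, exp = at most 5 minutes after iat, aud = /admin/.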
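
Token generation can be sketched with the standard library alone. This follows the commonly documented Ghost scheme (HS256, key id in the `kid` header, hex-decoded secret), which is an assumption to verify against the Ghost version in use. `ghost_admin_token` is a hypothetical name, and the `now` parameter exists only to make the output reproducible.

```python
import base64
import hashlib
import hmac
import json
import time

def _b64url(data: bytes) -> str:
    """Base64url without padding, as JWTs require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")

def ghost_admin_token(admin_api_key: str, now: int = None) -> str:
    """Build a short-lived JWT from an 'id:secret' Admin API key."""
    key_id, secret = admin_api_key.split(":")
    iat = int(time.time()) if now is None else now
    header = {"alg": "HS256", "typ": "JWT", "kid": key_id}
    payload = {"iat": iat, "exp": iat + 5 * 60, "aud": "/admin/"}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    # Assumption: the secret half of the key is hex and must be decoded.
    signature = hmac.new(bytes.fromhex(secret),
                         signing_input.encode("ascii"),
                         hashlib.sha256).digest()
    return f"{signing_input}.{_b64url(signature)}"
```

The result goes into the header as `Authorization: Ghost {jwt_token}`. Since the token expires after five minutes, generating a fresh one per sync run is simpler than caching.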

Strapi Adapter

Endpoints are auto-generated from content types. Typical setup:
GET {CMS_URL}/api/tags?pagination[pageSize]=100
POST {CMS_URL}/api/tags
Body: {"data": {"name": "Tag Name", "slug": "tag-name"}}
Authorization: Bearer {api_token}
Strapi v4+ uses the data wrapper. Check your content type schema for field names.

Sanity Adapter

Query tags (GROQ):
*[_type == "tag"] { _id, name, slug }
Create tag (Mutations API):
POST https://{project_id}.api.sanity.io/v2024-01-01/data/mutate/{dataset}
Body: {"mutations": [{"create": {"_type": "tag", "name": "Tag", "slug": {"current": "tag"}}}]}
Authorization: Bearer {token}

Taxonomy Audit Workflow

Step 1: Inventory

Scan all posts in the target directory (or fetch from CMS). Build a map:
  • tag_name -> [list of post files/IDs using this tag]
  • category_name -> [list of post files/IDs]
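
The two maps can be built in a single pass. The sketch assumes each post has already been parsed into a dict with `id`, `tags`, and `categories` keys. That shape and the name `build_inventory` are hypothetical.

```python
from collections import defaultdict

def build_inventory(posts: list) -> tuple:
    """Return (tag_map, category_map), each mapping a name to post IDs."""
    tag_map, category_map = defaultdict(list), defaultdict(list)
    for post in posts:
        for tag in post.get("tags", []):
            tag_map[tag].append(post["id"])
        for category in post.get("categories", []):
            category_map[category].append(post["id"])
    return dict(tag_map), dict(category_map)
```

An inventory built only from posts cannot contain a tag with zero posts, so detecting orphan tags also requires the CMS's own tag listing.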

Step 2: Health Checks

| Check | Threshold | Action |
| --- | --- | --- |
| Thin tag archives | < 5 posts per tag | Recommend noindex or merge |
| Orphan tags | 0 posts | Recommend deletion |
| Tag bloat | > 50 total tags | Recommend consolidation |
| Category depth | > 3 levels | Recommend flattening |
| Uncategorized posts | No category assigned | Assign to appropriate category |
| Duplicate slugs | Same slug, different name | Merge into canonical version |
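
Four of these checks reduce to simple counting, as the sketch below shows. It takes a tag-to-posts map, the full list of tags known to the CMS, and the parsed posts. All names and input shapes are hypothetical, and category depth and duplicate slugs are left out for brevity.

```python
THIN_TAG_MAX = 5      # tags with fewer posts than this are "thin"
TAG_BLOAT_LIMIT = 50  # more total tags than this counts as bloat

def run_health_checks(tag_map: dict, known_tags: list, posts: list) -> dict:
    """Flag orphan tags, thin tags, tag bloat, and uncategorized posts."""
    # Include tags the CMS knows about even if no post uses them.
    usage = {tag: len(tag_map.get(tag, []))
             for tag in set(known_tags) | set(tag_map)}
    return {
        "orphan": sorted(t for t, n in usage.items() if n == 0),
        "thin": sorted(t for t, n in usage.items() if 0 < n < THIN_TAG_MAX),
        "bloat": len(usage) > TAG_BLOAT_LIMIT,
        "uncategorized": [p["id"] for p in posts if not p.get("categories")],
    }
```

Orphan tags are reported separately from thin tags here, so a tag with zero posts never appears in both lists.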

Step 3: Recommendations

Group findings by priority:
  • Critical: orphan tags creating empty archive pages (crawl waste)
  • High: thin tags with < 3 posts (poor user experience, weak SEO signal)
  • Medium: tag bloat over 50 (diluted taxonomy, harder to navigate)
  • Low: naming inconsistencies (mixed case, hyphen vs space)

Output Format


Taxonomy Audit: [Site/Directory]

Total tags: [n] | Total categories: [n]
Healthy: [n] | Thin: [n] | Orphan: [n]

Critical Issues

  • [orphan tags list]

Recommendations

  1. Merge [tag-a] and [tag-b] (same topic, [n] combined posts)
  2. Delete orphan tags: [list]
  3. Add noindex to tag archives with < 5 posts

Site-Wide Guidelines

  • Aim for 5-10 main categories per site (broad topics)
  • Tags should have at least 5 posts before creating an archive page
  • Use consistent slug format: lowercase, hyphen-separated
  • Every post needs exactly 1 primary category
  • Tags per post: 3-8 recommended, never exceed 15
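
The per-post guidelines can be checked mechanically, as in the sketch below. The function name, the message wording, and the choice to pass primary categories as a list are all hypothetical.

```python
import re

# Lowercase alphanumeric runs separated by single hyphens.
SLUG_PATTERN = re.compile(r"^[a-z0-9]+(?:-[a-z0-9]+)*$")

def check_post(tags: list, primary_categories: list) -> list:
    """Return a list of guideline violations for one post."""
    problems = [f"bad slug: {t}" for t in tags if not SLUG_PATTERN.match(t)]
    if len(primary_categories) != 1:
        problems.append("needs exactly 1 primary category")
    if len(tags) > 15:
        problems.append("more than 15 tags")
    elif not 3 <= len(tags) <= 8:
        problems.append("outside the recommended 3-8 tags")
    return problems
```

An empty list means the post follows the guidelines. The 3-8 range is reported as a recommendation, while exceeding 15 tags is reported as a hard limit.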

Environment Variables

| Variable | Purpose | Example |
| --- | --- | --- |
| CMS_TYPE | Platform identifier | wordpress, shopify, ghost, strapi, sanity |
| CMS_URL | Base URL of the CMS | https://example.com |
| CMS_API_KEY | Authentication credential | Application password, API token, or key |

These must be set in the shell environment. Never store credentials in files or commit them to version control. The skill reads them via $CMS_TYPE, $CMS_URL, and $CMS_API_KEY at runtime.

Error Handling

  • Missing environment variables: If CMS_TYPE, CMS_URL, or CMS_API_KEY is unset, report which variable is missing and provide the expected format
  • Invalid credentials: If the CMS API returns 401/403, report "Authentication failed - check CMS_API_KEY" and do not retry
  • Connection timeouts: If the CMS endpoint is unreachable after 10 seconds, report the timeout and suggest checking CMS_URL
  • Duplicate tag slugs: If a tag already exists on the CMS, skip creation and note "Tag already exists: [name]"
  • Rate limits: If the CMS API returns 429, wait and retry once. Report if the limit persists
  • Unsupported CMS: If CMS_TYPE is not one of the 5 supported platforms, list the valid options and exit
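
Several of these rules can be sketched as two small functions: one that validates the environment before any request is made, and one that maps an HTTP status to the next action. The function names and return strings are hypothetical. Only the variable names, status codes, and retry-once rule come from the list above.

```python
import os

SUPPORTED = ("wordpress", "shopify", "ghost", "strapi", "sanity")
REQUIRED = ("CMS_TYPE", "CMS_URL", "CMS_API_KEY")

def load_config(env=None) -> dict:
    """Validate the required variables; raise ValueError naming any problem."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise ValueError("Missing environment variables: " + ", ".join(missing))
    if env["CMS_TYPE"] not in SUPPORTED:
        raise ValueError("Unsupported CMS_TYPE; valid options: " + ", ".join(SUPPORTED))
    return {name: env[name] for name in REQUIRED}

def action_for_status(status: int, already_retried: bool = False) -> str:
    """Decide what to do with an HTTP status from the CMS API."""
    if status in (401, 403):
        # Credentials will not fix themselves, so never retry.
        return "fail: Authentication failed - check CMS_API_KEY"
    if status == 429:
        return "fail: rate limit persists" if already_retried else "retry-once"
    return "ok" if 200 <= status < 300 else f"fail: HTTP {status}"
```

Passing `env` explicitly keeps the validation testable without touching the real shell environment.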