seo-technical
Technical SEO
Audit and fix the layer beneath the content: how search engines crawl, render, index, and trust a site. Stack-agnostic.
When to use
- Site-wide audit before or after a migration
- Investigating indexing or ranking drops
- Setting up SEO foundations on a new site
- Auditing Core Web Vitals or page experience signals
- Fixing crawl waste, redirect chains, or canonical issues
- Setting up multilingual or multi-regional sites
When NOT to use
- Single-page on-page optimization (use seo-onpage)
- Keyword strategy or content planning (use seo-keyword)
- Competitor backlink or SERP analysis (use seo-competitor)
- Pure performance optimization without SEO context (use performance-optimization)
Required inputs
- The site URL or staging URL
- Access to (at minimum) view the rendered HTML, robots.txt, and sitemap
- Ideally: search console access, server logs, and a crawler
If the site is large (10K+ URLs), confirm whether the audit is a full crawl or a sample.
The framework: 6 layers
Technical SEO has six layers, stacked. A failure in a lower layer breaks everything above it.
1. Crawlability
Can search engines access the URLs?
- robots.txt does not block important paths
- No accidental noindex on indexable pages
- No accidental disallow patterns blocking CSS or JS (rendering breaks)
- Sitemap is present, returns 200, and lists canonical URLs only
- Sitemap is referenced in robots.txt
- No infinite spaces (faceted nav generating endless URLs)
- Crawl budget is not wasted on low-value URLs
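These crawlability checks can be partially automated. A minimal sketch using Python's stdlib `urllib.robotparser`, run against an inlined robots.txt — the file contents, paths, and domain below are hypothetical examples, not a definitive implementation:

```python
# Check a robots.txt for accidental blocks and a declared sitemap.
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content for illustration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /search

Sitemap: https://example.com/sitemap.xml
"""

IMPORTANT_PATHS = ["/", "/products/widget", "/blog/post-1"]
ASSET_PATHS = ["/static/app.js", "/static/app.css"]  # blocking these breaks rendering

rp = RobotFileParser()
rp.parse(ROBOTS_TXT.splitlines())

# Any important or asset path Googlebot cannot fetch is a finding.
blocked = [p for p in IMPORTANT_PATHS + ASSET_PATHS
           if not rp.can_fetch("Googlebot", f"https://example.com{p}")]
sitemaps = rp.site_maps() or []

print("blocked:", blocked)                  # should be empty
print("sitemap declared:", bool(sitemaps))  # should be True
```

The same loop extends naturally to a list of URLs from a crawl; sitemap status codes and contents still need a separate fetch.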
2. Indexability
Of crawlable URLs, which should be indexed?
- One canonical URL per piece of content (no duplicates)
- Canonical tags self-reference on canonical pages
- noindex on staging, search results, filter pages, thank-you pages, internal admin
- No mixed signals (canonical pointing one way, sitemap another, internal links a third)
- Pagination handled correctly (rel=next/prev is deprecated, but consistent canonicals matter)
- Parameter handling deliberate (UTM, session IDs, sort orders)
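The mixed-signal check can be sketched for a single URL with only the stdlib HTML parser; the page HTML, URL, and sitemap set here are hypothetical, and a real audit would run this over full crawl output:

```python
# Flag mixed canonical signals for one URL.
from html.parser import HTMLParser

class CanonicalFinder(HTMLParser):
    """Extract the first <link rel="canonical" href="..."> from a page."""
    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "link" and a.get("rel") == "canonical" and self.canonical is None:
            self.canonical = a.get("href")

# Hypothetical crawl data.
page_url = "https://example.com/blog/post?utm_source=news"
page_html = '<html><head><link rel="canonical" href="https://example.com/blog/post"></head></html>'
sitemap_urls = {"https://example.com/blog/post"}

finder = CanonicalFinder()
finder.feed(page_html)

issues = []
if finder.canonical is None:
    issues.append("no canonical tag")
elif finder.canonical not in sitemap_urls:
    issues.append("canonical not in sitemap")          # mixed signal
elif finder.canonical != page_url.split("?")[0]:
    issues.append("canonical differs from parameter-stripped URL")

print(issues or "canonical consistent")
```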
3. Rendering
Does the rendered HTML match what crawlers see?
- Critical content visible without JavaScript (or properly server-rendered)
- For SPAs: confirm Googlebot sees the rendered content (test with the URL Inspection tool)
- No cloaking (showing different content to bots vs users)
- Lazy-loaded content has proper loading attributes
- Hydration errors do not strip content from the rendered DOM
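A rough way to test the first point: strip scripts and tags from the server-returned HTML and confirm critical phrases survive without JavaScript execution. The pages and phrases below are hypothetical, and this does not replace the URL Inspection tool:

```python
# Check that critical content is present in server-rendered HTML (no JS).
import re

def visible_text(html: str) -> str:
    """Crudely strip <script> blocks and tags, leaving visible text."""
    html = re.sub(r"<script.*?</script>", " ", html, flags=re.S)
    return re.sub(r"<[^>]+>", " ", html)

# Hypothetical responses: a server-rendered page vs a JS-only SPA shell.
server_html = "<html><body><h1>Widget Pro</h1><p>Buy the Widget Pro today.</p></body></html>"
spa_shell = '<html><body><div id="root"></div><script src="app.js"></script></body></html>'

critical_phrases = ["Widget Pro"]
assert all(p in visible_text(server_html) for p in critical_phrases)    # SSR page passes
assert not any(p in visible_text(spa_shell) for p in critical_phrases)  # empty shell fails
```

An empty shell is not automatically a failure — Google does render JavaScript — but it means indexing now depends on the rendering queue and on hydration working, which is exactly what this layer audits.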
4. Site architecture
Is the site structured for both users and crawlers?
- Clear URL hierarchy that mirrors site structure
- Important pages reachable in 3 clicks or fewer from the homepage
- Internal linking distributes authority logically
- Breadcrumb navigation present and marked up with schema
- No orphan pages (pages with no internal links)
- No redirect chains (one redirect max)
- No 4xx errors on internally-linked URLs
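Redirect chains can be detected directly from crawl output. A sketch, assuming `redirects` holds recorded 3xx source→target pairs (the URLs are hypothetical):

```python
# Find redirect chains: any URL needing more than one hop to resolve.
redirects = {
    "http://example.com/a": "https://example.com/a",        # http -> https
    "https://example.com/a": "https://example.com/a/",      # then adds slash: 2 hops
    "https://example.com/old": "https://example.com/new",   # single hop, fine
}

def chain(url: str, redirects: dict, limit: int = 10) -> list:
    """Follow redirects from url; limit guards against loops."""
    hops = []
    while url in redirects and len(hops) < limit:
        url = redirects[url]
        hops.append(url)
    return hops

chains = {u: chain(u, redirects) for u in redirects}
too_long = {u: c for u, c in chains.items() if len(c) > 1}
print(too_long)  # each of these should redirect straight to its final target
```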
5. Structured data and signals
Does the site speak crawler language?
- Schema.org markup on appropriate page types
- JSON-LD format (preferred over microdata)
- Validates in the Rich Results Test
- Organization or LocalBusiness schema on the homepage or about page
- BreadcrumbList schema on nested pages
- Author and publisher schema linked correctly on content pages
- llms.txt present at the root (for AI crawlers, see seo-aeo-geo)
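As an illustration, BreadcrumbList markup in JSON-LD can be generated from a breadcrumb trail. The trail below is hypothetical, and real output should still be validated in the Rich Results Test:

```python
# Emit BreadcrumbList JSON-LD for a nested page.
import json

def breadcrumb_jsonld(trail):
    """trail: list of (name, url) pairs from homepage down to the current page."""
    return {
        "@context": "https://schema.org",
        "@type": "BreadcrumbList",
        "itemListElement": [
            {"@type": "ListItem", "position": i, "name": name, "item": url}
            for i, (name, url) in enumerate(trail, start=1)
        ],
    }

trail = [
    ("Home", "https://example.com/"),
    ("Blog", "https://example.com/blog/"),
    ("Technical SEO", "https://example.com/blog/technical-seo/"),
]
script_tag = (
    '<script type="application/ld+json">'
    + json.dumps(breadcrumb_jsonld(trail))
    + "</script>"
)
print(script_tag)  # paste into <head> of the nested page
```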
6. Page experience and security
Does the site meet the page experience baseline?
- HTTPS on all pages, no mixed content
- HSTS header set
- Core Web Vitals pass (LCP, INP, CLS within thresholds)
- Mobile-friendly (responsive, no horizontal scroll, tap targets sized correctly)
- No intrusive interstitials on mobile
- Stable URL structure (no random URL changes between deploys)
- 404 pages return 404, not 200 with "page not found" content (soft 404)
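The soft-404 check lends itself to automation: flag any URL that returns 200 but whose body reads like an error page. The status codes, bodies, and hint phrases below are hypothetical examples:

```python
# Flag soft 404s: 200 responses whose body looks like a "not found" page.
NOT_FOUND_HINTS = ("page not found", "doesn't exist", "no longer available")

def is_soft_404(status: int, body: str) -> bool:
    return status == 200 and any(h in body.lower() for h in NOT_FOUND_HINTS)

# Hypothetical crawl results: URL -> (status, body).
pages = {
    "https://example.com/gone":    (200, "<h1>Page not found</h1>"),  # soft 404
    "https://example.com/missing": (404, "<h1>Page not found</h1>"),  # correct 404
    "https://example.com/ok":      (200, "<h1>Widget Pro</h1>"),
}
soft = [url for url, (status, body) in pages.items() if is_soft_404(status, body)]
print(soft)
```

Phrase matching is a heuristic; very thin 200 pages with no error copy still need a manual look.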
Workflow
- Define scope. Whole site, a subfolder, a migration check, or a specific issue.
- Confirm access. What can you actually see (HTML, robots, sitemap, search console, server logs, staging)?
- Crawl. Use a crawler to enumerate URLs and statuses. Sample if the site is huge.
- Run the 6-layer framework. Score each, note specific issues with example URLs.
- Cross-reference. Search console for what's actually indexed. Compare to sitemap and crawl output.
- Prioritize. Critical (blocks indexing or causes traffic loss), Important (suboptimal), Nice-to-have (polish).
- Write the report. Use the template in references/audit-template.md.
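The crawl and cross-reference steps produce an internal-link graph, over which the architecture checks run. A sketch of two of them — click depth from the homepage and orphan detection — on a hypothetical graph:

```python
# BFS click depth from the homepage over an internal-link graph.
from collections import deque

# Hypothetical graph: page -> pages it links to. Keys are all known URLs
# (e.g. from the sitemap), so unreachable keys are orphans.
links = {
    "/": ["/products/", "/blog/"],
    "/products/": ["/products/widget"],
    "/blog/": ["/blog/archive/"],
    "/blog/archive/": ["/blog/archive/2019/"],
    "/blog/archive/2019/": ["/blog/archive/2019/old-post"],
    "/orphan-page": [],  # known URL with no inbound internal links
}

depth = {"/": 0}
queue = deque(["/"])
while queue:
    page = queue.popleft()
    for nxt in links.get(page, []):
        if nxt not in depth:
            depth[nxt] = depth[page] + 1
            queue.append(nxt)

too_deep = [u for u, d in depth.items() if d > 3]  # beyond 3 clicks
orphans = set(links) - set(depth)                  # never reached from "/"
print("too deep:", too_deep)
print("orphans:", orphans)
```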
Failure patterns
- Optimizing rankings on a page that is noindex. Always check indexability before content work.
- Adding sitemaps without fixing canonical issues. A sitemap of duplicate URLs is worse than no sitemap.
- Blocking crawlers from CSS or JS. Breaks Google's rendering. Common in over-aggressive robots.txt files.
- Over-relying on canonical tags. Canonicals are hints, not directives. Use redirects when content actually moved.
- Migrating without a redirect map. Single biggest cause of post-migration traffic loss.
- Treating Core Web Vitals as the only ranking signal. Page experience matters but does not override relevance.
Output format
Default output is a markdown audit at seo-technical-audit.md. Structure:
- Scope and methodology
- Executive summary (3 to 5 critical findings)
- 6-layer score
- Critical issues (with example URLs)
- Important issues
- Nice-to-have polish
- Implementation roadmap (sequenced)
For migrations, include a redirect map as a CSV alongside the report.
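The redirect map CSV can be emitted with the stdlib. A sketch with hypothetical old→new pairs; in practice the mapping comes from matching the old and new crawls:

```python
# Write a migration redirect map as CSV.
import csv
import io

# Hypothetical old -> new URL pairs with intended status codes.
mapping = [
    ("https://example.com/old-blog/post-1", "https://example.com/blog/post-1", 301),
    ("https://example.com/old-blog/", "https://example.com/blog/", 301),
]

buf = io.StringIO()  # swap for open("redirect-map.csv", "w", newline="") to write a file
writer = csv.writer(buf)
writer.writerow(["source_url", "target_url", "status"])
writer.writerows(mapping)
print(buf.getvalue())
```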
Reference files
- references/audit-template.md - Fillable technical SEO audit template.
- references/migration-checklist.md - Pre and post-migration checklist (covers the highest-risk scenario).