seo-technical
Technical SEO
Audit and fix the layer beneath the content: how search engines crawl, render, index, and trust a site. Stack-agnostic.
When to use
- Site-wide audit before or after a migration
- Investigating indexing or ranking drops
- Setting up SEO foundations on a new site
- Auditing Core Web Vitals or page experience signals
- Fixing crawl waste, redirect chains, or canonical issues
- Setting up multilingual or multi-regional sites
When NOT to use
- Single-page on-page optimization (use seo-onpage)
- Keyword strategy or content planning (use seo-keyword)
- Competitor backlink or SERP analysis (use seo-competitor)
- Pure performance optimization without SEO context (use performance-optimization)
Required inputs
- The site URL or staging URL
- Access to (at minimum) view the rendered HTML, robots.txt, and sitemap
- Ideally: search console access, server logs, and a crawler
If the site is large (10K+ URLs), confirm whether the audit is a full crawl or a sample.
The framework: 6 layers
Technical SEO has six layers, stacked. A failure in a lower layer breaks everything above it.
1. Crawlability
Can search engines access the URLs?
- robots.txt does not block important paths
- No accidental noindex on indexable pages
- No accidental disallow patterns blocking CSS or JS (rendering breaks)
- Sitemap is present, returns 200, and lists canonical URLs only
- Sitemap is referenced in robots.txt
- No infinite spaces (faceted nav generating endless URLs)
- Crawl budget is not wasted on low-value URLs
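These crawlability checks can be partially automated. A minimal sketch using Python's stdlib `urllib.robotparser`, run against an inlined robots.txt — the file contents, paths, and domain below are hypothetical examples, not a definitive implementation:

```python
# Check a robots.txt for accidental blocks and a declared sitemap.
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content for illustration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /search

Sitemap: https://example.com/sitemap.xml
"""

IMPORTANT_PATHS = ["/", "/products/widget", "/blog/post-1"]
ASSET_PATHS = ["/static/app.js", "/static/app.css"]  # blocking these breaks rendering

rp = RobotFileParser()
rp.parse(ROBOTS_TXT.splitlines())

# Any important or asset path Googlebot cannot fetch is a finding.
blocked = [p for p in IMPORTANT_PATHS + ASSET_PATHS
           if not rp.can_fetch("Googlebot", f"https://example.com{p}")]
sitemaps = rp.site_maps() or []

print("blocked:", blocked)                  # should be empty
print("sitemap declared:", bool(sitemaps))  # should be True
```

The same loop extends naturally to a list of URLs from a crawl; sitemap status codes and contents still need a separate fetch.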
2. Indexability
Of crawlable URLs, which should be indexed?
- One canonical URL per piece of content (no duplicates)
- Canonical tags self-reference on canonical pages
- noindex on staging, search results, filter pages, thank-you pages, internal admin
- No mixed signals (canonical pointing one way, sitemap another, internal links a third)
- Pagination handled correctly (rel=next/prev is deprecated, but consistent canonicals matter)
- Parameter handling deliberate (UTM, session IDs, sort orders)
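The mixed-signal check can be sketched for a single URL with only the stdlib HTML parser; the page HTML, URL, and sitemap set here are hypothetical, and a real audit would run this over full crawl output:

```python
# Flag mixed canonical signals for one URL.
from html.parser import HTMLParser

class CanonicalFinder(HTMLParser):
    """Extract the first <link rel="canonical" href="..."> from a page."""
    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "link" and a.get("rel") == "canonical" and self.canonical is None:
            self.canonical = a.get("href")

# Hypothetical crawl data.
page_url = "https://example.com/blog/post?utm_source=news"
page_html = '<html><head><link rel="canonical" href="https://example.com/blog/post"></head></html>'
sitemap_urls = {"https://example.com/blog/post"}

finder = CanonicalFinder()
finder.feed(page_html)

issues = []
if finder.canonical is None:
    issues.append("no canonical tag")
elif finder.canonical not in sitemap_urls:
    issues.append("canonical not in sitemap")          # mixed signal
elif finder.canonical != page_url.split("?")[0]:
    issues.append("canonical differs from parameter-stripped URL")

print(issues or "canonical consistent")
```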
3. Rendering
Does the rendered HTML match what crawlers see?
- Critical content visible without JavaScript (or properly server-rendered)
- For SPAs: confirm Googlebot sees the rendered content (test with the URL Inspection tool)
- No cloaking (showing different content to bots vs users)
- Lazy-loaded content has proper loading attributes
- Hydration errors do not strip content from the rendered DOM
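A rough way to test the first point: strip scripts and tags from the server-returned HTML and confirm critical phrases survive without JavaScript execution. The pages and phrases below are hypothetical, and this does not replace the URL Inspection tool:

```python
# Check that critical content is present in server-rendered HTML (no JS).
import re

def visible_text(html: str) -> str:
    """Crudely strip <script> blocks and tags, leaving visible text."""
    html = re.sub(r"<script.*?</script>", " ", html, flags=re.S)
    return re.sub(r"<[^>]+>", " ", html)

# Hypothetical responses: a server-rendered page vs a JS-only SPA shell.
server_html = "<html><body><h1>Widget Pro</h1><p>Buy the Widget Pro today.</p></body></html>"
spa_shell = '<html><body><div id="root"></div><script src="app.js"></script></body></html>'

critical_phrases = ["Widget Pro"]
assert all(p in visible_text(server_html) for p in critical_phrases)    # SSR page passes
assert not any(p in visible_text(spa_shell) for p in critical_phrases)  # empty shell fails
```

An empty shell is not automatically a failure — Google does render JavaScript — but it means indexing now depends on the rendering queue and on hydration working, which is exactly what this layer audits.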
4. Site architecture
Is the site structured for both users and crawlers?
- Clear URL hierarchy that mirrors site structure
- Important pages reachable in 3 clicks or fewer from the homepage
- Internal linking distributes authority logically
- Breadcrumb navigation present and marked up with schema
- No orphan pages (pages with no internal links)
- No redirect chains (one redirect max)
- No 4xx errors on internally-linked URLs
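Redirect chains can be detected directly from crawl output. A sketch, assuming `redirects` holds recorded 3xx source→target pairs (the URLs are hypothetical):

```python
# Find redirect chains: any URL needing more than one hop to resolve.
redirects = {
    "http://example.com/a": "https://example.com/a",        # http -> https
    "https://example.com/a": "https://example.com/a/",      # then adds slash: 2 hops
    "https://example.com/old": "https://example.com/new",   # single hop, fine
}

def chain(url: str, redirects: dict, limit: int = 10) -> list:
    """Follow redirects from url; limit guards against loops."""
    hops = []
    while url in redirects and len(hops) < limit:
        url = redirects[url]
        hops.append(url)
    return hops

chains = {u: chain(u, redirects) for u in redirects}
too_long = {u: c for u, c in chains.items() if len(c) > 1}
print(too_long)  # each of these should redirect straight to its final target
```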
5. Structured data and signals
Does the site speak crawler language?
- Schema.org markup on appropriate page types
- JSON-LD format (preferred over microdata)
- Validates in the Rich Results Test
- Organization or LocalBusiness schema on the homepage or about page
- BreadcrumbList schema on nested pages
- Author and publisher schema linked correctly on content pages
- llms.txt present at the root (for AI crawlers, see seo-aeo-geo)
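As an illustration, BreadcrumbList markup in JSON-LD can be generated from a breadcrumb trail. The trail below is hypothetical, and real output should still be validated in the Rich Results Test:

```python
# Emit BreadcrumbList JSON-LD for a nested page.
import json

def breadcrumb_jsonld(trail):
    """trail: list of (name, url) pairs from homepage down to the current page."""
    return {
        "@context": "https://schema.org",
        "@type": "BreadcrumbList",
        "itemListElement": [
            {"@type": "ListItem", "position": i, "name": name, "item": url}
            for i, (name, url) in enumerate(trail, start=1)
        ],
    }

trail = [
    ("Home", "https://example.com/"),
    ("Blog", "https://example.com/blog/"),
    ("Technical SEO", "https://example.com/blog/technical-seo/"),
]
script_tag = (
    '<script type="application/ld+json">'
    + json.dumps(breadcrumb_jsonld(trail))
    + "</script>"
)
print(script_tag)  # paste into <head> of the nested page
```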
6. Page experience and security
Does the site meet the page experience baseline?
- HTTPS on all pages, no mixed content
- HSTS header set
- Core Web Vitals pass (LCP, INP, CLS within thresholds)
- Mobile-friendly (responsive, no horizontal scroll, tap targets sized correctly)
- No intrusive interstitials on mobile
- Stable URL structure (no random URL changes between deploys)
- 404 pages return 404, not 200 with "page not found" content (soft 404)
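The soft-404 check lends itself to automation: flag any URL that returns 200 but whose body reads like an error page. The status codes, bodies, and hint phrases below are hypothetical examples:

```python
# Flag soft 404s: 200 responses whose body looks like a "not found" page.
NOT_FOUND_HINTS = ("page not found", "doesn't exist", "no longer available")

def is_soft_404(status: int, body: str) -> bool:
    return status == 200 and any(h in body.lower() for h in NOT_FOUND_HINTS)

# Hypothetical crawl results: URL -> (status, body).
pages = {
    "https://example.com/gone":    (200, "<h1>Page not found</h1>"),  # soft 404
    "https://example.com/missing": (404, "<h1>Page not found</h1>"),  # correct 404
    "https://example.com/ok":      (200, "<h1>Widget Pro</h1>"),
}
soft = [url for url, (status, body) in pages.items() if is_soft_404(status, body)]
print(soft)
```

Phrase matching is a heuristic; very thin 200 pages with no error copy still need a manual look.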
Workflow
- Define scope. Whole site, a subfolder, a migration check, or a specific issue.
- Confirm access. What can you actually see (HTML, robots, sitemap, search console, server logs, staging)?
- Crawl. Use a crawler to enumerate URLs and statuses. Sample if the site is huge.
- Run the 6-layer framework. Score each, note specific issues with example URLs.
- Cross-reference. Search console for what's actually indexed. Compare to sitemap and crawl output.
- Prioritize. Critical (blocks indexing or causes traffic loss), Important (suboptimal), Nice-to-have (polish).
- Write the report. Use the template in references/audit-template.md.
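The crawl and cross-reference steps produce an internal-link graph, over which the architecture checks run. A sketch of two of them — click depth from the homepage and orphan detection — on a hypothetical graph:

```python
# BFS click depth from the homepage over an internal-link graph.
from collections import deque

# Hypothetical graph: page -> pages it links to. Keys are all known URLs
# (e.g. from the sitemap), so unreachable keys are orphans.
links = {
    "/": ["/products/", "/blog/"],
    "/products/": ["/products/widget"],
    "/blog/": ["/blog/archive/"],
    "/blog/archive/": ["/blog/archive/2019/"],
    "/blog/archive/2019/": ["/blog/archive/2019/old-post"],
    "/orphan-page": [],  # known URL with no inbound internal links
}

depth = {"/": 0}
queue = deque(["/"])
while queue:
    page = queue.popleft()
    for nxt in links.get(page, []):
        if nxt not in depth:
            depth[nxt] = depth[page] + 1
            queue.append(nxt)

too_deep = [u for u, d in depth.items() if d > 3]  # beyond 3 clicks
orphans = set(links) - set(depth)                  # never reached from "/"
print("too deep:", too_deep)
print("orphans:", orphans)
```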
Failure patterns
- Optimizing rankings on a page that is noindex. Always check indexability before content work.
- Adding sitemaps without fixing canonical issues. A sitemap of duplicate URLs is worse than no sitemap.
- Blocking crawlers from CSS or JS. Breaks Google's rendering. Common in over-aggressive robots.txt files.
- Over-relying on canonical tags. Canonicals are hints, not directives. Use redirects when content actually moved.
- Migrating without a redirect map. Single biggest cause of post-migration traffic loss.
- Treating Core Web Vitals as the only ranking signal. Page experience matters but does not override relevance.
Output format
Default output is a markdown audit at seo-technical-audit.md. Structure:
- Scope and methodology
- Executive summary (3 to 5 critical findings)
- 6-layer score
- Critical issues (with example URLs)
- Important issues
- Nice-to-have polish
- Implementation roadmap (sequenced)
For migrations, include a redirect map as a CSV alongside the report.
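The redirect map CSV can be emitted with the stdlib. A sketch with hypothetical old→new pairs; in practice the mapping comes from matching the old and new crawls:

```python
# Write a migration redirect map as CSV.
import csv
import io

# Hypothetical old -> new URL pairs with intended status codes.
mapping = [
    ("https://example.com/old-blog/post-1", "https://example.com/blog/post-1", 301),
    ("https://example.com/old-blog/", "https://example.com/blog/", 301),
]

buf = io.StringIO()  # swap for open("redirect-map.csv", "w", newline="") to write a file
writer = csv.writer(buf)
writer.writerow(["source_url", "target_url", "status"])
writer.writerows(mapping)
print(buf.getvalue())
```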
Reference files
- references/audit-template.md - Fillable technical SEO audit template.
- references/migration-checklist.md - Pre and post-migration checklist (covers the highest-risk scenario).