geo-review
GEO Review
Evaluate how well your application and content are optimized for AI-powered search and answer engines — ChatGPT, Perplexity, Google AI Overviews, Claude, and other generative AI systems that cite web sources. Traditional SEO gets you ranked in a link list; GEO gets you cited in AI-generated answers.
When to use
Use `/geo-review` when:
- Your product is discovered through AI assistants (developer tools, SaaS, APIs)
- You want to appear in Google AI Overviews
- Users find your product by asking AI "what's the best X for Y?"
- You publish documentation, guides, or educational content
- Your competitors are showing up in AI answers and you're not
- Building thought leadership content that AI should reference
- Launching a new product where AI-driven discovery matters
Why GEO Matters Now
GEO 当前的重要性
- 40% of Gen Z uses TikTok and AI chatbots instead of Google for search (Adobe 2024)
- Google AI Overviews now appear for ~30% of search queries, pushing traditional results below the fold
- Perplexity processes 100M+ queries/month, citing web sources in every answer
- ChatGPT with browsing and search is becoming a primary research tool
- AI systems don't rank links — they select and cite sources based on different signals than traditional SEO
- Being the source an AI quotes is the new "position #1"
Standards & Frameworks Referenced
参考的标准与框架
- GEO research (Georgia Tech / Princeton / IIT Delhi, 2024) — "GEO: Generative Engine Optimization"
- Google E-E-A-T — Experience, Expertise, Authoritativeness, Trustworthiness
- Schema.org — Structured data for entity understanding
- llms.txt — Emerging standard for AI crawler instructions (similar to robots.txt for LLMs)
- Retrieval-Augmented Generation (RAG) — How AI systems fetch and cite content
Phase Overview
Phase 1: EDUCATE → How AI search works differently from traditional search
Phase 2: SCOPE → Identify content types, target queries, AI visibility goals
Phase 3: ANALYZE → Content analysis + browser-based AI search validation
Phase 4: REPORT → Findings with citation gap analysis and confidence scores
Phase 5: REMEDIATE → Fix guidance + YAML regression tests

Phase 1: Educate
How AI search is different: Traditional search engines crawl, index, and rank pages by relevance signals (backlinks, keywords, authority). AI answer engines do something fundamentally different — they retrieve content, understand it semantically, and synthesize answers by selecting the most citation-worthy sources. Your content needs to be clear, specific, authoritative, and directly answerable to be selected.
Key insight: AI systems prefer content that makes specific, verifiable claims with supporting evidence. Vague marketing copy is ignored. Concrete statements with data, comparisons, and clear structure get cited.
Phase 2: Scope
Gather context
- Auto-detect from codebase/content:
- Content pages (docs, blog, landing pages, about, pricing, FAQ)
- Existing structured data (JSON-LD, Schema.org)
- Content management approach (static, CMS, MDX, etc.)
- llms.txt presence
- Sitemap and content organization
- Author/expertise signals
- Publication dates and freshness signals
- Ask the user (one at a time):
- Product type: What does your product/site do? (needed to understand AI query context)
- Target URL: Where is the content published?
- Target AI queries: What questions should AI answer with your content? (e.g., "best CI/CD tool for startups", "how to implement OAuth in Node.js")
- Competitors: Who else shows up when AI answers these queries? (optional but valuable)
- Content goals: Documentation? Thought leadership? Product discovery? All of the above?
- Map content landscape:
- Key content pages and their purpose
- Target queries each page should satisfy
- Current AI citation status (test a few queries in ChatGPT/Perplexity)
- Content gaps vs competitors
Phase 3: Analyze
Open a browser session with `new_session` using `record_evidence: true`. Run all applicable check categories.

Category A: Content Citation-Worthiness (CITE)
| Check ID | Check | Principle | Method |
|---|---|---|---|
| CITE-01 | Content contains specific, verifiable claims | GEO research | Scan pages for concrete statements with data/numbers |
| CITE-02 | Statistics and original data are present | GEO research | Check for unique numbers, benchmarks, research findings |
| CITE-03 | Content directly answers target queries | RAG retrieval | Match content against target queries — does it contain direct answers? |
| CITE-04 | Claims have supporting evidence or citations | E-E-A-T | Check for source references, links, data attribution |
| CITE-05 | Content is specific (not generic/vague) | GEO research | Analyze content for specificity vs marketing fluff |
| CITE-06 | Comparison content exists (vs alternatives) | AI preference | Check for "X vs Y" or comparison tables that AI can cite |
| CITE-07 | Content has clear, quotable summary sentences | Citation format | Check if key paragraphs start with citable claims |
| CITE-08 | Unique perspective or data (not regurgitated) | E-E-A-T | Assess originality — does this add something AI can't already synthesize? |
| CITE-09 | Content demonstrates first-hand experience | E-E-A-T (Experience) | Check for case studies, personal experience, real examples |
| CITE-10 | Technical accuracy and depth | E-E-A-T (Expertise) | Assess whether content goes beyond surface level |
Browser validation: Navigate to content pages. Extract text content. Analyze for claim density, statistics, quotable statements. Compare against target queries for direct answer matching.
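The CITE checks above lend themselves to a mechanical first pass before human review. A minimal sketch of a claim-density heuristic over extracted page text; the sentence splitter, the "citable" regex, and the example strings are illustrative assumptions, not part of the check definitions:

```javascript
// Heuristic scorer for CITE-01 / CITE-05: how many sentences carry
// concrete, citable data (numbers, percentages, dollar amounts, years)?
function claimDensity(text) {
  const sentences = text
    .split(/(?<=[.!?])\s+/)
    .map((s) => s.trim())
    .filter((s) => s.length > 0);
  // A sentence counts as "citable" if it contains a specific figure.
  const citable = sentences.filter((s) =>
    /\d+(\.\d+)?%|\$[\d,.]+|\b\d[\d,]*\b/.test(s)
  );
  return {
    total: sentences.length,
    citable: citable.length,
    density: sentences.length ? citable.length / sentences.length : 0,
  };
}

const vague = "We're the fastest platform for modern teams. Loved by everyone.";
const specific =
  "Deploys complete in 47 seconds on average. Used by 2,300 companies since 2019.";
console.log(claimDensity(vague).citable);    // 0
console.log(claimDensity(specific).citable); // 2
```

A near-zero citable count on a landing page is a CITE-01/CITE-05 signal worth flagging; it is a heuristic for triage, not a verdict.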
Category B: Content Structure for AI Retrieval (STRUCT)
| Check ID | Check | Principle | Method |
|---|---|---|---|
| STRUCT-01 | Clear heading hierarchy maps to questions | RAG chunking | Check if H2/H3 headings are question-shaped or topic-clear |
| STRUCT-02 | FAQ sections with direct Q&A format | AI preference | Check for FAQ sections, question-answer pairs |
| STRUCT-03 | Definition/explanation paragraphs lead with the answer | Retrieval | Check if paragraphs front-load the key claim (inverted pyramid) |
| STRUCT-04 | Tables and structured comparisons present | AI preference | Check for HTML tables with clear headers |
| STRUCT-05 | Content is chunked into digestible sections (300-500 words) | RAG chunking | Measure section lengths between headings |
| STRUCT-06 | Lists used for multi-point information | AI preference | Check for ordered/unordered lists for multi-step or multi-item content |
| STRUCT-07 | Code examples are complete and runnable (for technical content) | Developer experience | Check code blocks for completeness and language tags |
| STRUCT-08 | TL;DR or summary at top of long content | Retrieval | Check for executive summary or key takeaways section |
Browser validation: Extract heading structure, count FAQ patterns, measure section lengths, check for tables and lists via DOM inspection.
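STRUCT-01 and STRUCT-05 can both be audited from the extracted heading outline. A sketch assuming the outline has already been reduced to `{heading, wordCount}` pairs (that shape and the interrogative word list are assumptions; the 300-500 word band comes from the check description):

```javascript
// Audit an outline: is each heading question-shaped (STRUCT-01), and is
// each section within the 300-500 word retrieval-friendly band (STRUCT-05)?
const QUESTION_WORDS = /^(how|what|why|when|where|which|who|can|should|does|is|are)\b/i;

function auditSections(sections) {
  return sections.map(({ heading, wordCount }) => ({
    heading,
    questionShaped: QUESTION_WORDS.test(heading) || heading.trim().endsWith('?'),
    sizeOk: wordCount >= 300 && wordCount <= 500,
  }));
}

const report = auditSections([
  { heading: 'How does GEO differ from SEO?', wordCount: 420 },
  { heading: 'Misc', wordCount: 1200 },
]);
console.log(report);
```

Sections that fail both checks (vague heading, oversized body) are the strongest candidates for splitting into question-led chunks.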
Category C: Authority & Trust Signals (AUTH)
| Check ID | Check | Principle | Method |
|---|---|---|---|
| AUTH-01 | Author information present (name, bio, credentials) | E-E-A-T | Check for author bylines, about sections |
| AUTH-02 | Organization/brand identity clear | Entity recognition | Check for About page, consistent branding |
| AUTH-03 | Publication and update dates visible | Freshness | Check for date metadata on content pages |
| AUTH-04 | Sources and references cited | E-E-A-T | Check for outbound links to authoritative sources |
| AUTH-05 | Testimonials/social proof present | Trust | Check for customer quotes, logos, case studies |
| AUTH-06 | Professional contact information available | Trust | Check for contact page, physical address, support channels |
| AUTH-07 | Content recency (updated within last 12 months) | Freshness | Check publish/update dates |
| AUTH-08 | Domain authority indicators (established site) | E-E-A-T | Check site age, about page depth, team page |
Browser validation: Navigate to content pages, about page, author pages. Extract dates, author info, citation links.
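The AUTH-07 freshness rule (updated within the last 12 months) reduces to simple date arithmetic once a publish/update date has been extracted. A sketch assuming ISO-8601 date strings; the function name and the injectable `now` parameter are illustrative:

```javascript
// AUTH-07: content is fresh if its last update is within the past 12 months.
// An unparseable date is treated as a missing freshness signal, not as fresh.
function isFresh(isoDate, now = new Date()) {
  const updated = new Date(isoDate);
  if (Number.isNaN(updated.getTime())) return false;
  const cutoff = new Date(now);
  cutoff.setMonth(cutoff.getMonth() - 12);
  return updated >= cutoff;
}

const now = new Date('2026-02-01');
console.log(isFresh('2025-09-15', now)); // true  (about 5 months old)
console.log(isFresh('2024-01-10', now)); // false (about 2 years old)
```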
Category D: Technical AI Discoverability (TECH)
| Check ID | Check | Principle | Method |
|---|---|---|---|
| TECH-01 | llms.txt present at site root | AI crawler standard | Fetch /llms.txt, check format and content |
| TECH-02 | llms-full.txt with detailed content (if applicable) | AI crawler standard | Fetch /llms-full.txt |
| TECH-03 | JSON-LD structured data with rich entity info | Schema.org | Check for Organization, Product, Article, FAQ schema |
| TECH-04 | Content accessible without JavaScript | RAG crawling | Disable JS, check if content renders |
| TECH-05 | Clean, semantic HTML (not framework soup) | Crawlability | Check for meaningful tags vs div-heavy DOM |
| TECH-06 | robots.txt allows AI crawlers | Discoverability | Check for GPTBot, ClaudeBot, PerplexityBot, Bingbot rules |
| TECH-07 | Sitemap includes content pages with lastmod | Discoverability | Check sitemap for content pages and dates |
| TECH-08 | Open Graph tags help AI understand content | Social + AI | Check OG tags for accurate content description |
| TECH-09 | API documentation is machine-readable (if applicable) | Developer GEO | Check for OpenAPI spec, API reference format |
| TECH-10 | Content is not behind authentication walls | RAG access | Verify key content is publicly accessible |
Browser validation: Fetch llms.txt, check robots.txt for AI bot rules, verify SSR content, inspect structured data.
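The TECH-03 inspection can be sketched as pulling `ld+json` script blocks out of the raw page HTML and collecting their `@type` values (the expected types follow the check's examples; the extraction regex is an assumption and deliberately skips malformed blocks):

```javascript
// TECH-03: collect Schema.org @type values from JSON-LD blocks in raw HTML.
function jsonLdTypes(html) {
  const types = [];
  const re = /<script[^>]*type=["']application\/ld\+json["'][^>]*>([\s\S]*?)<\/script>/gi;
  let m;
  while ((m = re.exec(html)) !== null) {
    try {
      const data = JSON.parse(m[1]);
      for (const node of Array.isArray(data) ? data : [data]) {
        if (node && node['@type']) types.push(node['@type']);
      }
    } catch {
      // Malformed JSON-LD: skipped here, but itself a finding worth reporting.
    }
  }
  return types;
}

const html = `<script type="application/ld+json">
  {"@context":"https://schema.org","@type":"Organization","name":"Acme"}
</script>`;
console.log(jsonLdTypes(html)); // [ 'Organization' ]
```

An empty result on a product or article page maps directly to a TECH-03 finding (no Organization, Product, Article, or FAQ schema found).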
Category E: Entity & Brand Clarity (ENTITY)
| Check ID | Check | Principle | Method |
|---|---|---|---|
| ENTITY-01 | Product/brand name is consistently used | Entity recognition | Check name consistency across pages |
| ENTITY-02 | Clear product category declaration | AI classification | Check if content states "X is a [category]" explicitly |
| ENTITY-03 | Key features/differentiators stated clearly | AI comparison | Check for feature lists, unique value propositions |
| ENTITY-04 | Use case descriptions are specific | AI recommendation | Check for "best for [specific use case]" patterns |
| ENTITY-05 | Pricing/tier information is structured | AI recommendation | Check pricing page for clear, structured plans |
| ENTITY-06 | Integration/compatibility information present | AI recommendation | Check for "works with X" / integration pages |
| ENTITY-07 | Competitor differentiation is factual | AI comparison | Check comparison content for factual (not just marketing) claims |
| ENTITY-08 | Industry/vertical targeting is explicit | AI classification | Check if content targets specific industries/roles |
Browser validation: Navigate key pages and extract product positioning, feature lists, use cases, pricing structure. Check for entity-clear statements.
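ENTITY-02 looks for an explicit "X is a [category]" statement in the copy. A naive sketch; the pattern is an illustrative assumption and will miss many legitimate phrasings, so treat a null result as "needs human review", not as a hard failure:

```javascript
// ENTITY-02: detect an explicit "<Brand> is a/an <category>" declaration
// and return the declared category, or null if none is found.
function declaresCategory(text, brand) {
  // Escape regex metacharacters in the brand name before interpolation.
  const escaped = brand.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
  const re = new RegExp(`\\b${escaped}\\s+is\\s+(a|an)\\s+([a-z][a-z\\s-]{2,60})`, 'i');
  const m = text.match(re);
  return m ? m[2].trim() : null;
}

// "Shiplight" here is a hypothetical brand name used only for illustration.
console.log(declaresCategory('Shiplight is an AI-powered QA platform for web apps.', 'Shiplight'));
console.log(declaresCategory('We move fast and ship things.', 'Shiplight')); // null
```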
Category F: AI Citation Testing (TEST)
This category is unique to GEO — it tests actual AI visibility.
| Check ID | Check | Method |
|---|---|---|
| TEST-01 | Test target queries in Perplexity | Navigate to perplexity.ai, search target queries, check if your site is cited |
| TEST-02 | Test target queries in ChatGPT (if browsing available) | Search via ChatGPT, check citations |
| TEST-03 | Test target queries in Google (check AI Overview) | Google search, check if AI Overview cites your content |
| TEST-04 | Compare citation frequency vs competitors | Count citations for you vs top competitors across queries |
| TEST-05 | Analyze what content IS being cited (from competitors) | Study cited content format, structure, claims |
Browser validation: Use `new_session` to navigate to Perplexity and Google. Search target queries. Screenshot results. Check for citations to the user's domain. This provides real-world evidence of current AI visibility.

Important: TEST category results are the ground truth — they show whether your content is actually being cited, regardless of what the other categories suggest.
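TEST-04's competitor comparison is a tally over per-query citation lists. A sketch assuming each query's citations have already been collected as an array of cited domains (the data shape and domain names are hypothetical):

```javascript
// TEST-04: count how often each tracked domain is cited across target queries.
function citationTally(results, domains) {
  const counts = Object.fromEntries(domains.map((d) => [d, 0]));
  for (const { citations } of results) {
    for (const d of domains) {
      if (citations.includes(d)) counts[d] += 1;
    }
  }
  return counts;
}

const tally = citationTally(
  [
    { query: 'best X for Y', citations: ['competitor-a.com'] },
    { query: 'how to do Z', citations: ['yoursite.com', 'competitor-b.com'] },
  ],
  ['yoursite.com', 'competitor-a.com']
);
console.log(tally); // { 'yoursite.com': 1, 'competitor-a.com': 1 }
```

The resulting counts feed directly into the AI Citation Status table in the Phase 4 report.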
Phase 4: Report
Generate a structured report saved to `shiplight/reports/geo-review-{date}.md`:

```markdown
# GEO Review Report

Date: {date}
URL: {url}
Product type: {description}
Target AI queries tested: {list}

Overall GEO Score: {X}/10 | Confidence: {X}%

## Score Breakdown

| Category | Score | Findings |
|---|---|---|
| Citation-Worthiness (CITE) | 5/10 | 2 high, 2 medium |
| Content Structure (STRUCT) | 6/10 | 1 high, 2 medium |
| Authority Signals (AUTH) | 7/10 | 1 medium |
| Technical Discoverability (TECH) | 4/10 | 1 critical, 2 high |
| Entity Clarity (ENTITY) | 5/10 | 2 high |
| AI Citation Testing (TEST) | 3/10 | Not cited in 4/5 target queries |

## AI Citation Status

| Target Query | Perplexity | Google AI Overview | Cited? | Competitor Cited? |
|---|---|---|---|---|
| "best X for Y" | Not cited | Not in overview | ❌ | CompetitorA: ✅ |
| "how to do Z" | Cited (#3 source) | Cited | ✅ | CompetitorB: ✅ |
| ... |

## Citation Gap Analysis

What competitors' cited content has that yours doesn't:
- Specific performance benchmarks (CompetitorA cites "40% faster than...")
- Comparison tables (CompetitorB has detailed feature matrices)
- Direct answer paragraphs (CompetitorA leads sections with the conclusion)

## Findings

(structured findings with evidence and priority)
```

Confidence Scoring

- 90-100%: Verified via live AI search — content is/isn't cited (TEST category)
- 70-89%: Strong structural evidence — content has/lacks citation-worthy patterns
- 50-69%: Heuristic assessment of content quality signals
- Below 50%: Don't report
Phase 5: Remediate
1. Fix guidance (example)
```markdown
CITE-01: Landing page lacks specific, verifiable claims

Impact: AI systems skip vague marketing copy — your landing page is invisible to AI answers
Current: "We're the fastest platform for modern teams"
Fix: Add specific, citable claims:
- "Deploys complete in 47 seconds on average (based on 10,000 deployments in Q4 2025)"
- "Used by 2,300 companies including [notable names]"
- "Reduces CI/CD pipeline time by 62% compared to Jenkins (internal benchmark, Jan 2026)"

Principle: AI cites facts, not adjectives. Every claim should be verifiable.
```

```markdown
TECH-01: No llms.txt present

Impact: AI crawlers have no guidance on how to understand your site
Fix: Create /llms.txt at site root:

# [Your Product Name]

One-sentence description of what your product does.

## Docs

- Getting Started: How to set up and configure [Product]
- API Reference: Complete API documentation
- Guides: Step-by-step tutorials

## Key Pages
```
2. YAML regression tests
```yaml
- name: tech-01-llms-txt-present
  description: Verify llms.txt exists and is properly formatted
  severity: high
  standard: llms-txt-standard
  steps:
    - URL: /llms.txt
    - VERIFY: The page loads successfully and contains structured information about the site
    - CODE: |
        const content = await page.textContent('body');
        if (!content || content.trim().length < 50) {
          throw new Error('llms.txt is missing or too short');
        }
        if (!content.includes('#')) {
          throw new Error('llms.txt should use markdown heading structure');
        }
        console.log(`llms.txt found (${content.length} chars)`);

- name: tech-06-ai-crawlers-allowed
  description: Verify robots.txt allows AI search crawlers
  severity: high
  standard: AI-Discoverability
  steps:
    - URL: /robots.txt
    - CODE: |
        const content = await page.textContent('body');
        const blockedBots = ['GPTBot', 'ClaudeBot', 'PerplexityBot', 'Google-Extended'];
        const blocked = blockedBots.filter(bot => {
          const pattern = new RegExp(`User-agent:\\s*${bot}[\\s\\S]*?Disallow:\\s*/`, 'i');
          return pattern.test(content);
        });
        if (blocked.length > 0) {
          throw new Error(`AI crawlers blocked in robots.txt: ${blocked.join(', ')}`);
        }
        console.log('All major AI crawlers are allowed');
    - VERIFY: robots.txt does not block major AI search engine crawlers

- name: cite-01-specific-claims-present
  description: Verify key pages contain specific, citable claims with data
  severity: high
  standard: GEO-Citation-Worthiness
  steps:
    - URL: /
    - CODE: |
        const text = await page.textContent('main') || await page.textContent('body');
        // Check for specific numbers/statistics
        const hasNumbers = /\d+[%xX]|\$[\d,.]+|\d{1,3}(,\d{3})+|\d+\s*(users|customers|companies|teams|downloads)/i.test(text);
        if (!hasNumbers) {
          throw new Error('Landing page lacks specific statistics or data points that AI can cite');
        }
        console.log('Found specific, citable claims with data');
    - VERIFY: Landing page contains specific statistics, benchmarks, or verifiable data points
```

Save all YAML tests to `shiplight/tests/geo-review.test.yaml`.

Depth Levels
- `--quick`: llms.txt check + robots.txt AI crawler check + landing page claim analysis. ~2 minutes.
- default: All content categories + 3 target query tests in Perplexity. ~10-15 minutes.
- `--thorough`: All categories + full AI citation testing across multiple engines + competitor citation analysis + content gap recommendations. ~25-40 minutes.
Tips
- The TEST category (live AI search testing) is the most valuable — it shows ground truth, not theory
- Perplexity is the best testing ground because it always shows citations
- llms.txt is emerging but increasingly adopted — it's low effort, high signal
- AI systems update their knowledge at different speeds — changes may take weeks to reflect in citations
- Focus on content that answers specific questions, not brand awareness content
- The #1 GEO principle: AI cites facts, not adjectives — replace every vague claim with a specific one
- Close the session with `close_session` and use `generate_html_report` for evidence