docs-seeker

Documentation Discovery & Analysis

Overview

Intelligent discovery and analysis of technical documentation through multiple strategies:
  1. llms.txt-first: Search for standardized AI-friendly documentation
  2. Repository analysis: Use Repomix to analyze GitHub repositories
  3. Parallel exploration: Deploy multiple Explorer agents for comprehensive coverage
  4. Fallback research: Use Researcher agents when the other methods are unavailable

Core Workflow

Phase 1: Initial Discovery

  1. Identify target
    • Extract library/framework name from user request
    • Note version requirements (default: latest)
    • Clarify scope if ambiguous
    • Identify if target is GitHub repository or website
  2. Search for llms.txt (PRIORITIZE context7.com)
    First: Try context7.com patterns
    For GitHub repositories:
    Pattern: https://context7.com/{org}/{repo}/llms.txt
    Examples:
    - https://github.com/imagick/imagick → https://context7.com/imagick/imagick/llms.txt
    - https://github.com/vercel/next.js → https://context7.com/vercel/next.js/llms.txt
    - https://github.com/better-auth/better-auth → https://context7.com/better-auth/better-auth/llms.txt
    For websites:
    Pattern: https://context7.com/websites/{normalized-domain-path}/llms.txt
    Examples:
    - https://docs.imgix.com/ → https://context7.com/websites/imgix/llms.txt
    - https://docs.byteplus.com/en/docs/ModelArk/ → https://context7.com/websites/byteplus_en_modelark/llms.txt
    - https://docs.haystack.deepset.ai/docs → https://context7.com/websites/haystack_deepset_ai/llms.txt
    - https://ffmpeg.org/doxygen/8.0/ → https://context7.com/websites/ffmpeg_doxygen_8_0/llms.txt
    Topic-specific searches (when user asks about specific feature):
    Pattern: https://context7.com/{path}/llms.txt?topic={query}
    Examples:
    - https://context7.com/shadcn-ui/ui/llms.txt?topic=date
    - https://context7.com/shadcn-ui/ui/llms.txt?topic=button
    - https://context7.com/vercel/next.js/llms.txt?topic=cache
    - https://context7.com/websites/ffmpeg_doxygen_8_0/llms.txt?topic=compress
    Fallback: Traditional llms.txt search
    WebSearch: "[library name] llms.txt site:[docs domain]"
    Common patterns:
    • https://docs.[library].com/llms.txt
    • https://[library].dev/llms.txt
    • https://[library].io/llms.txt
    → Found? Proceed to Phase 2 → Not found? Proceed to Phase 3
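
The URL patterns above can be sketched in Python. This is a minimal illustration, not an official context7.com client: the function names `context7_repo_url` and `fallback_candidates` are hypothetical, and only the GitHub pattern, the `?topic=` parameter, and the three common fallback patterns are covered. The website pattern is left out because the source gives examples of the normalized domain path but no rule for deriving it.

```python
from urllib.parse import urlencode, urlparse

CONTEXT7_BASE = "https://context7.com"


def context7_repo_url(github_url, topic=None):
    """Build the context7.com llms.txt URL for a GitHub repository URL.

    Hypothetical helper: follows the {org}/{repo}/llms.txt pattern and
    optionally appends the ?topic= query parameter.
    """
    parsed = urlparse(github_url)
    if parsed.netloc.lower() not in ("github.com", "www.github.com"):
        raise ValueError(f"not a GitHub URL: {github_url}")
    parts = [p for p in parsed.path.split("/") if p]
    if len(parts) < 2:
        raise ValueError(f"expected /org/repo in: {github_url}")
    org, repo = parts[0], parts[1]
    url = f"{CONTEXT7_BASE}/{org}/{repo}/llms.txt"
    if topic:
        url += "?" + urlencode({"topic": topic})
    return url


def fallback_candidates(library):
    """List the common llms.txt locations to try when context7.com has nothing."""
    return [
        f"https://docs.{library}.com/llms.txt",
        f"https://{library}.dev/llms.txt",
        f"https://{library}.io/llms.txt",
    ]
```

For example, `context7_repo_url("https://github.com/vercel/next.js", topic="cache")` yields the same URL as the topic example listed above.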

Phase 2: llms.txt Processing

Single URL:
  • WebFetch to retrieve content
  • Extract and present information
Multiple URLs (3+):
  • CRITICAL: Launch multiple Explorer agents in parallel
  • One agent per major documentation section (max 5 in first batch)
  • Each agent reads assigned URLs
  • Aggregate findings into consolidated report
Example:
Launch 3 Explorer agents simultaneously:
- Agent 1: getting-started.md, installation.md
- Agent 2: api-reference.md, core-concepts.md
- Agent 3: examples.md, best-practices.md
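
As a rough analogy for the parallel launch above, the sketch below uses a thread pool as a stand-in for Explorer agents. Explorer agents are not a Python API; `explore_in_parallel` and the `read_section` callback are hypothetical names, and the cap of 5 mirrors the "max 5 in first batch" rule.

```python
from concurrent.futures import ThreadPoolExecutor

MAX_FIRST_BATCH = 5  # "max 5 in first batch" from the workflow above


def explore_in_parallel(assignments, read_section):
    """Run one worker per documentation section and collect the findings.

    assignments: dict mapping an agent label to the list of URLs it reads.
    read_section: callable taking a list of URLs and returning its findings
                  (a stand-in for whatever an Explorer agent would report).
    Only the first MAX_FIRST_BATCH assignments are started.
    """
    first_batch = dict(list(assignments.items())[:MAX_FIRST_BATCH])
    with ThreadPoolExecutor(max_workers=len(first_batch) or 1) as pool:
        futures = {
            label: pool.submit(read_section, urls)
            for label, urls in first_batch.items()
        }
        # Aggregate the per-agent findings into one consolidated mapping.
        return {label: future.result() for label, future in futures.items()}


assignments = {
    "Agent 1": ["getting-started.md", "installation.md"],
    "Agent 2": ["api-reference.md", "core-concepts.md"],
    "Agent 3": ["examples.md", "best-practices.md"],
}
report = explore_in_parallel(assignments, lambda urls: f"read {len(urls)} pages")
```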

Phase 3: Repository Analysis

When llms.txt is not found:
  1. Find GitHub repository via WebSearch
  2. Use Repomix to pack repository:
    bash
    npm install -g repomix  # if needed
    git clone [repo-url] /tmp/docs-analysis
    cd /tmp/docs-analysis
    repomix --output repomix-output.xml
  3. Read repomix-output.xml and extract documentation
Repomix benefits:
  • Entire repository in single AI-friendly file
  • Preserves directory structure
  • Optimized for AI consumption
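
The same clone-and-pack steps could be scripted. This is a sketch under assumptions: it uses only the `git clone` and `repomix --output` invocations shown above, and the helper names `repomix_commands` and `pack_repository` are hypothetical.

```python
import subprocess


def repomix_commands(repo_url, workdir="/tmp/docs-analysis",
                     output="repomix-output.xml"):
    """Return the (argv, cwd) pairs for cloning a repository and packing it."""
    return [
        (["git", "clone", repo_url, workdir], None),
        (["repomix", "--output", output], workdir),
    ]


def pack_repository(repo_url, workdir="/tmp/docs-analysis",
                    output="repomix-output.xml"):
    """Clone and pack the repository, returning the path of the packed file.

    Assumes git and repomix are already installed and on PATH.
    Raises subprocess.CalledProcessError if either command fails.
    """
    for argv, cwd in repomix_commands(repo_url, workdir, output):
        subprocess.run(argv, cwd=cwd, check=True)
    return f"{workdir}/{output}"
```

Keeping command construction separate from execution makes it possible to inspect or log the commands before anything runs.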

Phase 4: Fallback Research

When no GitHub repository exists:
  • Launch multiple Researcher agents in parallel
  • Focus areas: official docs, tutorials, API references, community guides
  • Aggregate findings into consolidated report

Agent Distribution Guidelines

  • 1-3 URLs: Single Explorer agent
  • 4-10 URLs: 3-5 Explorer agents (2-3 URLs each)
  • 11+ URLs: 5-7 Explorer agents (prioritize most relevant)
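
One way to encode these thresholds is sketched below. The function names are hypothetical, and the round-robin split is an assumption: the guidelines give agent counts but do not say how URLs are divided among agents.

```python
def explorer_agent_range(url_count):
    """Return the (min, max) number of Explorer agents for a URL count."""
    if url_count < 1:
        raise ValueError("need at least one URL")
    if url_count <= 3:
        return (1, 1)
    if url_count <= 10:
        return (3, 5)
    return (5, 7)


def assign_urls(urls, agents):
    """Split URLs across agents round-robin, dropping any empty groups."""
    groups = [urls[i::agents] for i in range(agents)]
    return [group for group in groups if group]


urls = [f"page-{n}.md" for n in range(1, 7)]
low, high = explorer_agent_range(len(urls))
groups = assign_urls(urls, low)
```

With six URLs this picks the lower bound of three agents, so each agent reads two URLs.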

Version Handling

Latest (default):
  • Search without version specifier
  • Use current documentation paths
Specific version:
  • Include version in search:
    [library] v[version] llms.txt
  • Check versioned paths:
    /v[version]/llms.txt
  • For repositories: checkout specific tag/branch
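
The version rules above might look like this in code. This is a sketch with hypothetical helper names. It assumes the version is passed without a leading "v" and that the tag or branch name is known in advance, since tag naming varies by project. `git clone --branch` accepts tag names as well as branch names.

```python
def llms_search_query(library, version=None):
    """Build the search query, adding the version only when one is requested."""
    if version is None:
        return f"{library} llms.txt"
    return f"{library} v{version} llms.txt"


def versioned_llms_path(version=None):
    """Return the llms.txt path to check on a docs site."""
    if version is None:
        return "/llms.txt"
    return f"/v{version}/llms.txt"


def clone_command(repo_url, ref=None, workdir="/tmp/docs-analysis"):
    """Build a git clone command, checking out a tag or branch if given."""
    argv = ["git", "clone"]
    if ref is not None:
        argv += ["--branch", ref]
    return argv + [repo_url, workdir]
```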

Output Format

Documentation for [Library] [Version]

Source

  • Method: [llms.txt / Repository / Research]
  • URLs: [list of sources]
  • Date accessed: [current date]

Key Information

[Extracted relevant information organized by topic]

Additional Resources

[Related links, examples, references]

Notes

[Any limitations, missing information, or caveats]

Quick Reference

Tool selection:
  • WebSearch → Find llms.txt URLs, GitHub repositories
  • WebFetch → Read single documentation pages
  • Task (Explore) → Multiple URLs, parallel exploration
  • Task (Researcher) → Scattered documentation, diverse sources
  • Repomix → Complete codebase analysis
Popular llms.txt locations (try context7.com first):
Fallback to official sites if context7.com unavailable:

Error Handling

  • llms.txt not accessible → Try alternative domains → Repository analysis
  • Repository not found → Search official website → Use Researcher agents
  • Repomix fails → Try /docs directory only → Manual exploration
  • Multiple conflicting sources → Prioritize official → Note versions
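
Each chain above is an ordered list of attempts. A generic way to express that is sketched below; `first_successful` is a hypothetical helper, and the strategies passed to it are placeholders for real lookups.

```python
def first_successful(strategies):
    """Try each (name, attempt) pair in order and stop at the first result.

    Returns (name, result, errors). name and result are None when every
    strategy fails. errors records why each earlier strategy was skipped,
    which supports reporting the methodology to the user.
    """
    errors = []
    for name, attempt in strategies:
        try:
            result = attempt()
        except Exception as exc:  # any failure moves on to the next fallback
            errors.append(f"{name}: {exc}")
            continue
        if result:
            return name, result, errors
        errors.append(f"{name}: no result")
    return None, None, errors


def unreachable():
    # Placeholder for a lookup that fails, e.g. llms.txt not accessible.
    raise ConnectionError("llms.txt not accessible")


method, result, errors = first_successful([
    ("llms.txt", unreachable),
    ("alternative domains", lambda: None),
    ("repository analysis", lambda: "packed repository contents"),
])
```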

Key Principles

  1. Prioritize context7.com for llms.txt: Most comprehensive and up-to-date aggregator
  2. Use topic parameters when applicable: Enables targeted searches with ?topic=...
  3. Use parallel agents aggressively: Faster results, better coverage
  4. Fall back to official sources: Use them when context7.com is unavailable
  5. Report methodology: Tell the user which approach was used
  6. Handle versions explicitly: Don't assume latest

Detailed Documentation

For comprehensive guides, examples, and best practices:
Workflows:
  • WORKFLOWS.md — Detailed workflow examples and strategies
Reference guides:
  • Tool Selection — Complete guide to choosing and using tools
  • Documentation Sources — Common sources and patterns across ecosystems
  • Error Handling — Troubleshooting and resolution strategies
  • Best Practices — 8 essential principles for effective discovery
  • Performance — Optimization techniques and benchmarks
  • Limitations — Boundaries and success criteria