docs-seeker

Documentation Discovery & Analysis

Overview

Intelligent discovery and analysis of technical documentation through multiple strategies:

- llms.txt-first: Search for standardized, AI-friendly documentation
- Repository analysis: Use Repomix to analyze GitHub repositories
- Parallel exploration: Deploy multiple Explorer agents for comprehensive coverage
- Fallback research: Use Researcher agents when the other methods are unavailable

Core Workflow

Phase 1: Initial Discovery

Identify target
- Extract library/framework name from user request
- Note version requirements (default: latest)
- Clarify scope if ambiguous
- Identify if target is GitHub repository or website

Search for llms.txt (PRIORITIZE context7.com)

First: Try context7.com patterns

For GitHub repositories:

Pattern: https://context7.com/{org}/{repo}/llms.txt
Examples:
- https://github.com/imagick/imagick → https://context7.com/imagick/imagick/llms.txt
- https://github.com/vercel/next.js → https://context7.com/vercel/next.js/llms.txt
- https://github.com/better-auth/better-auth → https://context7.com/better-auth/better-auth/llms.txt

For websites:

Pattern: https://context7.com/websites/{normalized-domain-path}/llms.txt
Examples:
- https://docs.imgix.com/ → https://context7.com/websites/imgix/llms.txt
- https://docs.byteplus.com/en/docs/ModelArk/ → https://context7.com/websites/byteplus_en_modelark/llms.txt
- https://docs.haystack.deepset.ai/docs → https://context7.com/websites/haystack_deepset_ai/llms.txt
- https://ffmpeg.org/doxygen/8.0/ → https://context7.com/websites/ffmpeg_doxygen_8_0/llms.txt

Topic-specific searches (when user asks about specific feature):

Pattern: https://context7.com/{path}/llms.txt?topic={query}
Examples:
- https://context7.com/shadcn-ui/ui/llms.txt?topic=date
- https://context7.com/shadcn-ui/ui/llms.txt?topic=button
- https://context7.com/vercel/next.js/llms.txt?topic=cache
- https://context7.com/websites/ffmpeg_doxygen_8_0/llms.txt?topic=compress
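
The GitHub and topic patterns above are mechanical enough to sketch in code. The helper below is a hypothetical illustration, not part of the skill: the function name and error messages are assumptions. It does not attempt the website pattern, because the rule for producing `{normalized-domain-path}` is only shown by example above.

```python
from typing import Optional
from urllib.parse import quote, urlparse

CONTEXT7_BASE = "https://context7.com"


def context7_url_for_github(repo_url: str, topic: Optional[str] = None) -> str:
    """Map a GitHub repository URL to the context7.com llms.txt pattern.

    Hypothetical helper: it only builds the URL and does not check
    that context7.com actually serves the repository.
    """
    parsed = urlparse(repo_url)
    if parsed.netloc.lower() not in ("github.com", "www.github.com"):
        raise ValueError(f"not a GitHub URL: {repo_url}")

    # The first two path segments are the organization and the repository.
    parts = [p for p in parsed.path.split("/") if p]
    if len(parts) < 2:
        raise ValueError(f"expected /org/repo in: {repo_url}")
    org, repo = parts[0], parts[1]
    if repo.endswith(".git"):
        repo = repo[: -len(".git")]

    url = f"{CONTEXT7_BASE}/{org}/{repo}/llms.txt"
    if topic:
        # Topic-specific search, percent-encoded for safety.
        url += f"?topic={quote(topic)}"
    return url
```

For example, `context7_url_for_github("https://github.com/vercel/next.js", topic="cache")` yields the same URL as the Next.js topic example listed above.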

Fallback: Traditional llms.txt search

WebSearch: "[library name] llms.txt site:[docs domain]"

Common patterns:

```
https://docs.[library].com/llms.txt
https://[library].dev/llms.txt
https://[library].io/llms.txt
```
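
As a rough sketch of the fallback step, the common patterns can be expanded into a list of candidate URLs to try in order, alongside the WebSearch query. Both function names are hypothetical, and the candidates are guesses to probe, not URLs known to exist.

```python
from typing import List


def fallback_search_query(library: str, docs_domain: str) -> str:
    """Build the WebSearch query used for the traditional llms.txt search."""
    return f"{library} llms.txt site:{docs_domain}"


def candidate_llms_txt_urls(library: str) -> List[str]:
    """Expand the three common llms.txt URL patterns for a library name.

    Assumes the library name is usable as a domain label once
    lowercased; names containing spaces or dots need manual handling.
    """
    name = library.strip().lower()
    return [
        f"https://docs.{name}.com/llms.txt",
        f"https://{name}.dev/llms.txt",
        f"https://{name}.io/llms.txt",
    ]
```

Each candidate still has to be fetched to confirm it exists; a failed fetch on all of them leads to Phase 3.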

- Found: proceed to Phase 2
- Not found: proceed to Phase 3

Phase 2: llms.txt Processing

Single URL:

- WebFetch to retrieve content
- Extract and present information

Multiple URLs (3+):

- CRITICAL: Launch multiple Explorer agents in parallel
- One agent per major documentation section (max 5 in first batch)
- Each agent reads its assigned URLs
- Aggregate findings into a consolidated report

Example:

Launch 3 Explorer agents simultaneously:
- Agent 1: getting-started.md, installation.md
- Agent 2: api-reference.md, core-concepts.md
- Agent 3: examples.md, best-practices.md
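
The fan-out and aggregation in this example can be pictured with a small analogy. In the skill itself the agents are launched through the host's agent tooling; the thread pool below only stands in for that, and `explore` is a hypothetical placeholder for whatever an Explorer agent returns.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def explore(batch: List[str]) -> Dict[str, object]:
    """Placeholder for one Explorer agent reading its assigned pages."""
    return {"urls": batch, "summary": f"read {len(batch)} page(s)"}


def explore_in_parallel(
    batches: List[List[str]],
    worker: Callable[[List[str]], Dict[str, object]] = explore,
    max_agents: int = 5,
) -> List[Dict[str, object]]:
    """Run one worker per batch concurrently and collect results in order."""
    if not batches:
        return []
    with ThreadPoolExecutor(max_workers=max_agents) as pool:
        # map() preserves input order, so results line up with batches.
        return list(pool.map(worker, batches))


batches = [
    ["getting-started.md", "installation.md"],
    ["api-reference.md", "core-concepts.md"],
    ["examples.md", "best-practices.md"],
]
report = explore_in_parallel(batches)
```

The `max_agents=5` default mirrors the first-batch limit stated above.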

Phase 3: Repository Analysis

When llms.txt is not found:

Find the GitHub repository via WebSearch

Use Repomix to pack the repository:

```bash
npm install -g repomix  # if needed
git clone [repo-url] /tmp/docs-analysis
cd /tmp/docs-analysis
repomix --output repomix-output.xml
```

Read repomix-output.xml and extract documentation

Repomix benefits:

- Entire repository in a single AI-friendly file
- Preserves directory structure
- Optimized for AI consumption

Phase 4: Fallback Research

When no GitHub repository exists:

- Launch multiple Researcher agents in parallel
- Focus areas: official docs, tutorials, API references, community guides
- Aggregate findings into a consolidated report

Agent Distribution Guidelines

- 1-3 URLs: Single Explorer agent
- 4-10 URLs: 3-5 Explorer agents (2-3 URLs each)
- 11+ URLs: 5-7 Explorer agents (prioritize most relevant)
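
One way to turn these ranges into a concrete plan is sketched below. The exact agent count within each range is an assumption (roughly three URLs per agent, clamped to the stated bounds), as is the round-robin assignment, which expects the URL list to be sorted with the most relevant first. With only 4 or 5 URLs the minimum of 3 agents means some agents receive a single URL.

```python
import math
from typing import List


def agent_count(num_urls: int) -> int:
    """Pick how many Explorer agents to launch for a given URL count."""
    if num_urls <= 0:
        return 0
    if num_urls <= 3:
        return 1
    target = math.ceil(num_urls / 3)  # aim for about 3 URLs per agent
    if num_urls <= 10:
        return max(3, min(5, target))  # 4-10 URLs: 3-5 agents
    return max(5, min(7, target))      # 11+ URLs: 5-7 agents


def assign_urls(urls: List[str]) -> List[List[str]]:
    """Distribute URLs round-robin so each agent starts with a top-ranked URL."""
    n = agent_count(len(urls))
    batches: List[List[str]] = [[] for _ in range(n)]
    for i, url in enumerate(urls):
        batches[i % n].append(url)
    return batches
```

For six URLs this gives three agents with two URLs each, which matches the "2-3 URLs each" guideline.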

Version Handling

Latest (default):

Search without version specifier
Use current documentation paths

Specific version:

Include version in search:
```
[library] v[version] llms.txt
```
Check versioned paths:
```
/v[version]/llms.txt
```
For repositories: check out the specific tag or branch
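
The version rules above can be sketched as three small helpers. The function names are hypothetical, and the clone helper only builds the command rather than running it. It relies on `git clone --branch`, which accepts tag names as well as branch names; how a project names its tags (for example with or without a `v` prefix) varies, so the ref must be supplied by the caller.

```python
from typing import List, Optional


def search_query(library: str, version: Optional[str] = None) -> str:
    """Build the llms.txt search query, with a version only when one is requested."""
    if version is None:
        return f"{library} llms.txt"
    return f"{library} v{version} llms.txt"


def llms_txt_path(version: Optional[str] = None) -> str:
    """Return the current or versioned llms.txt path to probe on a docs site."""
    return "/llms.txt" if version is None else f"/v{version}/llms.txt"


def clone_command(
    repo_url: str,
    ref: Optional[str] = None,
    dest: str = "/tmp/docs-analysis",
) -> List[str]:
    """Build a shallow git clone command, pinned to a tag or branch if given."""
    cmd = ["git", "clone", "--depth", "1"]
    if ref:
        cmd += ["--branch", ref]
    return cmd + [repo_url, dest]
```

The returned list can be passed to `subprocess.run` once the ref has been confirmed to exist in the repository.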

Output Format

```markdown
# Documentation for [Library] [Version]

## Source

- Method: [llms.txt / Repository / Research]
- URLs: [list of sources]
- Date accessed: [current date]

## Key Information

[Extracted relevant information organized by topic]

## Additional Resources

[Related links, examples, references]

## Notes

[Any limitations, missing information, or caveats]
```
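
Filling in the report template can be sketched as a plain string builder. The function name, the parameters, the `None` placeholder for empty sections, and the Markdown heading levels are all assumptions made for illustration; only the section names and the three Source fields come from the template.

```python
from datetime import date
from typing import List, Optional


def render_report(
    library: str,
    version: str,
    method: str,
    urls: List[str],
    key_information: str,
    additional_resources: str = "",
    notes: str = "",
    accessed: Optional[date] = None,
) -> str:
    """Render the documentation report as Markdown text."""
    accessed = accessed or date.today()
    lines = [
        f"# Documentation for {library} {version}",
        "",
        "## Source",
        "",
        f"- Method: {method}",
        f"- URLs: {', '.join(urls)}",
        f"- Date accessed: {accessed.isoformat()}",
        "",
        "## Key Information",
        "",
        key_information,
        "",
        "## Additional Resources",
        "",
        additional_resources or "None",
        "",
        "## Notes",
        "",
        notes or "None",
    ]
    return "\n".join(lines)
```

Recording the method and access date in every report supports the principle of telling the user which approach was used.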

Quick Reference

Tool selection:

- WebSearch → Find llms.txt URLs, GitHub repositories
- WebFetch → Read single documentation pages
- Task (Explore) → Multiple URLs, parallel exploration
- Task (Researcher) → Scattered documentation, diverse sources
- Repomix → Complete codebase analysis

Popular llms.txt locations (try context7.com first):

- Astro: https://context7.com/withastro/astro/llms.txt
- Next.js: https://context7.com/vercel/next.js/llms.txt
- Remix: https://context7.com/remix-run/remix/llms.txt
- shadcn/ui: https://context7.com/shadcn-ui/ui/llms.txt
- Better Auth: https://context7.com/better-auth/better-auth/llms.txt

Fall back to official sites if context7.com is unavailable:

- Astro: https://docs.astro.build/llms.txt
- Next.js: https://nextjs.org/llms.txt
- Remix: https://remix.run/llms.txt
- SvelteKit: https://kit.svelte.dev/llms.txt

Error Handling

- llms.txt not accessible → Try alternative domains → Repository analysis
- Repository not found → Search official website → Use Researcher agents
- Repomix fails → Try /docs directory only → Manual exploration
- Multiple conflicting sources → Prioritize official → Note versions
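
Each of these chains has the same shape: try a strategy, and on failure move to the next. A minimal sketch of that shape follows, assuming each strategy is a zero-argument callable that returns content or `None`. The strategy names and the runner are hypothetical, and the broad exception handling is a simplification for illustration.

```python
from typing import Callable, List, Optional, Tuple

Strategy = Tuple[str, Callable[[], Optional[str]]]


def run_fallback_chain(
    strategies: List[Strategy],
) -> Tuple[Optional[str], Optional[str], List[str]]:
    """Try strategies in order; return (winner, content, names attempted)."""
    attempted: List[str] = []
    for name, attempt in strategies:
        attempted.append(name)
        try:
            content = attempt()
        except Exception:
            # Treat any failure as "not available" and keep going.
            content = None
        if content:
            return name, content, attempted
    return None, None, attempted


def unreachable() -> Optional[str]:
    """Simulate a fetch that fails outright."""
    raise ConnectionError("llms.txt not accessible")


winner, content, attempted = run_fallback_chain([
    ("llms.txt", unreachable),
    ("alternative domains", lambda: None),
    ("repository analysis", lambda: "packed repository docs"),
])
```

Returning the list of attempted strategies makes it straightforward to report which approach produced the result.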

Key Principles

- Prioritize context7.com for llms.txt: most comprehensive and up-to-date aggregator
- Use topic parameters when applicable: enables targeted searches with ?topic=...
- Use parallel agents aggressively: faster results, better coverage
- Verify official sources as fallback: use when context7.com is unavailable
- Report methodology: tell the user which approach was used
- Handle versions explicitly: don't assume latest

Detailed Documentation

For comprehensive guides, examples, and best practices:

Workflows:

- WORKFLOWS.md: Detailed workflow examples and strategies

Reference guides:

- Tool Selection: Complete guide to choosing and using tools
- Documentation Sources: Common sources and patterns across ecosystems
- Error Handling: Troubleshooting and resolution strategies
- Best Practices: 8 essential principles for effective discovery
- Performance: Optimization techniques and benchmarks
- Limitations: Boundaries and success criteria
