osgrep-reference
osgrep: Semantic Code Search
Prefer osgrep over grep/rg for conceptual code exploration—it finds code by meaning, not just string matching. For exact identifier or literal string searches, grep/rg remains appropriate.
Overview
osgrep is a natural-language semantic code search tool that finds code by concept rather than keyword matching. Unlike grep, which matches literal strings, osgrep understands code semantics using local AI embeddings.
Version 0.5.16 (Dec 2025) highlights:
- skeleton command: Compress files to function/class signatures (~85% token reduction)
- trace command: Show who calls/what calls for any symbol (call graph)
- symbols command: List all indexed symbols with definitions
- doctor command: Health/integrity verification
- list command: Display all indexed repositories
- Per-project .osgrep/ directories (no longer global ~/.osgrep/data)
- V2 architecture with improved performance (~20% token savings, ~30% speedup)
- Go language support
- --reset flag for clean re-indexing
- ColBERT reranking for better result relevance
- Role detection: distinguishes orchestration logic from type definitions
- Split searching: separate "Code" and "Docs" indices
When to use osgrep:
- Exploring unfamiliar codebases ("where is the auth logic?")
- Finding conceptual patterns ("show me error handling")
- Locating cross-cutting concerns ("all database migrations")
- User explicitly asks to search code semantically
When to use traditional tools:
- Searching for exact strings or identifiers (use Grep)
- Finding files by name pattern (use Glob)
- Already know the exact location (use Read)
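The split between osgrep and the traditional tools can be approximated with a small heuristic. The sketch below is an illustration, not part of osgrep: the function name pick_search_tool is hypothetical, and treating any query without a space as an exact identifier is an assumption.

```bash
# pick_search_tool QUERY
# Prints "osgrep" for multi-word, natural-language queries and "grep" for
# queries that look like a single identifier or literal string.
# Heuristic only: the space test is an assumption, not osgrep behavior.
pick_search_tool() {
  case "$1" in
    *" "*) echo "osgrep" ;;  # several words: likely a concept
    *)     echo "grep" ;;    # one token: likely an exact name
  esac
}
```

For example, pick_search_tool "useState" prints grep, while pick_search_tool "how does the server start" prints osgrep.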
Quick Start
Run osgrep from within the project directory, since it uses per-project .osgrep/ indexes.
bash
cd /path/to/project # REQUIRED: cd into the project first
osgrep "your query" # Now search works
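When a shell is somewhere below the project root, the directory holding the index can be located by walking upward. This is a minimal sketch, assuming the .osgrep/ directory sits at the project root; find_osgrep_root is a hypothetical helper, not an osgrep command.

```bash
# find_osgrep_root [START_DIR]
# Walks upward from START_DIR (default: current directory) and prints the
# first directory that contains a .osgrep/ folder. Returns 1 if none exists.
# Caveat: a leftover global ~/.osgrep from v0.4.x would also match.
find_osgrep_root() {
  local dir
  dir=$(CDPATH= cd -- "${1:-.}" && pwd) || return 1
  while [ -n "$dir" ]; do
    if [ -d "$dir/.osgrep" ]; then
      printf '%s\n' "$dir"
      return 0
    fi
    [ "$dir" = "/" ] && return 1   # reached the filesystem root
    dir=$(dirname "$dir")
  done
  return 1
}
```

A possible use is cd "$(find_osgrep_root)" before running a search.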
Basic Search
bash
osgrep "your semantic query"
osgrep search "your query" path/to/scope # Scope to subdirectory
osgrep skeleton src/file.py # Compress file to signatures
osgrep trace functionName # Show call graph
osgrep symbols # List all symbols
Examples:
bash
osgrep "user registration flow"
osgrep "webhook signature validation"
osgrep "database transaction handling"
osgrep "how are plugins loaded" packages/src
Output Format
Returns results in this format:
IMPLEMENTATION path/to/file:line
Score: 0.95
Preamble:
[code snippet or content preview]
...
- IMPLEMENTATION: Tag indicating the type of match
- Score: Relevance score (0-1, higher is better)
- ...: Truncation marker. The snippet is incomplete; use Read for full context
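The result format lends itself to light post-processing. The following sketch reduces each result to one tab-separated line (score, tag, location). It assumes the layout shown above holds exactly: a header line of an uppercase tag followed by path:line, then a "Score:" line, with ANSI colors disabled and no spaces in paths. parse_osgrep_results is a hypothetical helper.

```bash
# parse_osgrep_results: reads osgrep-style output on stdin and prints
# "score<TAB>tag<TAB>path:line" for every result that has a Score line.
parse_osgrep_results() {
  awk '
    # Header line, assumed to be "<UPPERCASE_TAG> <path>:<line>".
    /^[A-Z_]+ [^ ]+:[0-9]+$/ { tag = $1; loc = $2; next }
    # Score line that belongs to the most recent header.
    /^Score: / && loc != "" {
      printf "%s\t%s\t%s\n", $2, tag, loc
      loc = ""
    }
  '
}
```

Piping the output through sort -rn would order results by score. Snippet lines that happen to resemble a header would be misread, so this is best treated as a convenience rather than a robust parser.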
Search Strategy
For Architectural/System-Level Questions
Use for: auth, integrations, file watching, cross-cutting concerns
- Search broadly first to map the landscape:
bash
osgrep "authentication authorization checks"
- Survey the results and look for patterns across multiple files:
  - Are checks in middleware? Decorators? Multiple services?
  - Do file paths suggest different layers (gateway, handlers, utils)?
- Read strategically by picking 2-4 files that represent different aspects:
  - Read the main entry point
  - Read representative middleware/util files
  - Follow imports if architecture is unclear
- Refine with specific searches if one aspect is unclear:
bash
osgrep "session validation logic"
osgrep "API authentication middleware"
For Targeted Implementation Details
Use for: specific function, algorithm, single feature
- Search specifically about the precise logic:
bash
osgrep "logic for merging user and default configuration"
- Evaluate the semantic match:
  - Does the snippet look relevant?
  - If it ends in ... or cuts off mid-logic, read the file
- One search, one read: Use osgrep to pinpoint the best file, then read it fully.
CLI Reference
Search Options
Control result count:
bash
osgrep "validation logic" -m 20 # Max 20 results total (default: 10)
osgrep "validation logic" --per-file 3 # Up to 3 matches per file (default: 1)
Output formats:
bash
osgrep "API endpoints" --compact # File paths only
osgrep "API endpoints" --content # Full chunk content (not just snippets)
osgrep "API endpoints" --scores # Show relevance scores
osgrep "API endpoints" --plain # Disable ANSI colors
Sync before search:
bash
osgrep "validation logic" -s # Sync files to index before searching
osgrep "validation logic" -d # Dry run (show what would sync)
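For scripting, the options above can be wrapped in a small helper that yields a clean list of files. This is a sketch under assumptions: that --compact, --plain and -m may be combined, and that --compact prints one path per line. osgrep_files is a hypothetical name.

```bash
# osgrep_files QUERY [MAX_RESULTS]
# Prints each matching file path once, keeping the order osgrep returned.
# Blank lines are dropped; MAX_RESULTS defaults to 10, the documented default.
osgrep_files() {
  osgrep "$1" --compact --plain -m "${2:-10}" | awk 'NF && !seen[$0]++'
}
```

The deduplication step only matters if the same file can appear more than once, for instance when --per-file is raised.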
Index Management
bash
osgrep index # Incremental update
osgrep index -r # Full re-index from scratch (--reset)
osgrep index -p /path/to/repo # Index a specific directory
osgrep index -d # Preview what would be indexed (--dry-run)
Advanced Commands (v0.5+)
Skeleton - Compress files to signatures:
bash
osgrep skeleton src/server.py # Show function/class signatures only
osgrep skeleton src/server.py --no-summary # Omit call/complexity summaries
osgrep skeleton "auth logic" -l 5 # Query mode: skeleton of top 5 matching files
Output shows: function signatures with "# → calls | C:N | ORCH" summaries inside bodies.
Trace - Show call graph:
bash
osgrep trace handleRequest # Who calls this? What does it call?
Symbols - List all indexed symbols:
bash
osgrep symbols # All symbols (default limit: 20)
osgrep symbols "Request" # Filter by pattern
osgrep symbols -p src/api/ -l 50 # Filter by path, increase limit
Other Commands
bash
osgrep list # Show all indexed repositories
osgrep doctor # Check health and configuration
osgrep setup # Pre-download models (~150MB)
osgrep serve # Run background daemon (port 4444)
osgrep serve -p 8080 # Custom port (or OSGREP_PORT=8080)
osgrep serve -b # Run in background (--background)
osgrep serve status # Check if daemon is running
osgrep serve stop # Stop daemon
osgrep serve stop --all # Stop all daemons
Serve endpoints:
- GET /health - Health check
- POST /search - Search with { query, limit, path, rerank }
- Lock file: .osgrep/server.json with port/pid
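A request to the search endpoint might look like the sketch below. Only the endpoint path, the field names, and the default port come from this reference; the rest is assumed: that the daemon listens on 127.0.0.1, accepts a JSON body with Content-Type application/json, and takes limit as an integer. Both function names are hypothetical, and the response is printed unparsed because its shape is not documented here.

```bash
# osgrep_search_payload QUERY [LIMIT]
# Prints a JSON body for POST /search. Only backslashes and double quotes
# are escaped, so queries with newlines or control characters are not handled.
osgrep_search_payload() {
  local q
  q=$(printf '%s' "$1" | sed 's/\\/\\\\/g; s/"/\\"/g')
  printf '{"query":"%s","limit":%d}\n' "$q" "${2:-10}"
}

# osgrep_http_search QUERY [LIMIT]
# Sends the payload to a local daemon and prints the raw response.
# Reusing OSGREP_PORT on the client side is this sketch's own choice.
osgrep_http_search() {
  curl -sS -X POST \
    -H 'Content-Type: application/json' \
    -d "$(osgrep_search_payload "$@")" \
    "http://127.0.0.1:${OSGREP_PORT:-4444}/search"
}
```

With a daemon started via osgrep serve, osgrep_http_search "rate limiting logic" 5 would issue the request.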
Claude Code Integration
bash
osgrep install-claude-code # Install as Claude Code plugin
osgrep install-opencode # Install for Opencode
Both plugins automatically manage the background server lifecycle during sessions.
Common Search Patterns
Architecture Exploration
bash
# Mental processes (Open Souls / Daimonic)
osgrep "mental processes that orchestrate conversation flow"
osgrep "subprocesses that learn about the user"
osgrep "cognitive steps using structured output"

# React/Next.js
osgrep "where do we fetch data in components?"
osgrep "custom hooks for API calls"
osgrep "protected route implementation"

# Backend
osgrep "request validation middleware"
osgrep "authentication flow"
osgrep "rate limiting logic"
Business Logic
bash
osgrep "payment processing"
osgrep "notification sending"
osgrep "user permission checks"
osgrep "order fulfillment workflow"
Cross-Cutting Concerns
bash
osgrep "error handling patterns"
osgrep "logging configuration"
osgrep "database migrations"
osgrep "environment variable usage"
Tips for Effective Queries
Trust the Semantics
You don't need exact names. Conceptual queries work better:
bash
# Good - conceptual
osgrep "how does the server start"
osgrep "component state management"

# Less effective - too literal
osgrep "server.init"
osgrep "useState"
Be Specific
bash
# Too vague
osgrep "code"

# Clear intent
osgrep "user registration validation logic"
Use Natural Language
bash
osgrep "how do we handle payment failures?"
osgrep "what happens when a webhook arrives?"
osgrep "where is user input sanitized?"
Watch for Distributed Patterns
If results span 5+ files in different directories, the feature is likely architectural—survey before diving deep.
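The rule of thumb above can be checked mechanically on a list of result paths. In this sketch the 5-file threshold comes from the guideline, while requiring at least two distinct directories is an interpretation; classify_spread is a hypothetical helper that expects one path per line on stdin.

```bash
# classify_spread: reads file paths on stdin and reports whether the hits
# look architectural (spread out) or targeted (concentrated).
classify_spread() {
  awk '
    NF && !seen[$0]++ {
      files++
      dir = $0
      if (!sub(/\/[^\/]*$/, "", dir)) dir = "."   # no slash: current directory
      if (!(dir in dirs)) { dirs[dir] = 1; ndirs++ }
    }
    END {
      kind = (files >= 5 && ndirs >= 2) ? "architectural" : "targeted"
      printf "%s: %d files in %d directories\n", kind, files, ndirs
    }
  '
}
```

Feeding it five paths from four directories would print "architectural: 5 files in 4 directories", a cue to survey before reading any single file in depth.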
Don't Over-Rely on Snippets
For architectural questions, snippets are signposts, not answers. Read the key files.
Technical Details
- 100% Local: Uses transformers.js embeddings (no remote API calls)
- Auto-Isolated: Each repo gets its own index in a .osgrep/ directory (v0.5+)
- Adaptive Performance: Bounded concurrency keeps system responsive
- Index Location: .osgrep/ in project root (was ~/.osgrep/data/ in v0.4.x)
- Model Download: ~150MB on first run (osgrep setup to pre-download)
- Chunking Strategy: Tree-sitter parses code into function/class boundaries
- Deduplication: Identical code blocks are deduplicated
- Dual Channels: Separate "Code" and "Docs" indices with ColBERT reranking
- Structural Boosting: Functions/classes prioritized over test files
- Skeleton Compression: ~85% token reduction when viewing file structure
Troubleshooting
"Still Indexing..." message:
- Indexing is still in progress. Results will be partial until it completes.
- Alert the user and ask if they wish to proceed.
Slow first search:
- Expected: indexing takes 30-60s for medium repos
- Use osgrep setup to pre-download models
Index out of date:
- Run osgrep index to refresh
- Run osgrep index --reset for a complete re-index
- osgrep usually auto-detects changes
Installation issues:
bash
osgrep doctor # Diagnose problems
npm install -g osgrep # Reinstall if needed
No results found:
- Try broader queries ("authentication" vs "JWT middleware")
- Ensure index is up to date (osgrep index)
- Verify you're in the correct repository directory
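The "Still Indexing..." case can be surfaced automatically so partial results are not mistaken for complete ones. This is a minimal sketch, assuming the notice appears verbatim in the command's output; osgrep_checked is a hypothetical wrapper, not an osgrep feature.

```bash
# osgrep_checked ARGS...
# Runs osgrep with the given arguments, prints its output unchanged, and
# writes a warning to stderr when the "Still Indexing" notice is present.
osgrep_checked() {
  local out
  out=$(osgrep "$@" 2>&1)
  if printf '%s\n' "$out" | grep -q 'Still Indexing'; then
    echo "warning: index incomplete, results may be partial" >&2
  fi
  printf '%s\n' "$out"
}
```

Because the warning goes to stderr, the search results on stdout stay usable in pipelines.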