indexion-grep
indexion grep
KGF-aware token pattern search, structural queries, and vector similarity search.
When to Use
- User asks to find specific code patterns (e.g. "nested for loops", "pub fn without docs")
- User asks "where is this function used?" or "find all pub structs"
- User wants structural search — not regex on raw text, but token-level matching
- User asks for semantic code search ("find functions that parse configuration")
- User wants to find proxy functions, long functions, or functions with many params
- Replaces manual grep/ripgrep for code-aware searches
- Use instead of Explore agent for targeted codebase queries
Token Pattern Search
Patterns are space-separated token matchers. KGF aliases resolve automatically
(e.g. `pub` → `KW_pub`), so you write natural code keywords:

```bash
# Find all pub fn declarations
indexion grep "pub fn *" src/

# Find pub struct definitions
indexion grep "pub struct *" src/

# Nested for loops (O(n²) candidates)
indexion grep "for ... for" src/

# Functions named "sort"
indexion grep "fn Ident:sort" src/

# Any token except pub, followed by fn
indexion grep "!pub fn" src/

# Using raw KGF token kinds also works
indexion grep "KW_pub KW_fn Ident" src/
```
Pattern Syntax

| Pattern | Meaning |
|---|---|
| `pub` | Match keyword (auto-alias → `KW_pub`) |
| `KW_pub` | Match token kind exactly |
| `Ident:sort` | Match kind and text |
| `*` | Match any single token |
| `...` | Match zero or more tokens (non-greedy) |
| `!pub` | Negation: any token except this kind |
| | Punctuation aliases |

Aliases are auto-generated from KGF specs, not hardcoded. They work for all
KGF-supported languages.
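The matching rules in the table can be modeled in a few lines. The following is a minimal Python sketch, not indexion's implementation: the `(kind, text)` token shape, the `ALIASES` table, and every function name are assumptions made for illustration.

```python
from typing import List, Optional, Tuple

Token = Tuple[str, str]  # (kind, text), e.g. ("KW_pub", "pub") -- assumed shape

# Hypothetical alias table; per the text, indexion derives aliases from KGF specs.
ALIASES = {"pub": "KW_pub", "fn": "KW_fn", "for": "KW_for"}


def matches(matcher: str, token: Token) -> bool:
    """Test one single-token matcher (anything except `...`) against a token."""
    kind, text = token
    if matcher == "*":
        return True  # any single token
    if matcher.startswith("!"):
        return not matches(matcher[1:], token)  # negation
    want_kind, _, want_text = matcher.partition(":")
    want_kind = ALIASES.get(want_kind, want_kind)  # keyword alias -> token kind
    return kind == want_kind and (want_text == "" or text == want_text)


def match_at(pattern: List[str], tokens: List[Token], start: int) -> Optional[int]:
    """Return the end index of a match beginning at `start`, or None."""
    if not pattern:
        return start
    head, rest = pattern[0], pattern[1:]
    if head == "...":
        # Non-greedy: try skipping the fewest tokens first.
        for skip in range(len(tokens) - start + 1):
            end = match_at(rest, tokens, start + skip)
            if end is not None:
                return end
        return None
    if start < len(tokens) and matches(head, tokens[start]):
        return match_at(rest, tokens, start + 1)
    return None


def search(pattern: str, tokens: List[Token]) -> List[Tuple[int, int]]:
    """Return (start, end) spans for every start position where the pattern matches."""
    parts = pattern.split()
    spans = []
    for i in range(len(tokens)):
        end = match_at(parts, tokens, i)
        if end is not None:
            spans.append((i, end))
    return spans
```

Because `...` tries the shortest skip first, `for ... for` pairs each `for` with the nearest following `for` rather than the last one in the file.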
Semantic Queries
Structural analysis beyond token patterns:

```bash
# Find proxy functions (wrappers that just delegate)
indexion grep --semantic=proxy src/

# Find long functions (30+ lines)
indexion grep --semantic=long:30 src/

# Find short functions (3 lines or fewer)
indexion grep --semantic=short:3 src/

# Find functions with 4+ parameters
indexion grep --semantic=params-gte:4 src/

# Find functions by name substring
indexion grep --semantic=name:sort src/

# Find undocumented pub declarations (also available as --undocumented)
indexion grep --undocumented src/
indexion grep --semantic=undocumented src/
```
Vector Similarity Search
Find code by natural language description using TF-IDF embeddings
(shared infrastructure with `digest`):

```bash
# Find functions related to "parse JSON configuration"
indexion grep --semantic="similar:parse JSON configuration" src/

# Find tokenization-related code
indexion grep --semantic="similar:tokenize source code into tokens" src/

# Find error handling patterns
indexion grep --semantic="similar:handle error and return" src/
```

Results are ranked by cosine similarity score.
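To make the ranking concrete, here is a small Python sketch of TF-IDF vectors scored by cosine similarity. It illustrates the general technique only. The whitespace tokenizer, the smoothed IDF formula, and the toy corpus are assumptions, and indexion's own embedding details may differ.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple


def tokenize(text: str) -> List[str]:
    """Lowercase whitespace tokenizer (an assumption; real tokenization may differ)."""
    return text.lower().split()


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    if na == 0.0 or nb == 0.0:
        return 0.0
    return dot / (na * nb)


def rank(query: str, corpus: Dict[str, str]) -> List[Tuple[str, float]]:
    """Rank corpus entries against the query, highest cosine score first."""
    names = list(corpus)
    docs = [tokenize(corpus[name]) for name in names]
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    # Smoothed IDF variant (assumed for this sketch).
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vecs = [{t: tf * idf[t] for t, tf in Counter(doc).items()} for doc in docs]
    # Query terms absent from the corpus carry no weight.
    q = {t: tf * idf[t] for t, tf in Counter(tokenize(query)).items() if t in idf}
    scored = [(name, cosine(q, vec)) for name, vec in zip(names, vecs)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

A query that shares several weighted terms with an entry scores higher than one that shares a single common word, which fits the advice to use descriptive phrases.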
Output Control

```bash
# File paths only
indexion grep --files "pub fn *" src/

# Match count per file
indexion grep --count "pub fn *" src/

# Context lines around matches
indexion grep --context=3 "for ... for" src/

# Include/exclude patterns
indexion grep --include='*.mbt' --exclude='*_test.mbt' "pub fn *" src/
```
Options

| Option | Default | Description |
|---|---|---|
| `--semantic` | - | Semantic query (see above) |
| `--undocumented` | false | Find pub declarations without doc comments |
| `--files` | false | Show matching file paths only |
| `--count` | false | Show match count per file only |
| `--context` | 0 | Lines of context around matches |
| `--include` | - | Include file pattern (repeatable) |
| `--exclude` | - | Exclude file pattern (repeatable) |
| | kgfs | KGF specs directory |
Relationship to Other Commands

| Command | Purpose | When to use |
|---|---|---|
| `grep` | Find specific patterns/functions | "Find all nested for loops" |
| | File-level similarity matrix | "What files are similar?" |
| `plan refactor` | Actionable refactoring plan | "What duplicates should I fix?" |
| `plan unwrap` | Proxy function detection + auto-fix | "Remove unnecessary wrappers" |
| `digest` | Purpose-based function index | "Find function that handles X" (requires build step) |

`grep --semantic=proxy` only reports proxy functions, while `plan unwrap` also auto-fixes them. `grep --semantic=similar:...` overlaps with `digest query`, but `digest` requires a build step.

Dogfooding Workflow

```bash
# After writing new code, check for patterns that need attention:

# 1. Find potential O(n²) sorts (nested loops)
indexion grep "for ... for" src/

# 2. Check for undocumented public API
indexion grep --undocumented src/

# 3. Find proxy functions to consider unwrapping
indexion grep --semantic=proxy src/

# 4. Find overly long functions
indexion grep --semantic=long:50 src/

# 5. Search for specific refactoring targets by similarity
indexion grep --semantic="similar:extract substring" src/

# 6. Trace all references to a type before moving it
indexion grep "TypeIdent:TfidfEmbeddingProvider" src/

# 7. Find all sort-related functions across the codebase
indexion grep --semantic=name:sort src/

# 8. Verify a refactoring didn't leave orphan references
indexion grep "Ident:old_function_name" src/ cmd/indexion/
```
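If the first four checks are run after every change, they could be wrapped in a small script. The sketch below is one possible wrapper, using only the commands listed in this workflow. The function names and labels are hypothetical, and it assumes `indexion` is on `PATH`. Exit-code behavior is not documented here, so the script prints output without interpreting it.

```python
import shutil
import subprocess
from typing import List, Tuple


def checklist(src: str = "src/") -> List[Tuple[str, List[str]]]:
    """Return (label, argv) pairs mirroring workflow steps 1-4."""
    return [
        ("nested loops", ["indexion", "grep", "for ... for", src]),
        ("undocumented public API", ["indexion", "grep", "--undocumented", src]),
        ("proxy functions", ["indexion", "grep", "--semantic=proxy", src]),
        ("long functions", ["indexion", "grep", "--semantic=long:50", src]),
    ]


def run_checklist(src: str = "src/") -> None:
    """Run each check in order and print whatever indexion writes to stdout."""
    if shutil.which("indexion") is None:
        raise SystemExit("indexion not found on PATH")
    for label, argv in checklist(src):
        print(f"== {label} ==")
        # check=False: a non-zero exit is not treated as an error in this sketch.
        result = subprocess.run(argv, capture_output=True, text=True, check=False)
        print(result.stdout, end="")
```

Calling `run_checklist("src/")` runs the four searches in sequence. Passing arguments as a list avoids shell quoting issues with patterns such as `for ... for`.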
Dogfooding Lessons
- Use instead of Explore agent: `grep "TypeIdent:X"` is faster and more precise than spawning an agent to search for a type definition.
- Alias resolution is automatic: You don't need to know that `pub` maps to `KW_pub`; just write the keyword as it appears in source code.
- `...` is non-greedy: `for ... for` finds the closest pair of for loops, not the furthest. This is usually what you want for finding nesting.
- Vector search quality: `--semantic="similar:..."` works best with descriptive phrases. "parse JSON configuration" works better than just "json".
- Combine with plan refactor: Use `grep --semantic=name:X` to find all instances before consolidating, then `plan refactor` to verify they're gone.