file-search

File Search Skill

Search code efficiently using ripgrep for text patterns and ast-grep for structural code patterns.

Purpose

The file-search skill provides access to two powerful search tools pre-installed in MassGen environments:
  1. ripgrep (rg): Ultra-fast text search with regex support for finding strings, patterns, and text matches
  2. ast-grep (sg): Syntax-aware structural search for finding code patterns based on abstract syntax trees
Use these tools to understand codebases, find usage patterns, analyze the impact of changes, and locate specific code constructs. On large codebases, both tools are typically much faster than traditional grep or find commands.

When to Use This Skill

Use the file-search skill when:
  • Understanding a new codebase (finding entry points, key classes)
  • Finding all usages of a function, class, or variable before refactoring
  • Locating specific code patterns (error handling, API calls, etc.)
  • Searching for security issues (hardcoded credentials, SQL queries, eval usage)
  • Analyzing dependencies and imports
  • Finding TODOs, FIXMEs, or code comments
Choose ripgrep for:
  • Text-based searches (strings, comments, variable names)
  • Fast, simple pattern matching across many files
  • When the exact code structure doesn't matter
Choose ast-grep for:
  • Structural code searches (function signatures, class definitions)
  • Syntax-aware matching (matching on code structure rather than raw text)
  • Complex refactoring (finding specific code patterns)
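To make the text-versus-structure distinction concrete, here is a minimal sketch that uses Python's standard `re` and `ast` modules as stand-ins for ripgrep and ast-grep. It does not call either tool; the sample source and the helper names `text_matches` and `call_matches` are hypothetical and exist only for illustration.

```python
import ast
import re

# Hypothetical sample: "eval(" appears in a comment, a string, and one real call.
SAMPLE = '''
# never call eval( on user input
message = "eval(x) is dangerous"
result = eval(user_input)
'''


def text_matches(source: str, pattern: str) -> list[int]:
    """Line numbers whose raw text matches the regex (text search, ripgrep-style)."""
    regex = re.compile(pattern)
    return [
        number
        for number, line in enumerate(source.splitlines(), start=1)
        if regex.search(line)
    ]


def call_matches(source: str, func_name: str) -> list[int]:
    """Line numbers of actual calls to func_name, found by walking the syntax tree."""
    tree = ast.parse(source)
    return sorted(
        node.lineno
        for node in ast.walk(tree)
        if isinstance(node, ast.Call)
        and isinstance(node.func, ast.Name)
        and node.func.id == func_name
    )


if __name__ == "__main__":
    print(text_matches(SAMPLE, r"\beval\("))  # [2, 3, 4]: comment, string, and call
    print(call_matches(SAMPLE, "eval"))       # [4]: only the real call
```

The text search is the right choice when comments and strings matter (for example, finding TODOs); the structural search is the right choice when only real code constructs should match.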

Invoking Search Tools

In MassGen, use the execute_command tool to run ripgrep and ast-grep:
python
# Using ripgrep
execute_command("rg 'pattern' --type py src/")

# Using ast-grep
execute_command("sg --pattern 'class $NAME { $$$ }' --lang js")

Both tools are pre-installed in MassGen Docker containers and available via shell execution.
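Because the command is passed as a single shell string, patterns containing spaces, quotes, or `$` need careful quoting. One possible approach, sketched below, is to build the command as an argument list and let Python's standard `shlex.join` do the quoting. The helper name `shell_command` is hypothetical, and the sketch assumes only that `execute_command` accepts a command string, as in the example above.

```python
import shlex


def shell_command(argv: list[str]) -> str:
    """Join an argument list into one safely quoted shell command string."""
    return shlex.join(argv)


# Plain arguments are left unquoted.
rg_cmd = shell_command(["rg", "pattern", "--type", "py", "src/"])

# The ast-grep pattern contains spaces and "$", so it is wrapped in single quotes,
# which stops the shell from expanding $NAME as a variable.
sg_cmd = shell_command(["sg", "--pattern", "class $NAME { $$$ }", "--lang", "js"])

if __name__ == "__main__":
    print(rg_cmd)  # rg pattern --type py src/
    print(sg_cmd)  # sg --pattern 'class $NAME { $$$ }' --lang js
    # In MassGen, the resulting string would then be handed to the tool, e.g.:
    # execute_command(sg_cmd)
```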

Targeting Your Searches

CRITICAL: Always start with targeted, narrow searches to avoid overwhelming results. Getting thousands of matches makes analysis impossible and wastes tokens.
These strategies apply to both ripgrep and ast-grep.

Scope-Limiting Strategies

Apply these strategies from the start to target searches effectively:
  1. Specify File Types/Languages: Always filter by language
    bash
    # Ripgrep
    rg "function" --type py --type js
    
    # AST-grep
    sg --pattern 'function $NAME($$$) { $$$ }' --lang js
  2. Target Specific Directories: Search in likely locations first
    bash
    # Ripgrep
    rg "LoginService" src/services/
    
    # AST-grep
    sg --pattern 'class LoginService { $$$ }' --lang js src/services/
  3. Use Specific Patterns: Make patterns as specific as possible
    bash
    # Ripgrep: BAD - too broad
    rg "user"
    # Ripgrep: GOOD - more specific
    rg "class.*User.*Service" --type py
    
    # AST-grep: BAD - too broad
    sg --pattern '$X'
    # AST-grep: GOOD - more specific
    sg --pattern 'class $NAME extends UserService { $$$ }' --lang js
  4. Limit Result Count: Use head to cap output, or --count to check totals first
    bash
    # Ripgrep
    rg "import" --type py | head -20
    rg "TODO" --count
    
    # AST-grep
    sg --pattern 'import $X from $Y' --lang js | head -20
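The scope-limiting strategies above can be combined in one place. The following is a minimal sketch of a helper that assembles a scoped ripgrep command string; the function name `scoped_rg` and its parameters are hypothetical, and only standard ripgrep flags (`-w`, `--type`, `--glob`, `-e`) are used.

```python
import shlex
from typing import Sequence


def scoped_rg(
    pattern: str,
    types: Sequence[str] = (),
    paths: Sequence[str] = (),
    exclude: Sequence[str] = (),
    whole_word: bool = False,
) -> str:
    """Build an rg command string with scope-limiting flags applied."""
    argv = ["rg"]
    if whole_word:
        argv.append("-w")
    for file_type in types:
        argv += ["--type", file_type]      # strategy 1: filter by file type
    for glob in exclude:
        argv += ["--glob", f"!{glob}"]     # a leading "!" negates the glob
    # "-e" marks the pattern explicitly, so a pattern starting with "-" is not read as a flag.
    argv += ["-e", pattern]                # strategy 3: a specific pattern
    argv += list(paths)                    # strategy 2: target directories
    return shlex.join(argv)


if __name__ == "__main__":
    print(scoped_rg("LoginService", types=["py"], paths=["src/services/"]))
    print(scoped_rg("class.*User.*Service", types=["py"], exclude=["tests"]))
```

To cap the output as in strategy 4, a suffix such as `| head -20` can be appended to the returned string before it is executed.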

Progressive Search Refinement

When exploring unfamiliar code, use this workflow:
Ripgrep example:
bash
# Step 1: Count matches to assess scope
rg "pattern" --count --type py

# Step 2: If too many results, add more filters
rg "pattern" --type py src/ --glob '!tests'

# Step 3: Show limited results to inspect
rg "pattern" --type py src/ | head -30

# Step 4: Once confirmed, get full results or target further
rg "pattern" --type py src/specific_module/

AST-grep example:
bash
# Step 1: Assess scope with broad structural pattern
sg --pattern 'function $NAME($$$) { $$$ }' --lang js | head -10

# Step 2: If too many results, narrow to specific directory
sg --pattern 'function $NAME($$$) { $$$ }' --lang js src/

# Step 3: Make pattern more specific
sg --pattern 'async function $NAME($$$) { $$$ }' --lang js src/

# Step 4: Target exact location
sg --pattern 'async function $NAME($$$) { $$$ }' --lang js src/services/
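Step 1 of the ripgrep workflow can be automated by summing the output of `rg --count`, which prints one `path:count` line per file that has matches. The sketch below is one possible way to do that; the helper names, the sample output, and the threshold of 50 are hypothetical choices, not part of the skill.

```python
def total_matches(count_output: str) -> int:
    """Sum the per-file counts from `rg --count` output (lines like `path:3`)."""
    total = 0
    for line in count_output.splitlines():
        line = line.strip()
        if not line:
            continue
        # The count is the last colon-separated field; paths may contain colons.
        total += int(line.rsplit(":", 1)[-1])
    return total


def next_step(total: int, limit: int = 50) -> str:
    """Suggest whether to view results now or narrow the search first."""
    return "view results" if total <= limit else "narrow the search"


if __name__ == "__main__":
    # Hypothetical output captured from: rg "pattern" --count --type py
    sample = "src/app.py:12\nsrc/services/auth.py:40\nsrc/utils.py:3\n"
    total = total_matches(sample)
    print(total, "->", next_step(total))  # 55 -> narrow the search
```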

When You Get Too Many Results

If a search returns hundreds of matches (applies to both rg and sg):
  1. Add file type/language filters: --type py (rg) or --lang python (sg)
  2. Narrow directory scope: Search src/ instead of . (the current directory)
  3. Make pattern more specific: Add context around the pattern
  4. Use word boundaries: -w flag for whole words only (rg)
  5. Pipe to head: Limit output with | head -50
  6. Exclude test files: --glob '!*test*' (rg) or avoid test directories (sg)
Example of refinement:
bash
# Step 1: Too broad (10,000+ matches)
rg "error"

# Step 2: Add file type (1,000 matches)
rg "error" --type py

# Step 3: Add directory scope (200 matches)
rg "error" --type py src/

# Step 4: Make pattern specific (20 matches)
rg "raise.*Error" --type py src/

# Step 5: Target exact location (5 matches)
rg "raise.*Error" --type py src/services/
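To illustrate why whole-word matching cuts down results, the sketch below uses Python's `re` module with `\b` word boundaries, which is roughly what ripgrep's `-w` flag does. It does not run ripgrep, and the sample lines are hypothetical.

```python
import re

# Hypothetical lines from a codebase.
lines = [
    "raise ValueError('bad input')",
    "def error_handler(exc):",
    "log.error('failed')",
    "errors = []",
]


def matching(pattern: str) -> list[str]:
    """Return the sample lines that the regex matches anywhere."""
    return [line for line in lines if re.search(pattern, line)]


# Plain substring-style search: matches "error" inside longer identifiers too.
substring_hits = matching(r"error")

# Whole-word search: "error" must not touch other word characters
# (letters, digits, or underscore), so error_handler and errors are skipped.
whole_word_hits = matching(r"\berror\b")

if __name__ == "__main__":
    print(len(substring_hits))  # 3
    print(whole_word_hits)      # ["log.error('failed')"]
```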

How to Use

Ripgrep (rg)

bash
# Basic text search
rg "pattern" --type py --type js

# Common flags
-i              # Case-insensitive
-w              # Match whole words only
-l              # Show only filenames
-n              # Show line numbers
-C 3            # Show 3 lines of context
--count         # Count matching lines per file
--glob '!dir'   # Exclude directory

# Examples
rg "function.*login" --type js src/
rg -i "TODO" --count
rg "auth|login|session" --type py

AST-Grep (sg)

bash
# Structural code search
sg --pattern 'function $NAME($$$) { $$$ }' --lang js

# Metavariables
$VAR    # Matches single AST node
$$$     # Matches zero or more nodes

# Examples
sg --pattern 'class $NAME { $$$ }' --lang js
sg --pattern 'import $X from $Y' --lang js
sg --pattern 'async function $NAME($$$) { $$$ }' --lang js src/

Common Search Patterns

bash
# Security issues
rg -i "password\s*=\s*['\"]" --type py
rg "\beval\(" --type js

# TODOs and comments
rg "TODO|FIXME|HACK"

# Code structures
sg --pattern 'class $NAME { $$$ }' --lang js
sg --pattern 'try { $$$ } catch ($E) { $$$ }' --lang js

# Dependencies
rg "from requests import" --type py
rg "require\(['\"]" --type js

# Refactoring
rg "\.old_method\(" --type py
rg "@deprecated" -A 5
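Before running a security pattern across a repository, it can help to check it against a few known lines. The sketch below tests the password and eval patterns with Python's `re` module, assuming these simple patterns behave the same way in ripgrep's regex engine; the sample lines are hypothetical.

```python
import re

# Same patterns as the ripgrep security searches (-i becomes re.IGNORECASE).
PASSWORD_ASSIGNMENT = re.compile(r"password\s*=\s*['\"]", re.IGNORECASE)
EVAL_CALL = re.compile(r"\beval\(")

# Hypothetical lines: some should match, some are deliberate near-misses.
samples = [
    'PASSWORD = "hunter2"',                  # hardcoded credential
    "password = os.environ['DB_PASSWORD']",  # read from environment, not a literal
    "result = eval(expr)",                   # real eval call
    "value = safe_eval(expr)",               # different function name
    "retrieval(items)",                      # merely ends in "eval("
]

password_hits = [s for s in samples if PASSWORD_ASSIGNMENT.search(s)]
eval_hits = [s for s in samples if EVAL_CALL.search(s)]

if __name__ == "__main__":
    print(password_hits)  # ['PASSWORD = "hunter2"']
    print(eval_hits)      # ['result = eval(expr)']
```

The `\b` in the eval pattern is what keeps `safe_eval(` and `retrieval(` out of the results; without it both would match.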

File Type Filters

Common ripgrep file types: py, js, ts, rust, go, java, c, cpp, html, css, json, yaml, md
Use --type-list to see all available types, or define custom types:
bash
rg --type-add 'config:*.{yml,yaml,toml,ini}' --type config "pattern"

Performance Tips

See the "Targeting Your Searches" section for comprehensive strategies. Key tips:
bash
# Limit scope to specific directories
rg "pattern" src/

# Filter by file type
rg "pattern" --type py --type js

# Exclude large directories
rg "pattern" --glob '!{node_modules,venv,.git}'

# Use fixed strings (no regex) for speed
rg -F "exact string"

# Count before viewing full results
rg "pattern" --count --type py
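Fixed-string search also changes what matches, not only how fast the search runs: with `-F`, characters such as `.` are taken literally instead of as regex operators. The sketch below shows the difference using Python's `re` module and plain substring checks as stand-ins for ripgrep; the sample lines are hypothetical.

```python
import re

# Hypothetical lines; only the first contains the literal text "config.yaml".
lines = ["config.yaml loaded", "config_yaml = None", "configXyaml"]
needle = "config.yaml"

# Regex search: "." matches any character, so all three lines match.
regex_hits = [line for line in lines if re.search(needle, line)]

# Literal search (what -F does): only the exact text matches.
fixed_hits = [line for line in lines if needle in line]

# Escaping the pattern gives the same result as a literal search.
escaped_hits = [line for line in lines if re.search(re.escape(needle), line)]

if __name__ == "__main__":
    print(len(regex_hits))  # 3
    print(fixed_hits)       # ['config.yaml loaded']
    print(escaped_hits)     # ['config.yaml loaded']
```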

Best Practices

  1. Start Narrow, Then Broaden: Use specific patterns, file types, and directory scope from the start
  2. Count Before Viewing: Use --count or | head -N to preview result volume
  3. Always Specify File Types: Use --type (rg) or --lang (sg) to filter by language
  4. Exclude Common Noise: Add --glob '!{node_modules,venv,.git,dist,build}' habitually
  5. Combine Tools: Use rg for text patterns, sg for structural code patterns
  6. Use Context Strategically: Add -C N for surrounding lines, but be mindful of output volume

Troubleshooting

  • No matches found: Check file type filters, try -i for case-insensitive matching, or search for part of the pattern first
  • Too slow: Exclude directories with --glob, limit file types with --type, narrow the search path
  • AST-grep issues: Verify --lang is correct, try a simpler pattern, use rg to verify the code exists

Resources
