jcodemunch-mcp-code-retrieval

jCodeMunch MCP - Structured Code Retrieval

Skill by ara.so — MCP Skills collection.
jCodeMunch is an MCP server that indexes codebases using tree-sitter AST parsing and enables structured retrieval of code symbols (functions, classes, methods, constants) with byte-level precision. It cuts code-reading token usage by 95%+ by letting agents retrieve exact implementations instead of reading entire files.

Core Concept

Traditional approach: open files → scan thousands of lines → repeat (token incinerator).
jCodeMunch approach: index once → query cheaply → retrieve exact symbols (95%+ token savings).
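
To make "index once, retrieve exact symbols" concrete, here is a minimal sketch that uses Python's standard-library `ast` module as a stand-in for tree-sitter. The helpers `build_index` and `get_symbol` are hypothetical and are not part of jCodeMunch; for brevity the sketch records line spans rather than byte offsets.

```python
import ast

SOURCE = '''\
def get_user(user_id):
    return {"id": user_id}

class UserService:
    def authenticate(self, token):
        return token == "secret"
'''

def build_index(source: str) -> dict[str, tuple[int, int]]:
    """Map qualified symbol names to (start_line, end_line), 1-based inclusive."""
    index: dict[str, tuple[int, int]] = {}

    def visit(node: ast.AST, prefix: str) -> None:
        for child in ast.iter_child_nodes(node):
            if isinstance(child, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
                name = f"{prefix}{child.name}"
                index[name] = (child.lineno, child.end_lineno)
                visit(child, f"{name}.")  # pick up methods and nested defs

    visit(ast.parse(source), "")
    return index

def get_symbol(source: str, index: dict[str, tuple[int, int]], name: str) -> str:
    """Return only the lines that belong to one symbol."""
    start, end = index[name]
    return "\n".join(source.splitlines()[start - 1:end])

index = build_index(SOURCE)                      # index once
snippet = get_symbol(SOURCE, index, "UserService.authenticate")  # query cheaply
print(snippet)
```

The caller receives two lines instead of the whole file, which is the effect the token-savings claim relies on.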

Installation

Quick Install (VS Code, Cursor, Claude Code)

**VS Code:**
```bash
# One-click via badge or manual:
uvx jcodemunch-mcp
```

**Cursor:**
Use the one-click install badge or add to MCP settings:
```json
{
  "mcpServers": {
    "jcodemunch": {
      "command": "uvx",
      "args": ["jcodemunch-mcp"]
    }
  }
}
```

**Claude Code (CLI):**
```bash
# Install via uvx
uvx jcodemunch-mcp

# Or use the jcm CLI helper
jcm install claude-code
```

Universal Install (Any MCP-compatible client)

Add to your MCP configuration file:
```json
{
  "mcpServers": {
    "jcodemunch": {
      "command": "uvx",
      "args": ["jcodemunch-mcp"]
    }
  }
}
```

CLI Installation

```bash
# Install the jcm CLI
pip install jcodemunch-mcp

# Or use uvx
uvx jcodemunch-mcp
```

Configuration

Create `.jcodemunch/config.jsonc` in your project root:
```jsonc
{
  // Index settings
  "index_path": ".jcodemunch/index",
  "excluded_patterns": [
    "node_modules/**",
    "venv/**",
    ".git/**",
    "*.pyc",
    "__pycache__/**"
  ],

  // Token budget defaults
  "default_token_budget": 8000,
  "max_token_budget": 16000,

  // Compact format (MUNCH) - saves 45.5% tokens on average
  "compact_format_enabled": true,
  "compact_format_threshold": 0.15, // Use compact if ≥15% savings

  // Semantic search (optional, requires sentence-transformers)
  "semantic_search_enabled": false,
  "embedding_model": "sentence-transformers/all-MiniLM-L6-v2",

  // Tool configuration
  "disabled_tools": [],  // Disable specific tools if needed

  // Language support
  "supported_languages": [
    "python", "javascript", "typescript", "go", "rust",
    "java", "c", "cpp", "c_sharp", "ruby", "php"
  ]
}
```
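
Python's `json` module rejects comments, so a script that wants to inspect this file has to strip them first. The sketch below is one way to do that; it only handles `//` line comments (not `/* */` blocks or trailing commas). The function names and the budget sanity check are assumptions for illustration and say nothing about how jCodeMunch itself parses its config.

```python
import json

def strip_jsonc_comments(text: str) -> str:
    """Remove // line comments that appear outside of JSON strings."""
    out = []
    in_string = False
    escaped = False
    i = 0
    while i < len(text):
        ch = text[i]
        if in_string:
            out.append(ch)
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
            out.append(ch)
        elif text.startswith("//", i):
            # Skip to the end of the line; the newline itself is kept.
            while i < len(text) and text[i] != "\n":
                i += 1
            continue
        else:
            out.append(ch)
        i += 1
    return "".join(out)

def load_config(text: str) -> dict:
    """Parse JSONC text and apply a hypothetical budget sanity check."""
    config = json.loads(strip_jsonc_comments(text))
    if config["default_token_budget"] > config["max_token_budget"]:
        raise ValueError("default_token_budget exceeds max_token_budget")
    return config

EXAMPLE = '''
{
  // Token budget defaults
  "default_token_budget": 8000,
  "max_token_budget": 16000,
  "embedding_model": "sentence-transformers/all-MiniLM-L6-v2", // model id
  "excluded_patterns": ["node_modules/**", ".git/**"]
}
'''

config = load_config(EXAMPLE)
print(config["excluded_patterns"])
```

Tracking whether the scanner is inside a string matters here: glob patterns and model ids contain slashes, and a naive regex would risk truncating them.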

Core Tools & Usage

1. Indexing

**Index a repository:**
```python
# Via MCP tool call
index_repository(
    path="/path/to/repo",
    force_reindex=False  # Set True to rebuild from scratch
)
```

**Check index status:**
```python
get_index_info()
# Returns: stats, file count, symbol count, last indexed time
```

2. Symbol Search

**Find symbols by name (BM25 + fuzzy matching):**
```python
find_symbols(
    query="get_user",
    limit=10,
    file_pattern="*.py",  # Optional filter
    format="auto"  # auto|compact|json
)
```

**Find implementations (multi-source resolution):**
```python
find_implementations(
    symbol="UserService.authenticate",
    include_lsp=True,  # LSP dispatch
    include_hierarchy=True,  # Class hierarchy
    include_duck_typed=True,  # Duck-typed matches
    include_decorators=True  # Decorator handlers
)
```
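
For readers unfamiliar with BM25, the sketch below ranks symbol names with the standard Okapi BM25 formula (using the common `log(... + 1)` IDF variant). It is an illustration only: the tokenizer, the parameter defaults `k1=1.5` and `b=0.75`, and the function names are assumptions, fuzzy matching is left out, and nothing here is taken from jCodeMunch's implementation.

```python
import math
import re
from collections import Counter

def tokenize(name: str) -> list[str]:
    """Split snake_case, dotted, and camelCase names into lowercase terms."""
    spaced = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", name)
    return [t.lower() for t in re.split(r"[^A-Za-z0-9]+", spaced) if t]

def bm25_rank(query: str, symbols: list[str],
              k1: float = 1.5, b: float = 0.75) -> list[tuple[str, float]]:
    """Score every symbol name against the query, best match first."""
    docs = [tokenize(s) for s in symbols]
    n = len(docs)
    avg_len = sum(len(d) for d in docs) / n
    # Document frequency: in how many symbol names each term occurs.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for symbol, doc in zip(symbols, docs):
        tf = Counter(doc)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log((n - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            norm = tf[term] + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append((symbol, score))
    return sorted(scores, key=lambda pair: pair[1], reverse=True)

symbols = ["get_user", "get_user_profile", "delete_user",
           "UserService.authenticate", "parse_config"]
ranked = bm25_rank("get_user", symbols)
for name, score in ranked:
    print(f"{score:.3f}  {name}")
```

Rare terms ("get") contribute more than common ones ("user"), and shorter names are favored when term counts are equal, which is why `get_user` outranks `get_user_profile` in this toy corpus.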

3. Context Retrieval

**Get exact function/class implementation:**
```python
get_symbol_content(
    identifier="UserService.authenticate",
    include_docstring=True,
    include_decorators=True,
    format="auto"
)
```

**Get context bundle with token budget:**
```python
get_ranked_context(
    query="authentication logic",
    token_budget=4000,
    include_imports=True,
    include_references=True,
    format="auto"
)
# Returns: ranked symbols within budget, total tokens used
```

**Task-aware context orchestration:**
```python
assemble_task_context(
    task="fix bug in user authentication where tokens expire too early",
    token_budget=8000,
    session_id="fix-auth-bug-001"  # Optional session tracking
)
# Auto-classifies intent (bug_fix, feature, refactor, etc.)
# Extracts anchor symbols from task description
# Runs appropriate tool sequence under budget
```
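
The idea of "ranked symbols within budget" can be pictured as a greedy packing step. The sketch below assumes each candidate already has a relevance score and a token estimate; the `Candidate` class, the scores, and the greedy strategy are hypothetical and may differ from what `get_ranked_context` actually does.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    name: str
    score: float  # relevance, higher is better
    tokens: int   # estimated token cost of the symbol body

def pack_within_budget(candidates: list[Candidate],
                       token_budget: int) -> tuple[list[Candidate], int]:
    """Greedily keep the highest-scoring symbols that still fit the budget."""
    selected: list[Candidate] = []
    used = 0
    for cand in sorted(candidates, key=lambda c: c.score, reverse=True):
        if used + cand.tokens <= token_budget:
            selected.append(cand)
            used += cand.tokens
    return selected, used

candidates = [
    Candidate("AuthService.login", 0.92, 1800),
    Candidate("TokenStore.refresh", 0.85, 2500),
    Candidate("hash_password", 0.40, 300),
    Candidate("UserRepo.find", 0.61, 900),
]
selected, used = pack_within_budget(candidates, token_budget=4000)
print([c.name for c in selected], used)
```

Note that a large, high-scoring symbol can be skipped while smaller, lower-scoring ones are kept; a production ranker might instead fall back to a signature-only view of the skipped symbol.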

4. Structural Queries

**Find references to a symbol:**
```python
find_references(
    identifier="get_user",
    include_calls=True,
    include_imports=True,
    format="auto"
)
```

**Find who imports a module/symbol:**
```python
find_importers(
    target="services.auth",
    format="auto"
)
```

**Get blast radius (impact analysis):**
```python
get_blast_radius(
    identifier="User.email",
    max_depth=3,
    include_source=True,  # Include source snippets
    format="auto"
)
```

**Class hierarchy:**
```python
get_class_hierarchy(
    class_name="BaseModel",
    direction="both",  # up|down|both
    format="auto"
)
```
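
Conceptually, a blast radius with `max_depth` is a depth-limited breadth-first walk over "is used by" edges. The sketch below shows that walk on a hand-written graph; the `dependents` mapping and the `blast_radius` helper are illustrative assumptions, not the tool's internals.

```python
from collections import deque

def blast_radius(dependents: dict[str, list[str]], start: str,
                 max_depth: int) -> dict[str, int]:
    """Return {symbol: depth} for everything reachable from `start`
    via 'is used by' edges, up to max_depth hops."""
    depths: dict[str, int] = {}
    queue = deque([(start, 0)])
    seen = {start}
    while queue:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand beyond the requested depth
        for user in dependents.get(node, []):
            if user not in seen:
                seen.add(user)
                depths[user] = depth + 1
                queue.append((user, depth + 1))
    return depths

# Hypothetical reverse-dependency graph: key is used by each listed symbol.
dependents = {
    "User.email": ["send_welcome_mail", "UserSerializer.to_dict"],
    "send_welcome_mail": ["SignupHandler.post"],
    "UserSerializer.to_dict": ["ProfileView.get", "SignupHandler.post"],
    "SignupHandler.post": ["router"],
}
impact = blast_radius(dependents, "User.email", max_depth=2)
print(impact)
```

Because the walk is breadth-first, each symbol is reported at its shortest distance from the changed symbol, which is a reasonable proxy for how directly it is affected.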

5. Code Quality & Refactoring

**Find dead code:**
```python
find_dead_code(
    scope="all",  # all|file|module
    include_private=True,
    format="auto"
)
```

**Find untested symbols:**
```python
get_untested_symbols(
    scope="all",
    min_complexity=5,  # Focus on complex code
    format="auto"
)
```

**Find similar/duplicate code:**
```python
find_similar_symbols(
    threshold=0.8,  # Similarity threshold (0-1)
    min_cluster_size=2,
    use_semantic=True,  # Requires semantic search enabled
    use_structural=True,
    use_behavioral=True,  # Callee Jaccard
    format="auto"
)
# Returns: clusters with canonical pick + consolidation verdict
```

**Check if safe to delete:**
```python
check_delete_safe(
    identifier="legacy_auth_handler",
    format="auto"
)
# Returns: composite verdict + ranked blockers + recommended action
```
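
The "Callee Jaccard" signal mentioned for `use_behavioral` can be illustrated with plain set arithmetic: two functions look behaviorally similar when the sets of functions they call overlap heavily. The sketch below is an assumption about the general technique, with made-up callee sets, and does not reproduce how jCodeMunch combines it with the semantic and structural signals.

```python
from itertools import combinations

def jaccard(a: set[str], b: set[str]) -> float:
    """|A ∩ B| / |A ∪ B|, defined as 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def similar_pairs(callees: dict[str, set[str]],
                  threshold: float) -> list[tuple[str, str, float]]:
    """Return every pair of symbols whose callee sets meet the threshold."""
    pairs = []
    for left, right in combinations(sorted(callees), 2):
        score = jaccard(callees[left], callees[right])
        if score >= threshold:
            pairs.append((left, right, score))
    return pairs

# Hypothetical callee sets extracted from an index.
callees = {
    "create_user": {"validate", "hash_password", "db.insert", "log"},
    "register_user": {"validate", "hash_password", "db.insert", "log", "send_mail"},
    "delete_user": {"db.delete", "log"},
}
pairs = similar_pairs(callees, threshold=0.8)
print(pairs)
```

Here `create_user` and `register_user` share four of five distinct callees (0.8), so they would be flagged as consolidation candidates, while `delete_user` would not.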

6. Architectural Analysis

**Get symbol importance (PageRank):**
```python
get_symbol_importance(
    limit=20,
    scope="all",  # all|file|module
    format="auto"
)
```

**Repository overview map (cold-start orientation):**
```python
get_repo_map(
    token_budget=4000,
    signature_only=True,  # Just signatures, not full implementations
    format="auto"
)
# Returns: PageRank-ranked symbol overview within budget
```

**Dependency cycles:**
```python
get_dependency_cycles(
    format="auto"
)
```

**Hotspot detection (complexity × churn):**
```python
get_hotspots(
    min_complexity=10,
    days_back=90,
    format="auto"
)
```
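
As background for the PageRank-based importance score, here is a small power-iteration sketch over a call graph. The graph, the damping factor of 0.85, and the fixed iteration count are illustrative assumptions; the edge types and parameters jCodeMunch uses are not documented here.

```python
def pagerank(graph: dict[str, list[str]], damping: float = 0.85,
             iterations: int = 50) -> dict[str, float]:
    """Power-iteration PageRank. graph[u] lists the symbols that u references."""
    nodes = set(graph) | {v for targets in graph.values() for v in targets}
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes with no outgoing edges is spread evenly.
        dangling = sum(rank[u] for u in nodes if not graph.get(u))
        new_rank = {node: (1 - damping) / n + damping * dangling / n
                    for node in nodes}
        for u, targets in graph.items():
            for v in targets:
                new_rank[v] += damping * rank[u] / len(targets)
        rank = new_rank
    return rank

# Hypothetical call graph: caller -> callees.
calls = {
    "api.login": ["AuthService.authenticate"],
    "api.refresh": ["AuthService.authenticate", "TokenStore.get"],
    "AuthService.authenticate": ["TokenStore.get"],
}
ranks = pagerank(calls)
top = max(ranks, key=ranks.get)
print(top, round(ranks[top], 3))
```

Symbols that many important symbols depend on accumulate rank, so a low-level helper such as `TokenStore.get` can outrank the entry points that call it.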

7. Multi-Repo Operations

**Cross-repo API contracts:**
```python
get_group_contracts(
    repos=["/path/to/repo1", "/path/to/repo2"],
    min_shared=2,  # Minimum repos sharing symbol
    format="auto"
)
# Returns: ranked contracts classified as:
# - de_facto_api (stable, high usage)
# - leaky_internal (unintended exposure)
# - dead_contract (no runtime hits)
# - version_skew (inconsistent across repos)
```
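
The `min_shared` filter can be read as "keep symbols that appear in at least this many repos". The sketch below shows only that counting step on invented data; the `shared_symbols` helper is hypothetical, and the classification into `de_facto_api`, `leaky_internal`, and so on is not attempted.

```python
from collections import defaultdict

def shared_symbols(repo_symbols: dict[str, set[str]],
                   min_shared: int) -> dict[str, list[str]]:
    """Map each symbol to the sorted repos using it, keeping only symbols
    that appear in at least `min_shared` repos."""
    users: dict[str, set[str]] = defaultdict(set)
    for repo, symbols in repo_symbols.items():
        for symbol in symbols:
            users[symbol].add(repo)
    return {
        symbol: sorted(repos)
        for symbol, repos in sorted(users.items())
        if len(repos) >= min_shared
    }

# Hypothetical per-repo symbol usage.
repo_symbols = {
    "/path/to/repo1": {"billing.charge", "utils.retry"},
    "/path/to/repo2": {"billing.charge", "internal._cache"},
}
shared = shared_symbols(repo_symbols, min_shared=2)
print(shared)
```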

8. Git Integration

**Get changed symbols from git diff:**
```python
get_changed_symbols(
    base_ref="main",
    head_ref="feature-branch",
    format="auto"
)
```
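
One plausible way to turn a diff into symbols is to intersect the changed line ranges with the line span of each indexed symbol. The sketch below assumes the spans and ranges are already available (for example, ranges parsed from the hunk headers of `git diff --unified=0`); the `symbols_touched` helper and the sample data are hypothetical.

```python
def symbols_touched(symbol_spans: dict[str, tuple[int, int]],
                    changed_ranges: list[tuple[int, int]]) -> list[str]:
    """Return symbols whose inclusive line span overlaps any changed range."""
    return sorted(
        name
        for name, (start, end) in symbol_spans.items()
        if any(start <= hi and lo <= end for lo, hi in changed_ranges)
    )

# Hypothetical index entries (start_line, end_line) and diff hunks (lo, hi).
symbol_spans = {
    "get_user": (1, 12),
    "UserService.authenticate": (20, 48),
    "parse_config": (60, 75),
}
changed_ranges = [(10, 14), (45, 45)]
touched = symbols_touched(symbol_spans, changed_ranges)
print(touched)
```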

Real-World Workflows

Workflow 1: Fix a Bug

```python
# 1. Get task-oriented context
assemble_task_context(
    task="fix NullPointerException in payment processing",
    token_budget=6000,
    session_id="fix-payment-bug"
)

# 2. Find related code
find_symbols(
    query="payment process",
    limit=5
)

# 3. Get implementation + references
get_symbol_content(identifier="PaymentService.process")
find_references(identifier="PaymentService.process")

# 4. Check blast radius before fix
get_blast_radius(
    identifier="PaymentService.process",
    max_depth=2,
    include_source=True
)
```

Workflow 2: Refactor Dead Code

```python
# 1. Find dead code
dead_symbols = find_dead_code(scope="all", include_private=True)

# 2. For each dead symbol, verify safety
check_delete_safe(identifier="legacy_user_handler")

# 3. Get blast radius to confirm
get_blast_radius(identifier="legacy_user_handler", max_depth=1)

# 4. Find similar code that might be consolidated
find_similar_symbols(threshold=0.85, min_cluster_size=2)
```

Workflow 3: Understand New Codebase

```python
# 1. Get high-level overview
get_repo_map(token_budget=3000, signature_only=True)

# 2. Find most important symbols
get_symbol_importance(limit=15)

# 3. Explore class hierarchies
get_class_hierarchy(class_name="BaseController", direction="down")

# 4. Get architectural overview
get_dependency_cycles()
get_hotspots(min_complexity=8, days_back=30)
```

Workflow 4: Feature Implementation

```python
# 1. Find where similar features are implemented
find_symbols(query="user authentication", limit=10)

# 2. Get ranked context for the task
assemble_task_context(
    task="add OAuth2 authentication alongside existing password auth",
    token_budget=8000
)

# 3. Find all authentication-related symbols
get_ranked_context(
    query="authentication oauth password",
    token_budget=5000,
    include_imports=True
)

# 4. Check who will be affected
find_importers(target="auth.handlers")
```

Compact Format (MUNCH)

All tools accept a `format` parameter:
  • `auto` - Use compact if ≥15% savings, else JSON (default)
  • `compact` - Always use compact format (45.5% avg token savings)
  • `json` - Always use JSON (backwards compatible)

Example savings on `get_blast_radius`:
  • JSON: 3,850 tokens
  • Compact: 700 tokens (5.5× reduction)
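
The `auto` rule reduces to a single comparison. The sketch below reproduces the arithmetic using the `get_blast_radius` figures quoted in this section; `choose_format` is a hypothetical helper, and the 0.15 default mirrors `compact_format_threshold` from the configuration example.

```python
def choose_format(json_tokens: int, compact_tokens: int,
                  threshold: float = 0.15) -> str:
    """Pick "compact" when it saves at least `threshold` of the JSON token count."""
    if json_tokens <= 0:
        return "json"
    savings = 1 - compact_tokens / json_tokens
    return "compact" if savings >= threshold else "json"

# Figures quoted in this section for get_blast_radius.
json_tokens, compact_tokens = 3850, 700
savings = 1 - compact_tokens / json_tokens   # fraction of tokens saved
ratio = json_tokens / compact_tokens         # "N× reduction"

print(choose_format(json_tokens, compact_tokens))
print(f"{savings:.1%} saved, {ratio:.1f}x reduction")
```

A 5.5× reduction corresponds to roughly 82% savings, well above the 15% threshold, so `auto` would select the compact format for this response.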

Advanced Features

Semantic Search (Optional)

**Enable in config:**
```jsonc
{
  "semantic_search_enabled": true,
  "embedding_model": "sentence-transformers/all-MiniLM-L6-v2"
}
```

**Install dependencies:**
```bash
pip install sentence-transformers torch
```

**Use hybrid search:**
```python
find_symbols(
    query="handles user authentication with tokens",
    use_semantic=True,  # Combines BM25 + embeddings
    limit=10
)
```

Custom Context Providers

**Add dbt or Git context:**
```python
# dbt models as context
get_dbt_context(
    model_name="fct_orders",
    include_upstream=True,
    include_downstream=True
)

# Git context
get_changed_symbols(base_ref="main", head_ref="HEAD")
```

Session Management

**Track multi-turn context:**
```python
plan_turn(
    session_id="feature-impl-001",
    task="implement OAuth2",
    budget=10000
)

assemble_task_context(
    task="add Google OAuth provider",
    session_id="feature-impl-001",
    token_budget=6000
)
```

Language Support

Fully supported via tree-sitter:
  • Python, JavaScript, TypeScript, Go, Rust
  • Java, C, C++, C#, Ruby, PHP
  • And more (see `LANGUAGE_SUPPORT.md`)

Troubleshooting

Index not found

```python
# Force reindex
index_repository(path=".", force_reindex=True)
```

Semantic search not working

```bash
# Install required dependencies
pip install sentence-transformers torch
```

Too many results

```python
# Use stricter filters
find_symbols(
    query="handler",
    file_pattern="**/controllers/*.py",
    limit=5
)
```

Token budget exceeded

```python
# Use smaller budget or signature-only mode
get_repo_map(token_budget=2000, signature_only=True)
```

Performance issues

```jsonc
// Add to config.jsonc
{
  "excluded_patterns": [
    "node_modules/**",
    "venv/**",
    "dist/**",
    "build/**",
    ".git/**"
  ]
}
```

CLI Commands (jcm)

```bash
# Install client-specific config
jcm install claude-code
jcm install cursor
jcm install windsurf

# Index current directory
jcm index .

# Search symbols
jcm search "UserService"

# Get symbol info
jcm info "UserService.authenticate"

# Check config
jcm config --validate
```

Best Practices

  1. Index early: Run `index_repository()` before starting work
  2. Use token budgets: Always specify a budget for context retrieval
  3. Enable compact format: Set `format="auto"` for 45%+ token savings
  4. Filter aggressively: Use `file_pattern` and `scope` to narrow results
  5. Use task context: `assemble_task_context()` auto-orchestrates the right tools
  6. Check blast radius: Before refactoring, verify impact with `get_blast_radius()`
  7. Exclude build artifacts: Add node_modules, dist, venv to `excluded_patterns`
  8. Use semantic search: For natural language queries, enable embeddings

Commercial Use

  • Free: Non-commercial use
  • Builder ($79): 1 developer
  • Studio ($349): Up to 5 developers
  • Platform ($1,999): Organization-wide