mcp-evaluator

MCP Server Security Evaluator

Overview

Automatically evaluate the security, privacy, and reliability of MCP (Model Context Protocol) servers from GitHub repositories. This skill performs comprehensive assessments including code analysis, community feedback research, security vulnerability detection, and risk scoring to provide actionable recommendations.

When to Use This Skill

Use this skill when users:

Provide a GitHub URL to an MCP server repository
Ask "is this MCP server safe?"
Request security assessment of an MCP server
Want to evaluate privacy risks before installing an MCP server
Need to compare MCP servers with similar functionality
Ask about community feedback or reviews of an MCP server

Tool Strategy

This skill works with or without MCP servers through a graceful degradation approach:

For GitHub repositories:

Priority: GitHub MCP (if available) for direct repository API access
Alternatives: Bright Data MCP (The Web MCP) or built-in web tools for scraping
Optional: Sequential Thinking MCP for systematic analysis (recommended but not required)
Fallback: Claude's built-in web search when no MCP servers are available

For web search and community validation:

Priority: Bright Data MCP or Brave Search MCP for web search and content fetching
Fallback: Claude's built-in web search
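
The priority orders above can be sketched as a small selection routine. This is a minimal illustration, not part of the skill itself: the tool labels (`github_mcp`, `bright_data_mcp`, and so on) are hypothetical names chosen for the example, and treating any use of the built-in fallback as a limitation is an assumption.

```python
from dataclasses import dataclass

# Hypothetical labels for the tools named in this section; real tool
# identifiers depend on how the client is configured.
REPO_PRIORITY = ["github_mcp", "bright_data_mcp"]
SEARCH_PRIORITY = ["bright_data_mcp", "brave_search_mcp"]
FALLBACK = "builtin_web_search"


@dataclass
class ToolPlan:
    repository: str                # tool used for repository access
    web_search: str                # tool used for search and community validation
    use_sequential_thinking: bool  # optional, recommended but not required
    limited: bool                  # True when a built-in fallback is in use


def first_available(priority: list[str], available: set[str]) -> str:
    """Return the first tool in priority order that is available, else the fallback."""
    return next((tool for tool in priority if tool in available), FALLBACK)


def plan_tools(available: set[str]) -> ToolPlan:
    """Pick tools by priority and flag when limitations should be noted."""
    repository = first_available(REPO_PRIORITY, available)
    web_search = first_available(SEARCH_PRIORITY, available)
    return ToolPlan(
        repository=repository,
        web_search=web_search,
        use_sequential_thinking="sequential_thinking_mcp" in available,
        limited=FALLBACK in (repository, web_search),
    )
```

When `limited` is true, the assessment should record which fallback was used, in line with the "Evaluation Limitations" section of the report.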

Evaluation Workflow

Step 1: Initial Setup

Ask the user for their preferred output format:

Markdown (.md) - default
PDF (.pdf) - requires conversion after markdown creation

Acknowledge receipt and inform the user that the evaluation is beginning. Parse the GitHub URL to extract the owner and repository name.
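
As a rough sketch of the URL parsing step, the function below extracts the owner and repository name with the standard library. The function name `parse_github_url` is illustrative, and rejecting URLs without a scheme is a design assumption rather than a stated requirement.

```python
from urllib.parse import urlparse


def parse_github_url(url: str) -> tuple[str, str]:
    """Return (owner, repo_name) from a GitHub repository URL."""
    parsed = urlparse(url.strip())
    if parsed.netloc.lower() not in {"github.com", "www.github.com"}:
        raise ValueError(f"Not a GitHub URL: {url!r}")
    # Drop empty segments produced by leading or trailing slashes.
    parts = [p for p in parsed.path.split("/") if p]
    if len(parts) < 2:
        raise ValueError(f"URL has no owner/repository path: {url!r}")
    owner, repo = parts[0], parts[1]
    # Clone URLs often end in ".git"; strip it so the name matches the repository.
    if repo.endswith(".git"):
        repo = repo[: -len(".git")]
    return owner, repo
```

Deeper links such as `/tree/main/src` resolve to the same owner and repository because only the first two path segments are used.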

Step 2: Tool Assessment

Check which tools are available and plan the evaluation approach:

If GitHub MCP available: use for repository access (preferred for GitHub repos)
If Bright Data MCP available: use for web scraping and searching (or as GitHub alternative)
If neither available: use Claude's built-in capabilities with noted limitations

Step 3: Create Assessment File

Use the built-in `create_file` tool to create the assessment file in `/mnt/user-data/outputs/`

File naming: `MCP_Security_Assessment_{owner}_{repo_name}.md`

Update iteratively throughout the evaluation process
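
The naming convention can be illustrated with a short helper. This is a sketch only: the helper name `assessment_path` is hypothetical, and replacing characters outside letters, digits, `.`, `_`, and `-` is an added safety assumption that the skill does not specify.

```python
import re
from pathlib import PurePosixPath

OUTPUT_DIR = PurePosixPath("/mnt/user-data/outputs")


def assessment_path(owner: str, repo_name: str) -> PurePosixPath:
    """Build the assessment file path from the owner and repository name."""
    # Assumption: replace anything unexpected so the result is a single
    # safe filename; GitHub names normally need no changes.
    safe_owner = re.sub(r"[^A-Za-z0-9._-]", "_", owner)
    safe_repo = re.sub(r"[^A-Za-z0-9._-]", "_", repo_name)
    return OUTPUT_DIR / f"MCP_Security_Assessment_{safe_owner}_{safe_repo}.md"
```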

Step 4: Repository Content Access

With GitHub MCP (Priority):

Use GitHub MCP tools to directly access:
- Repository metadata and statistics
- File contents: README.md, package.json, LICENSE, source files
- Commit history via `list_commits` for activity analysis
- Repository tree/structure

Use `search_repositories` to find similar MCP servers

With Bright Data MCP (Alternative):

Use `scrape_as_markdown` to retrieve:

- Repository main page: `https://github.com/{owner}/{repo}`
- README: `https://github.com/{owner}/{repo}/blob/main/README.md`
- Raw files: `https://raw.githubusercontent.com/{owner}/{repo}/main/{filepath}`
- Key code files: package.json, index.js, src files, etc.
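
The URL patterns above could be generated as follows. This is a minimal sketch: the function name `github_urls` is illustrative, and the `branch` parameter is an addition for the example, since the patterns assume `main` while a repository's default branch may be named differently.

```python
from urllib.parse import quote


def github_urls(owner: str, repo: str, filepaths: list[str], branch: str = "main") -> dict:
    """Build the page, README, and raw-file URLs for a repository."""
    base = f"https://github.com/{owner}/{repo}"
    raw_base = f"https://raw.githubusercontent.com/{owner}/{repo}/{branch}"
    return {
        "main_page": base,
        "readme": f"{base}/blob/{branch}/README.md",
        # quote() leaves "/" intact, so nested paths stay valid while
        # spaces and other special characters are percent-encoded.
        "raw_files": [f"{raw_base}/{quote(path)}" for path in filepaths],
    }
```

If a raw URL returns a 404, checking the repository main page for the actual default branch is a reasonable first step before treating the file as missing.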

Fallback Without MCP:

Use Claude's built-in web tools for available information
Note limitations in assessment
Ask the user to provide critical files if needed

Document each file examined with code snippets of important sections.

Step 5: Sequential Evaluation

Execute evaluation in this order, updating assessment file after each step:

5.1 Repository Setup & Metadata

Extract repository statistics (stars, forks, contributors, activity)
Analyze commit history and frequency
Review contributor diversity and patterns
Check for security policies and contribution guidelines
Document findings in "GitHub Repository Assessment" section
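
One way to quantify commit frequency is to bucket commit timestamps by month. The sketch below assumes the timestamps are available as ISO 8601 strings (for example, as collected from commit history); the function name `commits_per_month` is illustrative.

```python
from collections import Counter
from datetime import datetime


def commits_per_month(timestamps: list[str]) -> dict[str, int]:
    """Count commits per calendar month from ISO 8601 timestamp strings."""
    counts: Counter = Counter()
    for ts in timestamps:
        # Older Python versions do not accept a trailing "Z", so normalize it.
        dt = datetime.fromisoformat(ts.replace("Z", "+00:00"))
        counts[dt.strftime("%Y-%m")] += 1
    # Sort by month so gaps in activity are easy to spot.
    return dict(sorted(counts.items()))
```

Long stretches with no commits, or a single burst followed by silence, are the kind of pattern worth noting in the "GitHub Repository Assessment" section.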

5.2 Purpose & Functionality Analysis

Review README and documentation thoroughly
Identify stated purpose and capabilities
List external services/APIs the server connects to
Note required permissions and access levels
Identify creator/maintainer background
Create "Server Purpose" and "Expected Functionality" sections

5.3 Alternatives Analysis

Search for alternative MCP servers with similar functionality:

Use web search: "{functionality} MCP server"
Check MCP directories: Smithery, Glama, PulseMCP, MCP.so
Review repository forks for improved versions
Document at least 2-3 alternatives with comparisons
Create "Alternative MCP Servers" section

5.4 Code Review

Analyze codebase for:

Authentication mechanisms and credential handling
Data collection, storage, and transmission practices
Security practices (input validation, encryption, sanitization)
Suspicious or unexpected behaviors
Code quality and error handling

Reference the security patterns documentation: Review `references/mcp_security_patterns.md` to identify known vulnerability patterns, and `references/safe_mcp_examples.md` to avoid false positives from legitimate patterns.

Be specific: Include actual code snippets as evidence. Categorize findings by severity (Critical, High, Medium, Low). Focus on concrete vulnerabilities, not generic statements.

Document in "Code Analysis" section.
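
Findings with severity and evidence could be recorded in a structure like the one below. This is a sketch under assumptions: the `Finding` fields and the `render_findings` output format are illustrative choices, not a format the skill prescribes.

```python
from dataclasses import dataclass
from enum import IntEnum


class Severity(IntEnum):
    # Lower value sorts first, so Critical findings lead the report.
    CRITICAL = 0
    HIGH = 1
    MEDIUM = 2
    LOW = 3


@dataclass
class Finding:
    severity: Severity
    file: str
    line: int
    description: str
    snippet: str  # the actual code quoted as evidence


def render_findings(findings: list[Finding]) -> str:
    """Render findings ordered by severity, each followed by its code evidence."""
    lines = []
    for f in sorted(findings, key=lambda f: (f.severity, f.file, f.line)):
        lines.append(f"[{f.severity.name.title()}] {f.file}:{f.line} {f.description}")
        lines.extend(f"    {code_line}" for code_line in f.snippet.splitlines())
    return "\n".join(lines)
```

Requiring a file, line, and snippet for every finding keeps the review concrete and makes generic statements hard to write.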

5.5 Community Validation

Perform specific web searches using Bright Data MCP or web search:

Reddit: "{owner} {repo_name} MCP"
Twitter/X: "{owner} {repo_name} MCP"
MCP Directories: "smithery.ai {repo_name}", "glama.ai {repo_name}", "pulsemcp {repo_name}", "mcp.so {repo_name}"
Security forums: "{owner} {repo_name} security vulnerability"
Developer forums: implementation examples and feedback

For each search:

Document exact query used
Summarize relevant results with links
Note security concerns raised by community

Document all findings in "Community Feedback" section with clear source attribution.
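
The query templates above can be expanded programmatically so that the exact query strings are available for documentation. A minimal sketch follows; the function name `community_queries` is illustrative, and developer forums are omitted because the list gives no query template for them.

```python
def community_queries(owner: str, repo_name: str) -> dict[str, list[str]]:
    """Expand the community validation query templates for one repository."""
    return {
        "Reddit": [f"{owner} {repo_name} MCP"],
        "Twitter/X": [f"{owner} {repo_name} MCP"],
        "MCP Directories": [
            f"smithery.ai {repo_name}",
            f"glama.ai {repo_name}",
            f"pulsemcp {repo_name}",
            f"mcp.so {repo_name}",
        ],
        "Security forums": [f"{owner} {repo_name} security vulnerability"],
    }
```

Iterating over the returned mapping and logging each string before running the search satisfies the requirement to document the exact query used.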

5.6 Risk Assessment

Analyze all collected information and evaluate across dimensions:

| Dimension | Evaluation Criteria |
| --- | --- |
| Security | Protection against attacks, credential handling, code vulnerabilities |
| Privacy | Data collection practices, data minimization, transmission security |
| Reliability | Code quality, maintenance activity, error handling |
| Transparency | Documentation quality, purpose clarity, open source practices |
| Usability | Setup complexity, integration quality, user experience |

For each dimension:

Provide concrete examples supporting the score
List specific strengths and weaknesses
Assign score (0-100) with clear justification

Scoring Guidelines:

0-49: Critical security flaws or dangerous functionality
50-69: Significant security concerns but not immediately dangerous
70-84: Reasonably secure with minor concerns
85-100: Very secure with robust practices

Create "Risk Assessment" section with scoring table and "Final Verdict" with definitive recommendation.
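
The scoring guidelines map directly onto a lookup by lower bound. The sketch below assumes integer scores; the names `BANDS` and `score_band` are illustrative.

```python
# (lower bound, description) pairs taken from the scoring guidelines,
# ordered from the highest band to the lowest.
BANDS = [
    (85, "Very secure with robust practices"),
    (70, "Reasonably secure with minor concerns"),
    (50, "Significant security concerns but not immediately dangerous"),
    (0, "Critical security flaws or dangerous functionality"),
]


def score_band(score: int) -> str:
    """Return the guideline description for a 0-100 score."""
    if not 0 <= score <= 100:
        raise ValueError(f"Score must be between 0 and 100, got {score}")
    for lower_bound, description in BANDS:
        if score >= lower_bound:
            return description
    raise AssertionError("unreachable: the 0 band matches every valid score")
```

Checking a proposed score against its band description is a quick consistency test: a dimension scored 72 should not have a justification that describes critical flaws.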

5.7 Usability Assessment

Evaluate practical aspects:

Installation complexity and requirements
Documentation quality for setup/usage
Configuration options and flexibility
Potential performance issues
Integration smoothness with Claude
Edge cases and limitations

Document in "Usability Assessment" section with specific examples.

Step 6: Make Confident Judgments

Provide definitive recommendations. Avoid hedging. Be clear about:

Whether users should use this MCP server
Specific use cases where it is appropriate or inappropriate
Critical risks that must be addressed
Alternatives that may be better

Step 7: Completion

Provide summary of key findings
Link to assessment file in `/mnt/user-data/outputs/`
If PDF requested, convert markdown to PDF

Assessment Document Structure

Create assessment with this exact structure:

Security Assessment: [MCP Server Name]

Evaluation Overview

Repository URL: [GitHub URL]
Evaluation Date: [Current Date]
Evaluator: Claude AI
Repository Owner: [Username/Organization]
Evaluation Methods: [Tools used]
Tool Availability: [Which MCP servers were available]
Executive Summary: [1-2 paragraphs on safety and key risks/benefits]

GitHub Repository Assessment

[Repository stats, contributor analysis, activity patterns]

Server Purpose

[Functionality description, external services, permissions, creator info]

Expected Functionality

[Detailed explanation of capabilities, APIs, typical usage, limitations, examples]

Alternative MCP Servers

[List of alternatives with comparisons]

Code Analysis

[Security review findings categorized by severity with code snippets]

Community Feedback

[External references, user reviews, discussions with source attribution]

Risk Assessment

[Comprehensive evaluation across all dimensions]

Usability Assessment

[Practical evaluation of setup, documentation, integration]

Scoring

| Dimension | Score (0-100) | Justification |
| --- | --- | --- |
| Security | [Score] | [Specific evidence] |
| Privacy | [Score] | [Specific evidence] |
| Reliability | [Score] | [Specific evidence] |
| Transparency | [Score] | [Specific evidence] |
| Usability | [Score] | [Specific evidence] |
| OVERALL RATING | [Score] | [Summary] |

Final Verdict

[Clear statement on whether to use this MCP server, with specific use cases]

Evaluation Limitations

评估局限性

[If applicable, note any limitations due to unavailable tools]

Error Handling

If issues occur during evaluation:

Document specific error in assessment file
Note which tool/function failed and error message
List fallback methods used
Mark sections with limited information
Include an "Evaluation Limitations" section if significant errors occurred
Continue with remaining steps using alternatives
Provide recommendations based on available information
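
The error-handling steps above could be tracked with a simple record per failure. This is a sketch, not a prescribed format: the `EvaluationError` fields and the rendering are illustrative, and deciding which errors count as significant is left to the evaluator.

```python
from dataclasses import dataclass


@dataclass
class EvaluationError:
    tool: str              # which tool or function failed
    message: str           # the error message it returned
    fallback_used: str     # the alternative method used instead
    affected_section: str  # report section left with limited information


def limitations_section(errors: list[EvaluationError]) -> str:
    """Render an "Evaluation Limitations" section, or an empty string if no errors."""
    if not errors:
        return ""
    lines = ["Evaluation Limitations", ""]
    for e in errors:
        lines.append(
            f"- {e.affected_section}: {e.tool} failed ({e.message}); "
            f"fallback used: {e.fallback_used}"
        )
    return "\n".join(lines)
```

Appending to the error list at the point of failure and rendering once at the end keeps the evaluation moving while preserving a full record.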

Ongoing Communication

Keep user informed at key milestones:

When repository files have been accessed successfully
When GitHub metadata analysis is complete
When code review is complete
When community validation searches are complete
When using fallback methods due to tool unavailability

Show exactly which tools/functions are being called and their results. If the evaluation requires extended time, provide interim updates.

Key Principles

Be Specific, Not Generic:

❌ "This has moderate security concerns"
✅ "Line 47 stores API keys in plain text without encryption (Critical severity)"

Make Confident Judgments:

❌ "This might be relatively safe depending on your use case"
✅ "This MCP server is safe for personal use but should not be used in production environments handling sensitive data due to weak authentication implementation"

Include Evidence: Always back up scores and recommendations with specific code examples, community feedback quotes, or measurable metrics.

Adapt to Available Tools: Use the best tools available but continue evaluation even without ideal tools. Document what methods were used and any resulting limitations.

References

This skill includes reference documentation in the `references/` directory:

- `mcp_security_patterns.md`: Comprehensive catalog of security vulnerabilities and attack patterns specific to MCP servers
- `safe_mcp_examples.md`: Examples of legitimate MCP patterns that might look suspicious but are safe

Read these references as needed during code analysis to improve detection accuracy and reduce false positives.
