mcp-evaluator
MCP Server Security Evaluator
Overview
Automatically evaluate the security, privacy, and reliability of MCP (Model Context Protocol) servers from GitHub repositories. This skill performs comprehensive assessments including code analysis, community feedback research, security vulnerability detection, and risk scoring to provide actionable recommendations.
When to Use This Skill
Use this skill when users:
- Provide a GitHub URL to an MCP server repository
- Ask "is this MCP server safe?"
- Request security assessment of an MCP server
- Want to evaluate privacy risks before installing an MCP server
- Need to compare MCP servers with similar functionality
- Ask about community feedback or reviews of an MCP server
Tool Strategy
This skill works with or without MCP servers through a graceful degradation approach:
For GitHub repositories:
- Priority: GitHub MCP (if available) for direct repository API access
- Alternatives: Bright Data MCP (The Web MCP) or built-in web tools for scraping
- Optional: Sequential Thinking MCP for systematic analysis (recommended but not required)
- Fallback: Claude's built-in web search when no MCP servers are available
For web search and community validation:
- Priority: Bright Data MCP or Brave Search MCP for web search and content fetching
- Fallback: Claude's built-in web search
Evaluation Workflow
评估流程
Step 1: Initial Setup
Ask the user for their preferred output format:
- Markdown (.md) - default
- PDF (.pdf) - requires conversion after markdown creation
Acknowledge receipt and inform user that evaluation is beginning. Parse the GitHub URL to extract owner and repository name.
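The URL parsing in this step could look like the sketch below. It is a minimal illustration, not part of the skill itself: the function name `parse_github_url` is hypothetical, and it assumes the user supplies a full URL including the `https://` scheme.

```python
from urllib.parse import urlparse


def parse_github_url(url: str) -> tuple[str, str]:
    """Return (owner, repo) from a GitHub repository URL (hypothetical helper)."""
    parsed = urlparse(url.strip())
    if parsed.netloc.lower() not in {"github.com", "www.github.com"}:
        raise ValueError(f"not a GitHub URL: {url!r}")
    # Drop empty segments so trailing slashes do not matter.
    parts = [p for p in parsed.path.split("/") if p]
    if len(parts) < 2:
        raise ValueError(f"URL does not name a repository: {url!r}")
    owner, repo = parts[0], parts[1]
    # Clone URLs often end in ".git"; strip it so the name matches the repo.
    if repo.endswith(".git"):
        repo = repo[: -len(".git")]
    return owner, repo
```

Extra path segments such as `/tree/main` are ignored, so a link to a subdirectory still resolves to the owner and repository name.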
Step 2: Tool Assessment
Check which tools are available and plan the evaluation approach:
- If GitHub MCP available: use for repository access (preferred for GitHub repos)
- If Bright Data MCP available: use for web scraping and searching (or as GitHub alternative)
- If neither available: use Claude's built-in capabilities with noted limitations
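The selection logic above can be sketched as a small planning function. This is only an illustration under stated assumptions: the labels `"github"`, `"brightdata"`, `"brave"`, and `"builtin"`, the `ToolPlan` class, and `plan_tools` are hypothetical names, and the Brave Search option is taken from the Tool Strategy section rather than from this step.

```python
from dataclasses import dataclass, field


@dataclass
class ToolPlan:
    """Which tool handles each job, plus limitations to note in the report."""
    repository: str
    web_search: str
    limitations: list[str] = field(default_factory=list)


def plan_tools(available: set[str]) -> ToolPlan:
    """Pick tools in priority order, degrading to built-in capabilities."""
    limitations: list[str] = []

    # Repository access: GitHub MCP first, Bright Data as the alternative.
    if "github" in available:
        repository = "github"
    elif "brightdata" in available:
        repository = "brightdata"
    else:
        repository = "builtin"
        limitations.append(
            "No GitHub or Bright Data MCP: repository access limited to built-in web tools"
        )

    # Web search: Bright Data or Brave Search, otherwise built-in search.
    if "brightdata" in available:
        web_search = "brightdata"
    elif "brave" in available:
        web_search = "brave"
    else:
        web_search = "builtin"

    return ToolPlan(repository, web_search, limitations)
```

Returning the limitations alongside the plan keeps the "noted limitations" requirement attached to the decision that caused it.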
Step 3: Create Assessment File
Use the built-in `create_file` tool to create the assessment file in `/mnt/user-data/outputs/`:
- File naming: `MCP_Security_Assessment_{owner}_{repo_name}.md`
- Update iteratively throughout the evaluation process
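As a small illustration of the naming convention, the path can be derived from the owner and repository name. The helper name `assessment_path` is hypothetical; the directory and file pattern are the ones given in this step.

```python
from pathlib import PurePosixPath

# Output directory named in this step.
OUTPUT_DIR = PurePosixPath("/mnt/user-data/outputs")


def assessment_path(owner: str, repo_name: str) -> PurePosixPath:
    """Build the assessment file path for a repository (hypothetical helper)."""
    return OUTPUT_DIR / f"MCP_Security_Assessment_{owner}_{repo_name}.md"
```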
Step 4: Repository Content Access
With GitHub MCP (Priority):
- Use GitHub MCP tools to directly access:
  - Repository metadata and statistics
  - File contents: README.md, package.json, LICENSE, source files
  - Commit history via `list_commits` for activity analysis
  - Repository tree/structure
- Use `search_repositories` to find similar MCP servers
With Bright Data MCP (Alternative):
- Use `scrape_as_markdown` to retrieve:
  - Repository main page: `https://github.com/{owner}/{repo}`
  - README: `https://github.com/{owner}/{repo}/blob/main/README.md`
  - Raw files: `https://raw.githubusercontent.com/{owner}/{repo}/main/{filepath}`
  - Key code files: package.json, index.js, src files, etc.
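The URL templates above could be expanded as in the following sketch. The function name `scrape_targets` and the `branch` parameter are assumptions added for illustration; the templates hard-code `main`, so a repository whose default branch has another name would need a different value.

```python
def scrape_targets(
    owner: str, repo: str, filepaths: list[str], branch: str = "main"
) -> list[str]:
    """List the URLs to fetch for one repository (hypothetical helper)."""
    base = f"https://github.com/{owner}/{repo}"
    raw = f"https://raw.githubusercontent.com/{owner}/{repo}/{branch}"
    # Main page and rendered README first, then raw copies of key files.
    urls = [base, f"{base}/blob/{branch}/README.md"]
    urls += [f"{raw}/{path.lstrip('/')}" for path in filepaths]
    return urls
```

For example, `scrape_targets("example-owner", "example-repo", ["package.json"])` yields the main page, the README page, and the raw `package.json` URL, in that order.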
Fallback Without MCP:
- Use Claude's built-in web tools for available information
- Note limitations in assessment
- Request user provide critical files if needed
Document each file examined with code snippets of important sections.
Step 5: Sequential Evaluation
Execute evaluation in this order, updating assessment file after each step:
5.1 Repository Setup & Metadata
- Extract repository statistics (stars, forks, contributors, activity)
- Analyze commit history and frequency
- Review contributor diversity and patterns
- Check for security policies and contribution guidelines
- Document findings in "GitHub Repository Assessment" section
5.2 Purpose & Functionality Analysis
- Review README and documentation thoroughly
- Identify stated purpose and capabilities
- List external services/APIs the server connects to
- Note required permissions and access levels
- Identify creator/maintainer background
- Create "Server Purpose" and "Expected Functionality" sections
5.3 Alternatives Analysis
Search for alternative MCP servers with similar functionality:
- Use web search: "{functionality} MCP server"
- Check MCP directories: Smithery, Glama, PulseMCP, MCP.so
- Review repository forks for improved versions
- Document 2-3 alternatives minimum with comparisons
- Create "Alternative MCP Servers" section
5.4 Code Review
Analyze codebase for:
- Authentication mechanisms and credential handling
- Data collection, storage, and transmission practices
- Security practices (input validation, encryption, sanitization)
- Suspicious or unexpected behaviors
- Code quality and error handling
Reference the security patterns documentation: Review `references/mcp_security_patterns.md` to identify known vulnerability patterns, and `references/safe_mcp_examples.md` to avoid false positives from legitimate patterns.
Be specific: Include actual code snippets as evidence. Categorize findings by severity (Critical, High, Medium, Low). Focus on concrete vulnerabilities, not generic statements.
Document in "Code Analysis" section.
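One way to keep findings specific and ordered by severity is to record each one as a small structured object. The sketch below is illustrative only: the `Finding` fields and the `group_by_severity` helper are hypothetical, while the four severity levels are the ones named above.

```python
from dataclasses import dataclass

# Severity levels from this step, most severe first.
SEVERITY_ORDER = ("Critical", "High", "Medium", "Low")


@dataclass(frozen=True)
class Finding:
    """One concrete code-review finding with its evidence."""
    severity: str
    file: str
    line: int
    description: str
    snippet: str


def group_by_severity(findings: list[Finding]) -> dict[str, list[Finding]]:
    """Bucket findings by severity, keeping Critical-to-Low key order."""
    grouped: dict[str, list[Finding]] = {level: [] for level in SEVERITY_ORDER}
    for finding in findings:
        if finding.severity not in grouped:
            raise ValueError(f"unknown severity: {finding.severity!r}")
        grouped[finding.severity].append(finding)
    return grouped
```

Requiring a file, line, and snippet for every finding makes it hard to record a generic statement without evidence.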
5.5 Community Validation
Perform specific web searches using Bright Data MCP or web search:
- Reddit: "{owner} {repo_name} MCP"
- Twitter/X: "{owner} {repo_name} MCP"
- MCP Directories: "smithery.ai {repo_name}", "glama.ai {repo_name}", "pulsemcp {repo_name}", "mcp.so {repo_name}"
- Security forums: "{owner} {repo_name} security vulnerability"
- Developer forums: implementation examples and feedback
For each search:
- Document exact query used
- Summarize relevant results with links
- Note security concerns raised by community
Document all findings in "Community Feedback" section with clear source attribution.
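The query templates listed in this step can be generated mechanically, which also makes it easy to document the exact query used. This is a minimal sketch; `community_queries` is a hypothetical name, and developer forums are omitted because the step gives no query template for them.

```python
def community_queries(owner: str, repo_name: str) -> dict[str, list[str]]:
    """Build the search queries for each source category (hypothetical helper)."""
    return {
        "Reddit": [f"{owner} {repo_name} MCP"],
        "Twitter/X": [f"{owner} {repo_name} MCP"],
        "MCP Directories": [
            f"smithery.ai {repo_name}",
            f"glama.ai {repo_name}",
            f"pulsemcp {repo_name}",
            f"mcp.so {repo_name}",
        ],
        "Security forums": [f"{owner} {repo_name} security vulnerability"],
    }
```

Each returned string can be copied verbatim into the "Community Feedback" section next to the results it produced.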
5.6 Risk Assessment
Analyze all collected information and evaluate across dimensions:
| Dimension | Evaluation Criteria |
|---|---|
| Security | Protection against attacks, credential handling, code vulnerabilities |
| Privacy | Data collection practices, data minimization, transmission security |
| Reliability | Code quality, maintenance activity, error handling |
| Transparency | Documentation quality, purpose clarity, open source practices |
| Usability | Setup complexity, integration quality, user experience |
For each dimension:
- Provide concrete examples supporting the score
- List specific strengths and weaknesses
- Assign score (0-100) with clear justification
Scoring Guidelines:
- 0-49: Critical security flaws or dangerous functionality
- 50-69: Significant security concerns but not immediately dangerous
- 70-84: Reasonably secure with minor concerns
- 85-100: Very secure with robust practices
Create "Risk Assessment" section with scoring table and "Final Verdict" with definitive recommendation.
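The scoring guidelines can be expressed as a simple band lookup, sketched below. The names `SCORE_BANDS` and `score_band` are hypothetical; the thresholds and descriptions are the ones listed in the guidelines. How the overall rating is derived from the five dimension scores is not specified here, so the sketch does not compute it.

```python
# (lower bound, description) pairs, highest band first.
SCORE_BANDS = (
    (85, "Very secure with robust practices"),
    (70, "Reasonably secure with minor concerns"),
    (50, "Significant security concerns but not immediately dangerous"),
    (0, "Critical security flaws or dangerous functionality"),
)


def score_band(score: int) -> str:
    """Map a 0-100 score to its guideline description."""
    if not 0 <= score <= 100:
        raise ValueError(f"score must be between 0 and 100, got {score}")
    for floor, label in SCORE_BANDS:
        if score >= floor:
            return label
    raise AssertionError("unreachable: the last band starts at 0")
```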
5.7 Usability Assessment
Evaluate practical aspects:
- Installation complexity and requirements
- Documentation quality for setup/usage
- Configuration options and flexibility
- Potential performance issues
- Integration smoothness with Claude
- Edge cases and limitations
Document in "Usability Assessment" section with specific examples.
Step 6: Make Confident Judgments
Provide definitive recommendations. Avoid hedging. Be clear about:
- Whether users should use this MCP server
- Specific use cases where appropriate/inappropriate
- Critical risks that must be addressed
- Alternatives that may be better
Step 7: Completion
- Provide summary of key findings
- Link to the assessment file in `/mnt/user-data/outputs/`
- If PDF requested, convert markdown to PDF
Assessment Document Structure
Create assessment with this exact structure:
Security Assessment: [MCP Server Name]
Evaluation Overview
- Repository URL: [GitHub URL]
- Evaluation Date: [Current Date]
- Evaluator: Claude AI
- Repository Owner: [Username/Organization]
- Evaluation Methods: [Tools used]
- Tool Availability: [Which MCP servers were available]
- Executive Summary: [1-2 paragraphs on safety and key risks/benefits]
GitHub Repository Assessment
[Repository stats, contributor analysis, activity patterns]
Server Purpose
[Functionality description, external services, permissions, creator info]
Expected Functionality
[Detailed explanation of capabilities, APIs, typical usage, limitations, examples]
Alternative MCP Servers
[List of alternatives with comparisons]
Code Analysis
[Security review findings categorized by severity with code snippets]
Community Feedback
[External references, user reviews, discussions with source attribution]
Risk Assessment
[Comprehensive evaluation across all dimensions]
Usability Assessment
[Practical evaluation of setup, documentation, integration]
Scoring
| Dimension | Score (0-100) | Justification |
|---|---|---|
| Security | [Score] | [Specific evidence] |
| Privacy | [Score] | [Specific evidence] |
| Reliability | [Score] | [Specific evidence] |
| Transparency | [Score] | [Specific evidence] |
| Usability | [Score] | [Specific evidence] |
| OVERALL RATING | [Score] | [Summary] |
Final Verdict
[Clear statement on whether to use this MCP server, with specific use cases]
Evaluation Limitations
[If applicable, note any limitations due to unavailable tools]
Error Handling
If issues occur during evaluation:
- Document specific error in assessment file
- Note which tool/function failed and error message
- List fallback methods used
- Mark sections with limited information
- Include "Evaluation Limitations" section if significant errors
- Continue with remaining steps using alternatives
- Provide recommendations based on available information
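The error-handling steps above amount to trying each tool in priority order and recording every failure. The sketch below illustrates that idea under stated assumptions: `run_with_fallbacks` and its parameters are hypothetical, and each attempt is modeled as a plain callable rather than a real MCP tool call.

```python
from typing import Callable, Optional


def run_with_fallbacks(
    step: str,
    attempts: list[tuple[str, Callable[[], str]]],
    limitations: list[str],
) -> Optional[str]:
    """Try each (tool name, callable) in order; log failures and continue."""
    for tool_name, fetch in attempts:
        try:
            return fetch()
        except Exception as exc:
            # Record which tool failed and its error message.
            limitations.append(f"{step}: {tool_name} failed ({exc})")
    # Nothing worked: mark the section as having limited information.
    limitations.append(f"{step}: no tool succeeded; section has limited information")
    return None
```

The accumulated `limitations` list can then be written into the "Evaluation Limitations" section, and a `None` result signals that the evaluation should continue with whatever information is available.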
Ongoing Communication
Keep user informed at key milestones:
- When repository files successfully accessed
- When GitHub metadata analysis complete
- When code review complete
- When community validation searches complete
- When using fallback methods due to tool unavailability
Show exactly which tools/functions are being called and their results. If evaluation requires extended time, provide interim updates.
Key Principles
Be Specific, Not Generic:
- ❌ "This has moderate security concerns"
- ✅ "Line 47 stores API keys in plain text without encryption (Critical severity)"
Make Confident Judgments:
- ❌ "This might be relatively safe depending on your use case"
- ✅ "This MCP server is safe for personal use but should not be used in production environments handling sensitive data due to weak authentication implementation"
Include Evidence:
Always back up scores and recommendations with specific code examples, community feedback quotes, or measurable metrics.
Adapt to Available Tools:
Use the best tools available but continue evaluation even without ideal tools. Document what methods were used and any resulting limitations.
References
This skill includes reference documentation in the `references/` directory:
- `mcp_security_patterns.md` - Comprehensive catalog of security vulnerabilities and attack patterns specific to MCP servers
- `safe_mcp_examples.md` - Examples of legitimate MCP patterns that might look suspicious but are safe
Read these references as needed during code analysis to improve detection accuracy and reduce false positives.