Create high-quality ToolUniverse skills following test-driven, implementation-agnostic methodology. Integrates tools from ToolUniverse's 1,264+ tool library, creates missing tools when needed using devtu-create-tool, tests thoroughly, and produces skills with Python SDK + MCP support. Use when asked to create new ToolUniverse skills, build research workflows, or develop domain-specific analysis capabilities for biology, chemistry, or medicine.
Install via:

```shell
npx skill4agent add mims-harvard/tooluniverse create-tooluniverse-skill
```

Related skill: `devtu-optimize-skills`.

## Workflow Phases

Phase 1: Domain Analysis →
Phase 2: Tool Discovery & Testing →
Phase 3: Tool Creation (if needed) →
Phase 4: Implementation →
Phase 5: Documentation →
Phase 6: Testing & Validation →
Phase 7: Packaging

Example skills: `skills/tooluniverse-systems-biology`, `tooluniverse-target-research`, `tooluniverse-drug-research`.

Tool definitions live in `src/tooluniverse/data/*.json`; discover them with grep:

```shell
grep -r "keyword" src/tooluniverse/data/*.json

# Find metabolomics tools
grep -l "metabol" src/tooluniverse/data/*.json

# Find cancer-related tools
grep -l "cancer\|tumor\|oncology" src/tooluniverse/data/*.json
```
```shell
# Find single-cell tools
grep -l "single.cell\|scRNA" src/tooluniverse/data/*.json
```

**File**: `scripts/test_tools_template.py`

```python
#!/usr/bin/env python3
"""
Test script for [Domain] tools
Following TDD: test ALL tools BEFORE creating skill documentation
"""
from tooluniverse import ToolUniverse
import json


def test_database_1():
    """Test Database 1 tools"""
    tu = ToolUniverse()
    tu.load_tools()

    # Test tool 1
    result = tu.tools.TOOL_NAME(param="value")
    print(f"Status: {result.get('status')}")
    # Verify response format
    # Check data structure


def test_database_2():
    """Test Database 2 tools"""
    # Similar structure


def main():
    """Run all tests"""
    tests = [
        ("Database 1", test_database_1),
        ("Database 2", test_database_2),
    ]
    results = {}
    for name, test_func in tests:
        try:
            test_func()
            results[name] = "✅ PASS"
        except Exception as e:
            results[name] = f"❌ FAIL: {e}"

    # Print summary
    for name, result in results.items():
        print(f"{name}: {result}")


if __name__ == "__main__":
    main()
```

Run with:

```shell
python test_[domain]_tools.py
```

Record corrected parameters in a findings table:

| Tool | Common Mistake | Correct Parameter | Evidence |
|---|---|---|---|
| TOOL_NAME | assumed_param | actual_param | Test result |
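Parameter mismatches like the one tabulated above can also be probed programmatically. A minimal sketch, using a stub in place of a real ToolUniverse tool (`probe_param_names`, `stub_tool`, and the candidate names are all hypothetical, for illustration only):

```python
def probe_param_names(tool, candidates, value):
    """Try each candidate keyword name and return those the tool accepts."""
    accepted = []
    for name in candidates:
        try:
            tool(**{name: value})
            accepted.append(name)
        except TypeError:
            # Wrong keyword name raises TypeError; move to the next candidate
            continue
    return accepted


# Stub standing in for a tool whose real parameter is 'gene_symbol'
def stub_tool(gene_symbol):
    return {"status": "success", "data": [gene_symbol]}


print(probe_param_names(stub_tool, ["gene", "symbol", "gene_symbol"], "TP53"))
# ['gene_symbol']
```

Against a live tool this costs one call per candidate, so it suits one-off testing rather than production code.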
To request a new tool from `devtu-create-tool`:

"I need to create a tool for [database/API]. The API endpoint is [URL], it takes parameters [list], and returns [format]. Can you help create this tool?"

Then test the newly created tool:

```python
def test_new_tool():
    """Test newly created tool"""
    tu = ToolUniverse()
    tu.load_tools()
    result = tu.tools.NEW_TOOL_NAME(param="test_value")
    assert result.get('status') == 'success', "New tool failed"
    # Verify data structure matches expectations
```

If the tool misbehaves, report it:

"The tool [TOOL_NAME] is returning errors: [error message]. Can you help fix it?"
Can you help fix it?"mkdir -p skills/tooluniverse-[domain-name]
cd skills/tooluniverse-[domain-name]assets/skill_template/python_implementation.py#!/usr/bin/env python3
"""
[Domain Name] - Python SDK Implementation
Tested implementation following TDD principles
"""
from tooluniverse import ToolUniverse
from datetime import datetime
def domain_analysis_pipeline(
input_param_1=None,
input_param_2=None,
output_file=None
):
"""
[Domain] analysis pipeline.
Args:
input_param_1: Description
input_param_2: Description
output_file: Output markdown file path
Returns:
Path to generated report file
"""
tu = ToolUniverse()
tu.load_tools()
# Generate output filename
if output_file is None:
timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
output_file = f"domain_analysis_{timestamp}.md"
# Initialize report
report = []
report.append("# [Domain] Analysis Report\n")
report.append(f"**Generated**: {datetime.now()}\n\n")
# Phase 1: [Description]
report.append("## 1. [Phase Name]\n")
try:
result = tu.tools.TOOL_NAME(param=value)
if result.get('status') == 'success':
data = result.get('data', [])
# Process and add to report
else:
report.append("*[Phase] data unavailable.*\n")
except Exception as e:
report.append(f"*Error in [Phase]: {str(e)}*\n")
# Phase 2, 3, 4... (similar structure)
# Write report
with open(output_file, 'w') as f:
f.write(''.join(report))
print(f"\n✅ Report generated: {output_file}")
return output_file
if __name__ == "__main__":
# Example usage
domain_analysis_pipeline(
input_param_1="example",
output_file="example_analysis.md"
)assets/skill_template/test_skill.py#!/usr/bin/env python3
```python
#!/usr/bin/env python3
"""Test script for [Domain] skill"""
import sys
import os

sys.path.insert(0, os.path.dirname(os.path.abspath(__file__)))
from python_implementation import domain_analysis_pipeline


def test_basic_analysis():
    """Test basic analysis workflow"""
    output = domain_analysis_pipeline(
        input_param_1="test_value",
        output_file="test_basic.md"
    )
    assert os.path.exists(output), "Output file not created"


def test_combined_analysis():
    """Test with multiple inputs"""
    output = domain_analysis_pipeline(
        input_param_1="value1",
        input_param_2="value2",
        output_file="test_combined.md"
    )
    # Verify report has all sections
    with open(output, 'r') as f:
        content = f.read()
    assert "Phase 1" in content
    assert "Phase 2" in content


def main():
    tests = [
        ("Basic Analysis", test_basic_analysis),
        ("Combined Analysis", test_combined_analysis),
    ]
    results = {}
    for name, test_func in tests:
        try:
            test_func()
            results[name] = "✅ PASS"
        except Exception as e:
            results[name] = f"❌ FAIL: {e}"

    for name, result in results.items():
        print(f"{name}: {result}")

    all_passed = all("PASS" in r for r in results.values())
    return 0 if all_passed else 1


if __name__ == "__main__":
    sys.exit(main())
```

Run with:

```shell
python test_skill.py
```

**File**: `assets/skill_template/SKILL.md`

---
name: tooluniverse-[domain-name]
description: [What it does]. [Capabilities]. [Databases used]. Use when [triggers].
---
# [Domain Name] Analysis
[One paragraph overview]
## When to Use This Skill
**Triggers**:
- "Analyze [domain] for [input]"
- "Find [data type] related to [query]"
- "[Domain-specific action]"
**Use Cases**:
1. [Use case 1 with description]
2. [Use case 2 with description]
## Core Databases Integrated
| Database | Coverage | Strengths |
|----------|----------|-----------|
| **Database 1** | [Scope] | [What it's good for] |
| **Database 2** | [Scope] | [What it's good for] |
## Workflow Overview
---
## Phase 1: [Phase Name]
**When**: [Conditions]
**Objective**: [What this phase achieves]
### Tools Used
**TOOL_NAME**:
- **Input**:
- `parameter1`: Description
- `parameter2`: Description
- **Output**: Description
- **Use**: What it's used for
### Workflow
1. [Step 1]
2. [Step 2]
3. [Step 3]
### Decision Logic
- **Condition 1**: Action to take
- **Empty results**: How to handle
- **Errors**: Fallback strategy
---
## Phase 2, 3, 4... (similar structure)
---
## Output Structure
[Description of report format]
### Report Format
**Required Sections**:
1. Header with parameters
2. Phase 1 results
3. Phase 2 results
...
---
## Tool Parameter Reference
**Critical Parameter Notes** (from testing):
| Tool | Parameter | CORRECT Name | Common Mistake |
|------|-----------|--------------|----------------|
| TOOL_NAME | `param` | ✅ `actual_param` | ❌ `assumed_param` |
**Response Format Notes**:
- **TOOL_1**: Returns [format]
- **TOOL_2**: Returns [format]
---
## Fallback Strategies
[Document Primary → Fallback → Default for critical tools]
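The Primary → Fallback → Default pattern above can be sketched in plain Python. A minimal illustration with stubs standing in for real tools (`with_fallbacks`, `primary`, and `fallback` are hypothetical names, not part of the ToolUniverse API):

```python
def with_fallbacks(steps, default=None):
    """Try each (name, fetch) step in order; return the first non-empty result."""
    for name, fetch in steps:
        try:
            result = fetch()
            if result:
                return name, result
        except Exception:
            # Source failed; fall through to the next one
            continue
    return "default", default


# Hypothetical stubs: a primary source that is down, a fallback that answers
def primary():
    raise RuntimeError("API down")


def fallback():
    return {"status": "success", "data": ["hit"]}


source, result = with_fallbacks([("primary", primary), ("fallback", fallback)])
print(source)
# fallback
```

Documenting the chain as data (the `steps` list) keeps the fallback order visible and easy to audit against SKILL.md.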
---
## Limitations & Known Issues
### Database-Specific
- **Database 1**: [Limitations]
- **Database 2**: [Limitations]
### Technical
- **Response formats**: [Notes]
- **Rate limits**: [If any]
---
## Summary
[Domain] skill provides:
1. ✅ [Capability 1]
2. ✅ [Capability 2]
**Outputs**: [Description]
**Best for**: [Use cases]

**File**: `assets/skill_template/QUICK_START.md`

## Quick Start: [Domain] Analysis
[One paragraph overview]

---

## Choose Your Implementation

### Python SDK

#### Option 1: Complete Pipeline (Recommended)

```python
from skills.tooluniverse_[domain].python_implementation import domain_pipeline

# Example 1
domain_pipeline(
    input_param="value",
    output_file="analysis.md"
)
```

#### Option 2: Individual Tools

```python
from tooluniverse import ToolUniverse

tu = ToolUniverse()
tu.load_tools()

# Tool 1
result = tu.tools.TOOL_NAME(param="value")

# Tool 2
result = tu.tools.TOOL_NAME2(param="value")
```

### MCP

**Triggers**:

"Analyze [domain] for [input]"
"Find [data] related to [query]"

**Tool call format**:

```json
{
  "tool": "TOOL_NAME",
  "parameters": {
    "param": "value"
  }
}
```

### Parameter Reference

| Parameter | Type | Required | Description |
|---|---|---|---|
| `[param_1]` | string | Yes | [Description] |
| `[param_2]` | integer | No | [Description] |

[Code example]

[Conversational example]
**Key principles**:
- Equal treatment of Python SDK and MCP
- Concrete examples for both
- Parameter table applies to both
- Clear recipes
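One way to honor "parameter table applies to both" is a small validator for MCP-style tool-call payloads. A sketch under stated assumptions: `SPEC`, `validate_call`, and the `param1`/`param2` names are hypothetical stand-ins for a skill's real parameter table.

```python
import json

# Hypothetical spec mirroring a QUICK_START parameter table
SPEC = {
    "param1": {"type": str, "required": True},
    "param2": {"type": int, "required": False},
}


def validate_call(payload):
    """Check a tool-call JSON payload against SPEC; return a list of problems."""
    call = json.loads(payload)
    params = call.get("parameters", {})
    problems = []
    for name, rule in SPEC.items():
        if name not in params:
            if rule["required"]:
                problems.append(f"missing required parameter: {name}")
            continue
        if not isinstance(params[name], rule["type"]):
            problems.append(f"wrong type for {name}")
    return problems


print(validate_call('{"tool": "TOOL_NAME", "parameters": {"param1": "value"}}'))
# []
```

Keeping the spec as data means the Python SDK and MCP paths validate against the same source of truth.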
---
## Phase 6: Testing & Validation
**Objective**: Verify skill meets all quality standards through comprehensive testing
**Duration**: 30-45 minutes
**CRITICAL**: This phase is MANDATORY - no skill is complete without passing comprehensive tests
### 6.1 Create Comprehensive Test Suite
**File**: `test_skill_comprehensive.py`
**CRITICAL**: Test ALL use cases from documentation + edge cases
**Test structure**:
```python
#!/usr/bin/env python3
"""
Comprehensive Test Suite for [Domain] Skill
Tests all use cases from SKILL.md + edge cases
"""
import sys
import os

sys.path.insert(0, os.path.dirname(os.path.abspath(__file__)))
from python_implementation import skill_function
from tooluniverse import ToolUniverse


def test_1_use_case_from_skill_md():
    """Test Case 1: [Use case name from SKILL.md]"""
    print("\n" + "="*80)
    print("TEST 1: [Use Case Name]")
    print("="*80)
    print("Expected: [What should happen]")

    tu = ToolUniverse()
    result = skill_function(tu=tu, input_param="value")

    # Validation
    assert isinstance(result, ExpectedType), "Should return expected type"
    assert result.field_name is not None, "Should have required field"

    print(f"\n✅ PASS: [What passed]")
    return result


def test_2_documentation_accuracy():
    """Test Case 2: QUICK_START.md example works exactly as documented"""
    print("\n" + "="*80)
    print("TEST 2: Documentation Accuracy")
    print("="*80)

    # Exact copy-paste from QUICK_START.md
    tu = ToolUniverse()
    result = skill_function(tu=tu, param="value")  # From docs

    # Verify documented attributes exist
    assert hasattr(result, 'documented_field'), "Doc says this field exists"

    print(f"\n✅ PASS: Documentation examples work")
    return result


def test_3_edge_case_invalid_input():
    """Test Case 3: Error handling with invalid inputs"""
    print("\n" + "="*80)
    print("TEST 3: Error Handling")
    print("="*80)

    tu = ToolUniverse()
    result = skill_function(tu=tu, input_param="INVALID")

    # Should handle gracefully, not crash
    assert isinstance(result, ExpectedType), "Should still return result"
    assert len(result.warnings) > 0, "Should have warnings"

    print(f"\n✅ PASS: Handled invalid input gracefully")
    return result


def test_4_result_structure():
    """Test Case 4: Result structure matches documentation"""
    print("\n" + "="*80)
    print("TEST 4: Result Structure")
    print("="*80)

    tu = ToolUniverse()
    result = skill_function(tu=tu, input_param="value")

    # Check all documented fields
    required_fields = ['field1', 'field2', 'field3']
    for field in required_fields:
        assert hasattr(result, field), f"Missing field: {field}"

    print(f"\n✅ PASS: All documented fields present")
    return result


def test_5_parameter_validation():
    """Test Case 5: All documented parameters work"""
    print("\n" + "="*80)
    print("TEST 5: Parameter Validation")
    print("="*80)

    tu = ToolUniverse()
    result = skill_function(
        tu=tu,
        param1="value1",  # All documented params
        param2="value2",
        param3=True
    )

    assert isinstance(result, ExpectedType), "Should work with all params"

    print(f"\n✅ PASS: All parameters accepted")
    return result


def run_all_tests():
    """Run all tests and generate report"""
    print("\n" + "="*80)
    print("[DOMAIN] SKILL - COMPREHENSIVE TEST SUITE")
    print("="*80)

    tests = [
        ("Use Case 1", test_1_use_case_from_skill_md),
        ("Documentation Accuracy", test_2_documentation_accuracy),
        ("Error Handling", test_3_edge_case_invalid_input),
        ("Result Structure", test_4_result_structure),
        ("Parameter Validation", test_5_parameter_validation),
    ]

    results = {}
    passed = 0
    failed = 0
    for test_name, test_func in tests:
        try:
            result = test_func()
            results[test_name] = {"status": "PASS", "result": result}
            passed += 1
        except Exception as e:
            results[test_name] = {"status": "FAIL", "error": str(e)}
            failed += 1
            print(f"\n❌ FAIL: {test_name}")
            print(f"   Error: {e}")

    # Summary
    print("\n" + "="*80)
    print("TEST SUMMARY")
    print("="*80)
    print(f"✅ Passed: {passed}/{len(tests)}")
    print(f"❌ Failed: {failed}/{len(tests)}")
    print(f"📊 Success Rate: {passed/len(tests)*100:.1f}%")

    if failed == 0:
        print("\n🎉 ALL TESTS PASSED! Skill is production-ready.")
    else:
        print("\n⚠️ Some tests failed. Review errors above.")

    return passed, failed


if __name__ == "__main__":
    passed, failed = run_all_tests()
    sys.exit(0 if failed == 0 else 1)
```

Run the suite and capture its output:

```shell
cd skills/tooluniverse-[domain]
python test_skill_comprehensive.py > test_output.txt 2>&1
```

**File**: `SKILL_TESTING_REPORT.md`

# [Domain] Skill - Testing Report
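The captured `test_output.txt` can be summarized mechanically when filling in the report below. A hypothetical helper (`summarize_test_output` is not part of the skill; it only assumes the ✅/❌ markers printed by the suite above):

```python
def summarize_test_output(text):
    """Count PASS/FAIL markers in captured test output and compute a success rate."""
    passed = text.count("✅ PASS")
    failed = text.count("❌ FAIL")
    total = passed + failed
    rate = round(passed / total * 100, 1) if total else 0.0
    return {"passed": passed, "failed": failed, "success_rate": rate}


sample = "Use Case 1: ✅ PASS\nError Handling: ❌ FAIL\nStructure: ✅ PASS\n"
print(summarize_test_output(sample))
# {'passed': 2, 'failed': 1, 'success_rate': 66.7}
```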
**Date**: [Date]
**Status**: ✅ PASS / ❌ FAIL
**Success Rate**: X/Y tests passed (100%)
## Executive Summary
[Brief summary of testing results]
## Test Results
### Test 1: [Use Case Name] ✅
**Use Case**: [Description from SKILL.md]
**Result**: [What happened]
**Validation**: [What was verified]
### Test 2-5: [Similar format]
## Quality Metrics
- **Code quality**: [Assessment]
- **Documentation accuracy**: [All examples work / Issues found]
- **Robustness**: [Error handling assessment]
- **User experience**: [Assessment]
## Recommendation
✅ PRODUCTION-READY / ❌ NEEDS FIXES

**File**: `NEW_SKILL_[DOMAIN].md`

# New Skill Created: [Domain] Analysis
**Date**: [Date]
**Status**: ✅ COMPLETE AND TESTED
## Overview
[Brief description]
### Key Features
- [Feature 1]
- [Feature 2]
## Skill Details
**Location**: `skills/tooluniverse-[domain]/`
**Files Created**:
1. ✅ `python_implementation.py` ([lines]) - Working pipeline
2. ✅ `SKILL.md` ([lines]) - Implementation-agnostic docs
3. ✅ `QUICK_START.md` ([lines]) - Multi-implementation guide
4. ✅ `test_skill.py` ([lines]) - Test suite
**Total**: [total lines]
## Tools Integrated
- **Database 1**: [X tools]
- **Database 2**: [Y tools]
**Total**: [N tools]
## Test Results
## Capabilities
[List key capabilities]
## Impact & Value
[Describe impact]
## Next Steps
[Suggest enhancements]

Common parameter-assumption pitfalls:

| Pattern | Don't Assume | Always Verify |
|---|---|---|
| Function name includes param name | | Test reveals what it actually uses |
| Descriptive function name | | Test reveals what it actually uses |
| Consistent naming | All similar functions use same param | Each tool may differ |
Tool response formats vary: some return `{status: "success", data: [...]}`, some a bare list `[...]`, others a flat object `{field1: ..., field2: ...}`. Always check the shape with `isinstance()` before processing.

**Bundled references**:

- `references/tool_testing_workflow.md`
- `references/implementation_agnostic_format.md`
- `references/skill_standards_checklist.md`
- `references/devtu_optimize_integration.md`

**Skill template files**:

- `assets/skill_template/python_implementation.py`
- `SKILL.md`
- `QUICK_START.md`
- `test_skill.py`
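A defensive normalizer for the varying response shapes might look like this (a sketch; `extract_records` is a hypothetical helper, not part of the ToolUniverse API):

```python
def extract_records(result):
    """Normalize the common tool-response shapes into a list of records."""
    if isinstance(result, list):          # bare list: [...]
        return result
    if isinstance(result, dict):
        if "data" in result:              # {"status": ..., "data": [...]}
            data = result["data"]
            return data if isinstance(data, list) else [data]
        return [result]                   # flat object: {field1: ..., field2: ...}
    return []                             # anything else: treat as empty


print(extract_records({"status": "success", "data": [1, 2]}))
# [1, 2]
```

Routing every tool response through one normalizer keeps the downstream report code free of per-tool shape checks.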