Intelligent skill retrieval and recommendation system for Claude Code. Uses semantic search, intent analysis, and confidence scoring to recommend the most appropriate skills. Features: (1) Smart skill matching via bilingual embeddings (Chinese/English), (2) Prudent decision-making with three confidence tiers, (3) Historical learning from usage patterns, (4) Automatic health checking and lifecycle management, (5) Intelligent cache cleanup. Use when: User asks to find/recommend a skill, multiple skills might match a request, or skill selection requires intelligent analysis.
npx skill4agent add gccszs/spark-satchel sparksatchel

"Think twice before acting, keep the user burden-free"

from src.retriever import SparkSatchel
sparksatchel = SparkSatchel()
result = sparksatchel.retrieve("process this PDF")

| Confidence Level | Threshold | Action |
|---|---|---|
| High | >70% | Auto-recommend with reasoning |
| Medium | 40-70% | Recommend primary + alternatives |
| Low | <40% | Present candidates and ask user |
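
The tiers in the table can be sketched as a small lookup. This is a minimal illustration, not SparkSatchel's actual code: the function name `confidence_tier` and the constant names are hypothetical, and it assumes confidence is expressed as a fraction between 0 and 1, with the boundary values 40% and 70% falling into the medium tier as the table's ranges suggest.

```python
from typing import Literal

Tier = Literal["high", "medium", "low"]

# Thresholds come from the table above; the names are assumptions
# made for this sketch and are not part of the SparkSatchel API.
HIGH_THRESHOLD = 0.70
LOW_THRESHOLD = 0.40


def confidence_tier(confidence: float) -> Tier:
    """Map a confidence score in [0, 1] to one of the three tiers."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence must be in [0, 1], got {confidence}")
    if confidence > HIGH_THRESHOLD:
        return "high"    # auto-recommend with reasoning
    if confidence >= LOW_THRESHOLD:
        return "medium"  # recommend primary + alternatives
    return "low"         # present candidates and ask the user


print(confidence_tier(0.92))
print(confidence_tier(0.55))
print(confidence_tier(0.10))
```

Keeping the thresholds as named constants makes it easy to tune how cautious the recommendations are without touching the branching logic.
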
User: "Process this PDF"
SparkSatchel: "I recommend pdf-skill because it specializes in PDF documents (92% historical success rate)"

User: "Create a document"
SparkSatchel: "I suggest docx-skill. pdf-skill is also available. Want me to compare them?"

User: "Process data"
SparkSatchel: "Found several matching skills. Which one fits best?
- xlsx-skill: Excel spreadsheet processing
- pandas-skill: Data analysis with Python
- csv-skill: CSV file handling"

Default model: paraphrase-multilingual-MiniLM-L12-v2
Cache location: ~/.cache/huggingface/hub/

| Model | Size | Languages | Speed | Accuracy | Best For |
|---|---|---|---|---|---|
| paraphrase-multilingual-MiniLM-L12-v2 | 470MB | 50+ | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐ | Default choice - Balanced performance |
| shibing624/text2vec-base-chinese | 110MB | Chinese | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ | Chinese-only - Faster & more accurate |
| intfloat/multilingual-e5-large | 1.3GB | 100+ | ⭐⭐⭐ | ⭐⭐⭐⭐⭐ | High accuracy - Best for complex queries |
| BAAI/bge-large-zh-v1.5 | 390MB | Chinese | ⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ | Chinese advanced - State-of-the-art |
| all-MiniLM-L6-v2 | 23MB | English | ⭐⭐⭐⭐⭐ | ⭐⭐⭐ | English only - Ultra lightweight |
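
Each model in the table turns text into an embedding vector, and matching a request to a skill then comes down to comparing vectors. The sketch below shows that comparison with cosine similarity on toy 3-dimensional vectors. Whether SparkSatchel uses cosine similarity internally is an assumption, the vectors are made up for illustration, and `cosine_similarity` and `rank_skills` are hypothetical helper names.

```python
import math
from typing import Dict, List, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector has no direction to compare
    return dot / (norm_a * norm_b)


def rank_skills(
    query_vec: List[float], skill_vecs: Dict[str, List[float]]
) -> List[Tuple[str, float]]:
    """Return (skill name, similarity) pairs, best match first."""
    scored = [
        (name, cosine_similarity(query_vec, vec))
        for name, vec in skill_vecs.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy vectors standing in for real embeddings (illustrative values only).
toy_skills = {
    "pdf-skill": [0.9, 0.1, 0.0],
    "docx-skill": [0.2, 0.9, 0.1],
    "xlsx-skill": [0.0, 0.2, 0.9],
}
ranking = rank_skills([1.0, 0.0, 0.0], toy_skills)
print([name for name, _ in ranking])
```

With real embeddings the vectors have hundreds of dimensions, but the ranking step stays the same.
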
# List available models
python scripts/download_model.py --list
# Download default model (already downloaded ✅)
python scripts/download_model.py default
# Download Chinese-optimized model
python scripts/download_model.py chinese
# Download high-accuracy multilingual model
python scripts/download_model.py large
# Download ultra-lightweight English model
python scripts/download_model.py english

# Option A: Use download script
python scripts/download_model.py chinese
# Option B: Manual download
pip install sentence-transformers
python -c "from sentence_transformers import SentenceTransformer; SentenceTransformer('shibing624/text2vec-base-chinese')"

# Option A: Use download script
python scripts/download_model.py english
# Option B: Manual download
pip install sentence-transformers
python -c "from sentence_transformers import SentenceTransformer; SentenceTransformer('all-MiniLM-L6-v2')"

# Download high-accuracy model
pip install sentence-transformers
python -c "from sentence_transformers import SentenceTransformer; SentenceTransformer('intfloat/multilingual-e5-large')"

# Use OpenAI API (requires API key)
pip install openai

In src/models/embedding.py:

class EmbeddingModel:
    # Change default model
    DEFAULT_MODEL = "shibing624/text2vec-base-chinese"  # Your choice

from src.models.embedding import EmbeddingModel
from src.retriever import SparkSatchel
# Use custom model
custom_model = EmbeddingModel(
    model_name="shibing624/text2vec-base-chinese",
    device="cpu"  # or "cuda" for GPU acceleration
)
# Pass to SparkSatchel
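sparksatchel = SparkSatchel(embedding_model=custom_model)

from openai import OpenAI

# The client reads the API key from the OPENAI_API_KEY environment variable.
client = OpenAI()

def openai_embedding(text: str) -> list:
    # openai>=1.0 removed openai.Embedding.create; use the client API instead.
    response = client.embeddings.create(
        model="text-embedding-3-small",
        input=text
    )
    return response.data[0].embedding

For GPU acceleration, set device="cuda".

SparkSatchel/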
├── SKILL.md              # This file
├── requirements.txt      # Dependencies
├── src/
│   ├── retriever.py      # Main entry point
│   ├── models/           # Embedding models
│   ├── storage/          # Vector DB + history
│   ├── analysis/         # Intent + confidence
│   └── maintenance/      # Health + lifecycle + cache
└── data/                 # Data storage
    ├── collections/      # Vector databases
    └── history.db        # Call history

class SparkSatchel:
    def retrieve(self, user_request: str) -> RetrievalResult:
        """Search and recommend skills"""

    def feedback(self, skill_name: str, success: bool, feedback: str = ""):
        """Record user feedback"""

    def check_health(self) -> Dict:
        """Check system health"""

    def cleanup(self, strategy: dict = None):
        """Execute cache cleanup"""

@dataclass
class RetrievalResult:
    confidence: float              # 0-1
    recommended_skill: str         # Skill name
    reasoning: str                 # Explanation
    alternative_skills: List[str]  # For medium confidence
    candidate_skills: List[Dict]   # For low confidence
    requires_confirmation: bool    # Needs user input?

health = sparksatchel.check_health()
if health["cache"]["needs_cleanup"]:
    print(health["suggestion"])

from src.maintenance.cache import CleanupStrategy
# By age (delete records older than 30 days)
sparksatchel.cleanup(CleanupStrategy.by_age(days=30))
# By count (keep recent 1000 records)
sparksatchel.cleanup(CleanupStrategy.by_count(keep=1000))

| Metric | Target |
|---|---|
| Retrieval latency | <500ms (100k skills) |
| Memory usage | <500MB |
| Startup time | <3s |
| Accuracy | >85% |
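
To check a deployment against the retrieval latency target, a call can be wrapped in a simple timer. The sketch below is one possible approach, not part of SparkSatchel: `timed_call` is a hypothetical helper, and `fake_retrieve` is a stand-in so the example runs without the library installed. In practice you would pass `sparksatchel.retrieve` instead.

```python
import time
from typing import Any, Callable, Tuple

LATENCY_TARGET_MS = 500.0  # retrieval latency target from the table above


def timed_call(fn: Callable[..., Any], *args: Any, **kwargs: Any) -> Tuple[Any, float]:
    """Call fn and return (its result, elapsed wall-clock time in ms)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return result, elapsed_ms


def fake_retrieve(request: str) -> str:
    """Hypothetical stand-in for sparksatchel.retrieve."""
    return f"handled: {request}"


result, elapsed_ms = timed_call(fake_retrieve, "process this PDF")
print(f"{elapsed_ms:.3f} ms, within target: {elapsed_ms < LATENCY_TARGET_MS}")
```

A single measurement is noisy, so averaging over many representative requests gives a fairer comparison with the target.
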