Found 1,211 Skills
Guide for building high-quality MCP (Model Context Protocol) servers in Python or Node/TypeScript to integrate external APIs/services.
LangGraph-based agent framework for consistent tool calling with automatic tool loops. Use when you need reliable multi-step task execution with OpenAI-compatible providers (Z.AI/GLM-5, OpenRouter, Groq, DeepSeek, Ollama).
Evaluate the quality of the CAW (Cobo Agentic Wallet) Agent in a local Claude Code environment and generate scoring data and analysis reports. Use when users want to run or conduct a CAW evaluation, test a Skill, assess Agent quality, generate evaluation reports, or say "run evaluation", "evaluate CAW", "eval", or "score". For weak-model / openclaw evaluation, use caw-eval-openclaw (installed only on openclaw servers).
Audit experiment integrity before claiming results. Uses cross-model review (GPT-5.4) to check for fake ground truth, score normalization fraud, phantom results, and insufficient scope. Use when the user says "审计实验" ("audit experiment"), "check experiment integrity", "audit results", or "实验诚实度" ("experiment honesty"), or after experiments complete and before writing claims.
Comprehensive multi-perspective review using specialized judges with debate and consensus building.
Engineer system prompts for LiveKit voice agents with multilingual support. Use when creating or optimizing AI agent conversation flows.
Intelligent system governor that continuously shadow-tests APIs for performance while enforcing strict financial and security guardrails against runaway costs.
Phone calling skill for OpenClaw: agent makes real outbound phone calls to users for alerts, briefings, reminders, and urgent notifications. Managed service, no Twilio setup needed. 100+ countries, 70+ voices.
Use when bootstrapping a new personal wiki for any knowledge domain — research, codebase documentation, reading notes, competitive analysis, or any long-term knowledge accumulation project.
Curated collection of 1209+ of the best OpenClaw AI agent skills, updated weekly by MyClaw.ai.
Debug AutoDeploy accuracy regressions vs a reference score (PyTorch backend or published baseline). Use when an AutoDeploy model's eval score is significantly below the reference and the root cause is unknown.
Multi-model deep review of the Ralph bd graph and plan via three parallel opencode processes (claude opus, gemini, gpt). Use for high-stakes runs where cross-model consensus reduces single-model bias.