Search Results: benchmarking

Found 147 Skills

AI & Machine Learning | bbuf/sglang-auto-driven-s...

sglang-sota-humanize-loop

Run an autonomous Humanize-governed SGLang SOTA performance loop for one LLM model: first perform the fixed fair SGLang/vLLM/TensorRT-LLM deployment search and benchmark, then start one RLCR loop that repeatedly decides the gap, profiles the current bottleneck, runs layer/kernel pipeline analysis, patches SGLang code, optionally uses ncu-report-skill for kernel evidence, and revalidates until SGLang matches or beats the best observed framework under the same workload and SLA.

Tools & Utilities | kaggle/kaggle-skills

write-kaggle-benchmarks

Write, push, run, publish, and manage Kaggle Benchmark tasks using the kaggle CLI and the kaggle-benchmarks Python SDK. Use when the user wants to create or push a benchmark task (optionally with attached Kaggle datasets), run benchmarks against LLM models, check task/run status, stream or fetch execution logs, download results and source notebooks, publish a task to make it public, or troubleshoot benchmark workflows.

Product & Design | wellapp-ai/well

competitor-scan

Research best-in-class products using Browser MCP and WebSearch

Code Quality | dralgorhythm/claude-agent...

optimizing-code

Improve code performance without changing behavior. Use when code fails latency/throughput requirements. Covers profiling, caching, and algorithmic optimization.

Testing & QA | jeremylongshore/claude-co...

k6-script-generator

K6 Script Generator - Auto-activating skill for Performance Testing. Triggers on: k6 script generator. Part of the Performance Testing skill category.

Tools & Utilities | jesseotremblay/claude-ski...

analyzing-funding-landscape

Analyzes venture capital, investment trends, funding rounds, investor strategies, M&A activity, and funding patterns in specific markets or industries. Use when the user requests funding analysis, VC landscape research, investment trend analysis, or wants to understand investor activity and funding dynamics.

Testing & QA | jeremylongshore/claude-co...

load-testing-apis

Execute comprehensive load and stress testing to validate API performance and scalability. Use when validating API performance under load. Trigger with phrases like "load test the API", "stress test API", or "benchmark API performance".

Product & Design | gpu-cli/skills

tui-review

Critically review terminal user interfaces for UX quality, responsiveness, visual design, and interactivity. Use when asked to "review my TUI", "test my TUI UX", "audit my terminal UI", "check TUI responsiveness", "review TUI keybindings", "check interactivity", or any request to evaluate the user experience quality of a ratatui/crossterm/ncurses-based terminal application. Launches the TUI in tmux, systematically tests 10 dimensions (responsiveness, input conflicts, visual clarity, navigation, feedback loops, error states, layout, keyboard design, permission flows, visual design & color), and produces a graded report with screenshots and specific findings. Benchmarks against Claude Code, OpenCode, and Codex — the three best-in-class AI terminal UIs.

AI & Machine Learning | affaan-m/everything-claud...

agent-harness-construction

Design and optimize AI agent action spaces, tool definitions, and observation formatting for higher completion rates.

Code Quality | bytedance/agentkit-sample...

code-optimization

Optimize code performance through iterative improvements (max 2 rounds). Benchmark execution time and memory usage, compare against baseline implementations, and generate detailed optimization reports. Supports C++, Python, Java, Rust, and other languages.

AI & Machine Learning | ruvnet/ruflo

agent-performance-benchmarker

Agent skill for performance-benchmarker. Invoke with $agent-performance-benchmarker.

Tools & Utilities | longbridge/skills

longbridge-competitive-analysis

Competitive landscape analysis — builds a competitive structure research framework covering market positioning (Porter five-forces), peer cross-comparison (PE/PB/ROE/revenue growth), market share estimation, competitive advantage assessment (moat), and potential disruptor identification. Triggers: "竞争格局", "竞争分析", "行业竞争", "市场份额", "竞争对手", "护城河", "波特五力", "竞争优势", "競爭格局", "競爭分析", "行業競爭", "市場份額", "競爭對手", "護城河", "波特五力", "competitive analysis", "competitive landscape", "market share", "competitive moat", "Porter five forces", "industry competition", "competitive advantage", "market positioning", "moat analysis", "NVDA vs AMD", "who are the competitors".
