Search Results: benchmarking

Found 90 Skills

llm-evaluation

Implement comprehensive evaluation strategies for LLM applications using automated metrics, human feedback, and benchmarking. Use when testing LLM performance, measuring AI application quality, or establishing evaluation frameworks.

Backend Development | pluginagentmarketplace/cu...

java-performance

JVM performance tuning - GC optimization, profiling, memory analysis, benchmarking

1 scripts/Checked

Code Quality | aj-geddes/useful-ai-promp...

profiling-optimization

Profile application performance, identify bottlenecks, and optimize hot paths using CPU profiling, flame graphs, and benchmarking. Use when investigating performance issues or optimizing critical code paths.

AI & Machine Learning | orq-ai/assistant-plugins

compare-agents

Run cross-framework agent comparisons using evaluatorq from orqkit — compares any combination of agents (orq.ai, LangGraph, CrewAI, OpenAI Agents SDK, Vercel AI SDK) head-to-head on the same dataset with LLM-as-a-judge scoring. Use when comparing agents, benchmarking, or wanting side-by-side evaluation. Do NOT use when comparing only orq.ai configurations with no external agents (use run-experiment instead).

Code Quality | terraphim/terraphim-skill...

rust-performance

High-performance Rust optimization. Profiling, benchmarking, SIMD, memory optimization, and zero-copy techniques. Focuses on measurable improvements with evidence-based optimization.

Code Quality | ynulihao/agentskillos

performance-optimization

Apply systematic performance optimization techniques when writing or reviewing code. Use when optimizing hot paths, reducing latency, improving throughput, fixing performance regressions, or when the user mentions performance, optimization, speed, latency, throughput, profiling, or benchmarking.

Backend Development | eduardo-sl/go-agent-skill...

go-performance-review

Detect performance anti-patterns and apply optimization techniques in Go. Covers allocations, string handling, slice/map preallocation, sync.Pool, benchmarking, and profiling with pprof. Use when checking performance, finding slow code, reducing allocations, profiling, or reviewing hot paths. Trigger examples: "check performance", "find slow code", "reduce allocations", "benchmark this", "profile", "optimize Go code". Do NOT use for concurrency correctness (use go-concurrency-review) or general code style (use go-coding-standards).

Code Quality | outfitter-dev/agents

performance

This skill should be used when profiling code, optimizing bottlenecks, benchmarking, or when "performance", "profiling", "optimization", or "--perf" are mentioned.

AI & Machine Learning | eyadsibai/ltk

nemo-evaluator

Use when evaluating LLMs, running benchmarks like MMLU/HumanEval/GSM8K, setting up evaluation pipelines, or asking about "NeMo Evaluator", "LLM benchmarking", "model evaluation", "MMLU", "HumanEval", "GSM8K", or "benchmark harnesses".

Data Processing | marketcalls/openalgo-indi...

custom-indicator

Create a custom technical indicator using Numba JIT + NumPy. Generates production-grade, O(n) optimized indicator functions with charting and benchmarking.

Code Quality | absolutelyskilled/absolut...

performance-engineering

Use this skill when profiling application performance, debugging memory leaks, optimizing latency, benchmarking code, or reducing resource consumption. Triggers on CPU profiling, memory profiling, flame graphs, garbage collection tuning, load testing, P99 latency, throughput optimization, bundle size reduction, and any task requiring performance analysis or optimization.

AI & Machine Learning | erichowens/some_claude_sk...

skill-creator

Guide for creating, improving, benchmarking, and packaging Claude Agent Skills (SKILL.md files). Invoke when users want to create a skill from scratch, improve or test an existing skill, benchmark skill performance with variance analysis, or optimize a skill description for triggering accuracy. Also invoke when users say "turn this into a skill", "make a skill for X", "help me write a SKILL.md", "my skill isn't firing correctly", or want to convert a workflow/conversation into a reusable skill. Invoke proactively when a conversation has produced a repeatable workflow worth capturing. If the user mentions SKILL.md, skill files, skill descriptions, or skill triggering, this skill applies.

10 scripts/Attention